Please note: This PhD defence will take place in DC 2314 and online.
Vihan Shah, PhD candidate
David R. Cheriton School of Computer Science
Supervisor: Professor Sepehr Assadi
The rise of large-scale datasets across domains such as social networks, biological systems, and the web has made it increasingly important to understand how core graph problems can be solved under tight resource constraints. As these datasets grow, traditional algorithms that assume random access to the input become infeasible. This thesis explores how to process massive graphs efficiently under modern models of computation that address these limitations. The primary focus is on the streaming model, though the thesis also explores other modern models, including sublinear-time, fully dynamic, and oracle-based models.
The first part of the thesis develops space-optimal algorithms for fundamental graph problems in the streaming setting. We study approximate minimum cut, k-vertex connectivity, maximum matching, minimum vertex cover, and correlation clustering across different streaming models, including insertion-only, dynamic, and random-order streams. By establishing tight upper and lower bounds, often matching up to constant or polylogarithmic factors, these results resolve several open questions in the streaming literature and characterize the entire space-approximation trade-off for some of these problems.
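For readers unfamiliar with the streaming setting, a minimal illustrative sketch (not taken from the thesis) is the classic one-pass greedy algorithm for matching in insertion-only streams: edges arrive one at a time, the algorithm keeps only the set of matched vertices, and the resulting maximal matching is a 2-approximation of the maximum matching size.

```python
def greedy_streaming_matching(edge_stream):
    """One-pass greedy matching over an insertion-only edge stream.

    Stores only the matched vertices and the kept edges (O(n) space),
    producing a maximal matching: a 2-approximation of maximum matching.
    """
    matched = set()           # vertices already covered by the matching
    matching = []             # edges kept so far
    for u, v in edge_stream:  # edges arrive one at a time
        if u not in matched and v not in matched:
            matching.append((u, v))
            matched.add(u)
            matched.add(v)
    return matching

# Example: on the path 1-2-3-4, greedy keeps (1, 2), skips (2, 3), keeps (3, 4).
print(greedy_streaming_matching([(1, 2), (2, 3), (3, 4)]))  # [(1, 2), (3, 4)]
```

This simple baseline is what the tight bounds mentioned above are measured against; improving on its guarantees within comparable space is exactly the kind of question the streaming literature studies.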
The second part of the thesis expands the exploration to other modern models of computation, including sublinear-time algorithms, fully dynamic algorithms, and oracle-based models such as learning-augmented algorithms and streaming verification. We begin by studying the 4-cycle counting problem in the fully dynamic model and improve on the best known algorithm, demonstrating that the natural bound previously believed optimal is not tight. In the sublinear setting, we examine the problem of estimating matching size and prove strong lower bounds against non-adaptive algorithms. For the maximum independent set problem in the learning-augmented model, we develop a new algorithm that achieves a significantly improved approximation factor in polynomial time. Lastly, we explore the streaming verification model, focusing primarily on connectivity problems. Together, these contributions deepen our understanding of the fundamental limits and possibilities of algorithm design for massive data under constrained computational resources.
To attend this PhD defence in person, please go to DC 2314. You can also attend virtually on Zoom.