PhD Defence • Algorithms and Complexity • Algorithmic Tools for Network Analysis

Friday, June 20, 2025 1:00 pm - 4:00 pm EDT (GMT -04:00)

Please note: This PhD defence will take place in DC 2314 and online.

Jingbang Chen, PhD candidate
David R. Cheriton School of Computer Science

Supervisors: Professors Lap Chi Lau, Richard Peng

Network analysis is a crucial technique used in various fields such as computer science, telecommunications, transportation, social sciences, and biology. Its importance includes optimizing network performance, understanding social and organizational structures, and detecting fraud or misinformation. In this thesis, we propose algorithmic results on several aspects of network analysis.

The Abelian sandpile model is recognized as the first dynamical system discovered exhibiting self-organized criticality. In sandpiles, network dynamics are modeled as distributing chips. We present algorithms that compute the final state of the sandpile instance on various classes of graphs, solving the sandpile prediction problem. Our results include: (1) An algorithm that works on general graphs, obtaining a $\log N$ dependency on the number of chips $N$, improving over the $\mathtt{poly}(N)$ dependency in pure simulation. Further analyses are provided for regular graphs, expander graphs, and hypercubes. (2) An algorithm for trees that runs in $O(n \log n)$ time, where $n$ is the number of vertices. It can be further modified to an algorithm for paths that runs in $O(n)$ time.

To analyze the structure and dynamics of networks, counting motifs is one of the most popular methods, as they are considered the basic construction block of the network. For example, counting the number of bicliques on bipartite graphs has wide applications ranging from biological ecosystems to social networks, where bipartite networks model relationships between two sets of entities. In this thesis, we introduce several tools developed for counting motifs on bipartite networks.

Counting $(p,q)$-bicliques has attracted much attention since it serves as a fundamental operator in many applications, including cohesive subgraph analysis, information aggregation in graph neural networks, and densest subgraph mining. Despite its importance, counting $(p,q)$-bicliques is very challenging due to its exponential increase with respect to $p$ and $q$. On the other hand, in many applications of biclique counting, an approximate count is often sufficient. We present a new sampling-based method that produces a high-accuracy approximate counting of $(p,q)$-bicliques, with provably error guarantee and unbiasedness.

In another line of work, we consider the temporal setting. Temporal bipartite graphs, in which edges typically carry timestamps, are often considered because real-world interactions (modeled as edges) usually occur at specific timestamps. To capture the dynamic nature of relationships, we consider counting butterflies ($(2,2)$-bicliques) in these temporal graphs within specified time windows, called the \textit{historical butterfly counting} problem. We first present a hardness result between memory usage and query time, showing that $O(n^2/\lambda^2)$ memory is needed for an $O(\lambda)$ runtime per query, where $n$ denotes the number of vertices. Then, we propose a new index algorithm that surpasses the hardness when applied to power-law graphs, with outstanding empirical performance.

Lastly, we discuss tools that find the polarized community in the network. A classical model that applies to networks to deal with polarization is the signed graphs, which have positive and negative edges between vertices. A signed graph is balanced if it can be decomposed into two disjoint sets such that positive edges are between vertices in the same set while negative edges are between vertices from different sets. This notion of balance is strict in that no edge can disobey the condition, which seldom appears in reality. To address this phenomenon, we propose a new model for identifying balanced subgraphs with tolerance in signed graphs and a new heuristic algorithm that computes maximal balanced subgraphs under the new tolerance model.


To attend this PhD defence in person, please go to DC 2314. You can also attend virtually on Zoom.