Scalable community detection in massive networks using aggregated relational data

Tue May 16th 2023, 4:30pm
Sloan 380C
Tian Zheng, Columbia University

Fitting large Bayesian network models quickly becomes computationally infeasible as the number of nodes grows into the hundreds of thousands or millions. One such model, the mixed membership stochastic blockmodel (MMSB), is a popular Bayesian network model for community detection. In this paper, we introduce a scalable inference method that leverages the nodal information that often accompanies real-world networks. Conditioning on this extra information leads to a model that admits a parallel variational inference algorithm. We apply our method to a citation network with over 2 million nodes and 25 million edges.
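
For readers unfamiliar with the MMSB, the following is a minimal sketch of its standard generative process, not of the inference method presented in the talk. The function name sample_mmsb, the parameter values, and the toy network size are illustrative assumptions and do not come from the abstract.

```python
import numpy as np

def sample_mmsb(n_nodes, alpha, block_matrix, seed=0):
    """Draw a directed graph from the standard MMSB generative process.

    Illustrative helper (hypothetical name); not the speaker's method.
    """
    rng = np.random.default_rng(seed)
    alpha = np.asarray(alpha, dtype=float)
    block_matrix = np.asarray(block_matrix, dtype=float)
    n_communities = alpha.shape[0]

    # Each node draws a mixed-membership vector over the communities.
    pi = rng.dirichlet(alpha, size=n_nodes)

    adjacency = np.zeros((n_nodes, n_nodes), dtype=int)
    for i in range(n_nodes):
        for j in range(n_nodes):
            if i == j:
                continue  # no self-loops
            # For this ordered pair, sender and receiver each pick a community.
            z_send = rng.choice(n_communities, p=pi[i])
            z_recv = rng.choice(n_communities, p=pi[j])
            # The edge is Bernoulli with the block-level connection probability.
            adjacency[i, j] = rng.binomial(1, block_matrix[z_send, z_recv])
    return pi, adjacency

# Toy example (assumed values): 3 communities, dense within, sparse between.
alpha = np.full(3, 0.5)
block_matrix = np.full((3, 3), 0.01) + 0.5 * np.eye(3)
pi, adjacency = sample_mmsb(30, alpha, block_matrix)
```

The double loop visits every ordered pair of nodes, each with its own pair of latent community indicators. This quadratic growth is one reason fitting the model naively becomes expensive at the scale of millions of nodes.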