Scalable community detection in massive networks using aggregated relational data
Fitting large Bayesian network models quickly becomes computationally infeasible as the number of nodes grows into the hundreds of thousands or millions. This is true in particular of the mixed membership stochastic blockmodel (MMSB), a popular Bayesian network model for community detection. In this paper, we introduce a scalable inference method that leverages the nodal information that often accompanies real-world networks. Conditioning on this extra information leads to a model that admits a parallel variational inference algorithm. We apply our method to a citation network with over two million nodes and 25 million edges.
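
For readers unfamiliar with the MMSB, the following is a minimal sketch of its standard generative process, not of the inference method introduced in this paper. Each node draws a mixed-membership vector from a Dirichlet prior, and for every ordered pair of nodes the sender and receiver each pick a community before an edge is drawn from a block-probability matrix. The function name `sample_mmsb`, its parameters, and the example values are hypothetical and chosen only for illustration. The double loop over node pairs also shows why naive computation scales quadratically in the number of nodes.

```python
import random


def sample_mmsb(n_nodes, alpha, block_matrix, seed=0):
    """Draw a directed graph from the standard MMSB generative process.

    alpha        -- Dirichlet concentration parameters, one per community
    block_matrix -- block_matrix[g][h] is the edge probability when the
                    sender acts as community g and the receiver as h
    (All names here are illustrative assumptions, not the paper's notation.)
    """
    rng = random.Random(seed)
    communities = range(len(alpha))

    # Mixed-membership vector per node: pi_p ~ Dirichlet(alpha),
    # sampled by normalising independent Gamma(alpha_k, 1) draws.
    memberships = []
    for _ in range(n_nodes):
        gammas = [rng.gammavariate(a, 1.0) for a in alpha]
        total = sum(gammas)
        memberships.append([g / total for g in gammas])

    edges = []
    for p in range(n_nodes):
        for q in range(n_nodes):
            if p == q:
                continue  # no self-loops in this sketch
            # Sender and receiver each pick a community for this interaction.
            z_send = rng.choices(communities, weights=memberships[p])[0]
            z_recv = rng.choices(communities, weights=memberships[q])[0]
            # Edge indicator is Bernoulli with the block probability.
            if rng.random() < block_matrix[z_send][z_recv]:
                edges.append((p, q))
    return memberships, edges


if __name__ == "__main__":
    # Assumed toy setting: 3 communities, dense within, sparse between.
    B = [[0.8, 0.05, 0.05],
         [0.05, 0.8, 0.05],
         [0.05, 0.05, 0.8]]
    pi, E = sample_mmsb(n_nodes=20, alpha=[0.5, 0.5, 0.5], block_matrix=B)
    print(len(E), "directed edges sampled among 20 nodes")
```

A small concentration parameter (here 0.5) tends to push each node toward one dominant community, while larger values give more evenly mixed memberships.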