A core problem in statistics and machine learning is to approximate difficult-to-compute probability distributions. This problem is especially important in Bayesian statistics, which frames all inference about unknown quantities as a calculation about a conditional distribution. In this talk I review and discuss innovations in variational inference (VI), a method that approximates probability distributions through optimization. VI has been used in myriad applications in machine learning and Bayesian statistics. It tends to be faster than more traditional methods, such as Markov chain Monte Carlo sampling.
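As a concrete toy sketch of "approximation through optimization" (illustrative only, not from the talk): for the model z ~ N(0, 1), x | z ~ N(z, 1), the exact posterior is N(x/2, 1/2), and maximizing the evidence lower bound (ELBO) over a Gaussian variational family q(z) = N(mu, s2) recovers it. All expectations below are in closed form for this toy model.

```python
import math

# Toy model: z ~ N(0, 1), x | z ~ N(z, 1); exact posterior N(x/2, 1/2).
# We fit q(z) = N(mu, s2) by gradient ascent on the ELBO.
# Model, names, and step sizes are illustrative assumptions.

def elbo(x, mu, s2):
    # E_q[log p(x|z)] + E_q[log p(z)] + entropy of q, all closed form
    e_lik = -0.5 * math.log(2 * math.pi) - 0.5 * ((x - mu) ** 2 + s2)
    e_prior = -0.5 * math.log(2 * math.pi) - 0.5 * (mu ** 2 + s2)
    entropy = 0.5 * math.log(2 * math.pi * math.e * s2)
    return e_lik + e_prior + entropy

def fit(x, steps=2000, lr=0.05):
    mu, log_s2 = 0.0, 0.0          # optimize log s2 so that s2 stays positive
    for _ in range(steps):
        s2 = math.exp(log_s2)
        g_mu = (x - mu) - mu       # d ELBO / d mu
        g_s2 = -1.0 + 0.5 / s2     # d ELBO / d s2
        mu += lr * g_mu
        log_s2 += lr * g_s2 * s2   # chain rule for the log parameterization
    return mu, math.exp(log_s2)

mu, s2 = fit(x=2.0)
# mu converges to x/2 = 1.0 and s2 to 0.5, the exact posterior
```

The optimization recovers the exact posterior here only because the family contains it; in general VI returns the closest member of the family in KL divergence.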
After quickly reviewing the basics, I will discuss two lines of research in VI. I first describe stochastic variational inference, an approximate inference algorithm for handling massive datasets, and demonstrate its application to probabilistic topic models of millions of articles. Then I discuss black box variational inference, a more generic algorithm for approximating the posterior. Black box inference applies to many models and requires only minimal mathematical work to implement. I will demonstrate black box inference on deep exponential families, a method for Bayesian deep learning, and describe how it enables powerful tools for probabilistic programming. Finally, I will highlight some more recent results in variational inference, including statistical theory, score-based objective functions, and variational families that interpolate between mean-field and full dependence.
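The mechanism that makes black box inference "black box" can be sketched with the score-function gradient estimator, grad ELBO = E_q[grad log q(z) (log p(x, z) - log q(z))], which needs only point evaluations of the log joint and no model-specific derivations. The toy model and all names below are illustrative assumptions, not the talk's own code; the variational variance is fixed at 1 to keep the sketch short.

```python
import math, random

# Score-function ("black box") estimate of the ELBO gradient:
#   grad_mu ELBO = E_q[ grad_mu log q(z) * (log p(x, z) - log q(z)) ],
# approximated by Monte Carlo samples from q. Only log_joint evaluations
# are needed, so the same loop works for any model. Toy model:
# z ~ N(0, 1), x | z ~ N(z, 1); q(z) = N(mu, 1).

def log_joint(x, z):
    # log p(z) + log p(x | z)
    return (-0.5 * math.log(2 * math.pi) - 0.5 * z * z
            - 0.5 * math.log(2 * math.pi) - 0.5 * (x - z) ** 2)

def log_q(z, mu):
    # q(z) = N(mu, 1)
    return -0.5 * math.log(2 * math.pi) - 0.5 * (z - mu) ** 2

def fit(x, steps=4000, n_samples=50, lr=0.02, seed=0):
    rng = random.Random(seed)
    mu = 0.0
    for _ in range(steps):
        grad = 0.0
        for _ in range(n_samples):
            z = rng.gauss(mu, 1.0)
            score = z - mu  # grad_mu log q(z) for unit variance
            grad += score * (log_joint(x, z) - log_q(z, mu))
        mu += lr * grad / n_samples
    return mu

mu = fit(x=2.0)
# mu approaches the exact posterior mean x/2 = 1.0, up to Monte Carlo noise
```

The estimator is noisy, which is why practical black box VI pairs it with variance-reduction techniques such as Rao-Blackwellization and control variates.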