Likelihood-based inference for stochastic epidemic models via data augmentation
Due to noisy data and nonlinear dynamics, even simple stochastic epidemic models such as the Susceptible-Infectious-Removed (SIR) model present significant challenges to inference. In particular, evaluating the marginal likelihood of such stochastic processes conditioned on observed endpoints is a notoriously difficult task. As a result, likelihood-based inference is typically considered out of reach in the presence of missing data, and practitioners often resort to simulation methods or approximations that may bias conclusions and reduce the interpretability of estimates. We discuss some recent contributions that enable "exact" inference using the likelihood of the observed data, focusing on a perspective that uses latent variables to explore configurations of the missing data within a Markov chain Monte Carlo framework. Motivated both by count data from large outbreaks and by high-resolution contact data from mobile health studies, we show how our data-augmented approach successfully learns interpretable epidemic parameters and scales efficiently to large, realistic data settings.