From estimation to decisions: Statistical foundations for interactive learning
The prevailing recipe of ever-larger models trained on passively collected data is showing diminishing returns. The next phase of progress will increasingly rely on interactive learning: systems that actively collect data through experimentation, from clinical trials and A/B testing to recommendation systems and scientific discovery. This talk presents a research program developing statistical foundations for interactive learning with modern deep models.
Part I introduces the Decision-Estimation Coefficient, a unifying framework for understanding when interactive learning is statistically tractable, analogous to empirical process theory in supervised learning. This theory directly yields practical, industry-deployed algorithms that transform off-the-shelf estimators into optimal sequential decision-making methods.
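For readers wanting a concrete anchor, one standard formulation of the Decision-Estimation Coefficient from the literature (the talk's exact definition may differ in details) is the following minimax quantity, where $\Pi$ is the decision space, $\mathcal{M}$ the model class, $\widehat{M}$ a reference estimate, $f^{M}(\pi)$ the expected reward of decision $\pi$ under model $M$, $\pi_{M}$ the optimal decision for $M$, and $D_{\mathrm{H}}^{2}$ the squared Hellinger distance:

```latex
\[
\mathrm{dec}_{\gamma}(\mathcal{M}, \widehat{M})
= \inf_{p \in \Delta(\Pi)} \; \sup_{M \in \mathcal{M}} \;
\mathbb{E}_{\pi \sim p}\!\left[
  f^{M}(\pi_{M}) - f^{M}(\pi)
  \;-\; \gamma \, D_{\mathrm{H}}^{2}\!\bigl(M(\pi), \widehat{M}(\pi)\bigr)
\right]
\]
```

Intuitively, the learner chooses a distribution $p$ over decisions to balance regret (the first two terms) against the information gained when the true model deviates from the estimate (the Hellinger term, scaled by $\gamma$); bounds on this quantity characterize when sequential decision making is statistically tractable.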
Part II addresses the modern paradigm of adapting pre-trained foundation models for sequential decision making. I introduce the "coverage profile" as a key statistical quantity governing post-training success, leading to new interventions that connect pre-training objectives, post-training signals, and downstream performance.