Modern machine learning methods: Large step-size optimization and controlling model complexity implicitly and explicitly

Date: Tuesday, February 24, 2026, 4:00 PM
Location: CoDa E160
Speaker: Peter Bartlett, UC Berkeley

The impressive performance of modern machine learning methods seems to arise through mechanisms different from those studied in classical statistical learning theory, mathematical statistics, and optimization theory. Simple gradient methods find excellent solutions to non-convex optimization problems and, without any explicit effort to control model complexity, exhibit excellent prediction performance in practice. This talk will describe recent progress in statistical learning theory and optimization theory that demonstrates the optimization benefits of step sizes too large for gradient methods to be viewed as an accurate time discretization of a gradient flow differential equation, characterizes the solutions favored by gradient optimization methods, and compares the finite-sample performance of early-stopped gradient methods with that of ridge regularization in the classical setting of linear regression.
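
As a rough illustration of the final comparison mentioned above, the following sketch (not from the talk; the synthetic data, step size, iteration budget, and regularization grid are all illustrative assumptions) runs early-stopped gradient descent and ridge regression on the same linear regression problem and reports the best test error each achieves.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic linear regression: y = X w* + noise (all sizes are illustrative).
n, d = 100, 50
X = rng.standard_normal((n, d))
w_star = rng.standard_normal(d)
y = X @ w_star + 0.5 * rng.standard_normal(n)
X_test = rng.standard_normal((1000, d))
y_test = X_test @ w_star

def test_err(w):
    return np.mean((X_test @ w - y_test) ** 2)

# Gradient descent on f(w) = 0.5 * ||Xw - y||^2, recording the test error
# after every iteration; "early stopping" picks the best iterate.
eta = 1.0 / np.linalg.norm(X, 2) ** 2  # 1/L for this quadratic objective
w = np.zeros(d)
gd_errs = []
for t in range(500):
    w = w - eta * X.T @ (X @ w - y)
    gd_errs.append(test_err(w))

# Ridge regression over a grid of regularization strengths:
# w_ridge = (X^T X + lam I)^{-1} X^T y.
ridge_errs = []
for lam in np.logspace(-3, 2, 30):
    w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
    ridge_errs.append(test_err(w_ridge))

print(f"best early-stopped GD test error: {min(gd_errs):.4f} "
      f"(iteration {int(np.argmin(gd_errs)) + 1})")
print(f"best ridge test error:            {min(ridge_errs):.4f}")
```

In this toy setting the two best test errors are typically close, reflecting the classical heuristic that stopping gradient descent on least squares after t steps behaves roughly like ridge regularization with strength on the order of 1/(eta * t); the precise finite-sample comparison is the subject of the talk, not of this sketch.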