Logistic regression explicitly maximizes margins: Should we stop training early?

Tue July 13th 2021, 4:30pm
Matus Telgarsky, University of Illinois Urbana-Champaign

This talk will present two perspectives on the behavior of gradient descent with logistic loss: on the one hand, it seems we should run as long as possible and achieve good margins; on the other, stopping early seems necessary for noisy problems. The first part, focused on the linear case, develops a new perspective of explicit bias (rather than implicit bias), yielding new analyses and algorithms with margin maximization rates as fast as $1/t^2$ (whereas prior work achieved at best $1/\sqrt{t}$). The second part, focused on shallow ReLU networks, argues that the margin bias might fail to be ideal, and that stopping early can achieve consistency and calibration for arbitrary classification problems. Moreover, this early phase is still adaptive to data simplicity, but with a different bias than the margin bias.
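As a minimal illustration of the phenomenon in the first part (not the talk's algorithm), the sketch below runs plain gradient descent on the logistic loss over a linearly separable toy dataset and records the normalized margin $\min_i y_i \langle w, x_i \rangle / \|w\|$ at a few checkpoints; continuing to train drives this margin upward. The dataset, step size, and checkpoint schedule are all arbitrary choices for the demo.

```python
import numpy as np

# Illustrative sketch (hypothetical setup, not the talk's method):
# gradient descent on the average logistic loss over separable data.
rng = np.random.default_rng(0)

# Separable toy data: labels given by a ground-truth direction w_star.
w_star = np.array([1.0, -1.0]) / np.sqrt(2)
X = rng.normal(size=(200, 2))
y = np.sign(X @ w_star)
# Discard points too close to the boundary so a positive margin exists.
keep = np.abs(X @ w_star) > 0.1
X, y = X[keep], y[keep]

def logistic_loss_grad(w, X, y):
    """Gradient of the average logistic loss log(1 + exp(-y <w, x>))."""
    z = y * (X @ w)
    # sigmoid(-z) written via tanh for numerical stability at large z.
    s = -y * 0.5 * (1.0 - np.tanh(z / 2.0))
    return (s[:, None] * X).mean(axis=0)

def normalized_margin(w, X, y):
    return np.min(y * (X @ w)) / np.linalg.norm(w)

w = np.zeros(2)
lr = 1.0
margins = []
for t in range(2000):
    w -= lr * logistic_loss_grad(w, X, y)
    if (t + 1) % 500 == 0:
        margins.append(normalized_margin(w, X, y))

# The recorded margins grow as training continues: the "run as long
# as possible" regime the abstract describes for the linear case.
print(margins)
```

Vanilla gradient descent converges to the max-margin direction only at a slow (logarithmic) rate; the talk's point is that treating margin maximization as an explicit objective admits much faster rates, up to $1/t^2$.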

This is joint work with Ziwei Ji, Justin D. Li, Nati Srebro.

Zoom Recording [SUNet/SSO authentication required]