Hidden convexity of deep neural networks: Exact and transparent Lasso formulations via geometric algebra

Tuesday, April 23rd, 2024, 4:30pm
Sloan 380Y
Mert Pilanci, Stanford EE

In this talk, we present an analysis of deep neural networks through convex optimization and geometric (Clifford) algebra. We begin by introducing exact convex optimization formulations for ReLU neural networks, which demonstrate that deep networks can be trained to global optimality via convex programs. Our results further establish an equivalent characterization of neural networks as high-dimensional convex Lasso models. These models employ a discrete set of wedge product features and apply sparsity-inducing convex regularization to fit the data. This framework provides an intuitive geometric interpretation in which the optimal neurons represent signed volumes of parallelotopes formed by data vectors. Specifically, we show that the Lasso dictionary is constructed from a discrete set of wedge products of the input samples, with deeper architectures introducing geometric reflections of these features. The analysis also reveals that standard convolutional neural networks can be globally optimized in fully polynomial time. Numerical simulations validate our claims, showing that the proposed convex approach is faster and more reliable than standard local search heuristics such as stochastic gradient descent and its variants. We also present a layerwise convex optimization scheme whose performance is comparable to that of non-convex end-to-end training, and we discuss extensions to batch normalization, generative adversarial networks, transformers, and diffusion models.
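
As a concrete point of reference, the flavor of reformulation at stake can be sketched, in our own notation, for a two-layer ReLU network with weight decay (following the speaker's earlier convex-duality results; the wedge-product dictionary in the talk generalizes this picture). With data matrix X, labels y, and regularization parameter \beta, the non-convex training problem

\[
\min_{\{w_j,\,\alpha_j\}_{j=1}^{m}} \;\; \tfrac{1}{2}\Big\| \sum_{j=1}^{m} (X w_j)_+ \,\alpha_j \;-\; y \Big\|_2^2 \;+\; \tfrac{\beta}{2}\sum_{j=1}^{m}\big(\|w_j\|_2^2+\alpha_j^2\big)
\]

is, for sufficiently wide networks, equivalent to a finite-dimensional convex program over the distinct ReLU activation patterns D_i = \mathrm{diag}\!\big(\mathbf{1}[X h_i \ge 0]\big), i = 1, \dots, P:

\[
\min_{\{v_i,\,u_i\}_{i=1}^{P}} \;\; \tfrac{1}{2}\Big\| \sum_{i=1}^{P} D_i X (v_i - u_i) \;-\; y \Big\|_2^2 \;+\; \beta \sum_{i=1}^{P}\big(\|v_i\|_2+\|u_i\|_2\big)
\quad \text{s.t.} \quad (2D_i - I) X v_i \ge 0,\;\; (2D_i - I) X u_i \ge 0.
\]

The group-Lasso penalty is what selects a sparse set of optimal neurons; in the Lasso characterization described above, the corresponding dictionary atoms are instead built from signed volumes (wedge products) of the data vectors.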