Transparent and consistent deep representation learning: From black-box to white-box, from open-loop to closed-loop

Thursday, March 7, 2024, 4:30pm
Pigott 113
Yi Ma, HKU and UC Berkeley

In this talk, we provide a systematic explanation of the practice of deep neural networks over the past decade from the perspective of compressive data encoding and decoding. We argue that the most fundamental objective of learning (or intelligence) is to learn a compact and structured representation of the distribution of the sensed data. The goodness of the final learned representation can be measured by a principled quantity known as the information gain, computable from the (lossy) coding rates of the learned features. We contend that unrolled iterative optimization of this objective provides a unifying white-box explanation of almost all past and current deep neural networks widely adopted in the practice of artificial intelligence, including ResNets and Transformers. We will show, with theoretical and empirical evidence, that mathematically interpretable, practically performant, and semantically meaningful deep networks are now within our reach. Furthermore, our study shows that for the learned representation to be correct and consistent, one needs to close the loop between the encoding and decoding networks, instead of the current practice of training them end-to-end as separate open-loop networks. Perhaps most importantly, this new framework reveals a much broader and brighter future for developing next-generation autonomous learning systems that could truly emulate the computational mechanisms of natural intelligence, at least its memory.
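The coding-rate quantity mentioned above can be illustrated concretely. Below is a minimal NumPy sketch of the rate reduction ΔR = R(Z) − Σ_j (n_j/n) R(Z_j) from the ReduNet / MCR² line of work (paper 2 in the list that follows), where R(Z, ε) = ½ logdet(I + d/(nε²) ZZᵀ) is the lossy coding rate of features Z ∈ ℝ^{d×n}. This is a sketch under the commonly stated form of the objective; the precision parameter `eps` and the exact normalization are assumptions, not quotations from the talk.

```python
import numpy as np

def coding_rate(Z, eps=0.5):
    """Lossy coding rate R(Z, eps) = 1/2 * logdet(I + d/(n*eps^2) * Z Z^T).

    Z is a d x n matrix of n feature vectors in R^d; eps is the allowed
    distortion (an assumed default, not from the talk)."""
    d, n = Z.shape
    _, logdet = np.linalg.slogdet(np.eye(d) + (d / (n * eps**2)) * Z @ Z.T)
    return 0.5 * logdet

def rate_reduction(Z, labels, eps=0.5):
    """Delta R: rate of the whole feature set minus the weighted rates of
    its class-conditional parts. Larger values mean classes occupy more
    distinct (e.g. orthogonal) subspaces."""
    n = Z.shape[1]
    r_whole = coding_rate(Z, eps)
    r_parts = sum(
        (np.sum(labels == j) / n) * coding_rate(Z[:, labels == j], eps)
        for j in np.unique(labels)
    )
    return r_whole - r_parts
```

As a sanity check, features from two orthogonal subspaces yield a strictly positive ΔR, while features that all lie along a single direction yield ΔR ≈ 0 — compression gains nothing from splitting identical classes.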

Related papers are available online:

  1. White-box transformers via sparse rate reduction
  2. ReduNet: A white-box deep network from the principle of maximizing rate reduction
  3. CTRL: Closed-loop transcription to an LDR via minimaxing rate reduction