Beyond the black box: Characterizing and improving how neural networks learn
The predominant paradigm in deep learning practice treats neural networks as "black boxes". This incurs economic and environmental costs, since brute-force scaling remains the primary driver of performance, and raises safety concerns, since robust reasoning and alignment remain challenging. My research opens up the neural network black box through mathematical and statistical analyses of how networks learn, yielding engineering insights that improve the efficiency and transparency of these models. In this talk I will present characterizations of (1) how large language models can learn to reason with abstract symbols and (2) how hierarchical structure in data guides deep learning, and will conclude with (3) new tools for distilling trained neural networks into lightweight and transparent models.