Architecture and loss function design for optimized accuracy and fairness
Hyperparameter optimization is a critical component of designing high-quality machine learning models. In this talk, we formulate the design of neural architectures and loss functions as a differentiable hyperparameter optimization task. This leads to a bilevel problem in which the model weights are optimized over the training data and the hyperparameters over the validation data. We first provide statistical and algorithmic guarantees for the neural architecture search (NAS) problem and quantify the benefits of using a train-validation split. This is accomplished by relating NAS to the design of optimized kernel functions and by establishing connections to low-rank matrix estimation problems. We then discuss our recent findings on principled loss function design for imbalanced classification tasks. We introduce a family of parametric cross-entropy losses that allow for multiplicative and additive logit adjustments, and we discuss how these adjustments act as a catalyst to promote fairness. Finally, we propose designing loss functions by optimizing fairness-seeking objectives over validation data, which leads to state-of-the-art performance for imbalanced classification as well as practical insights on design principles.
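The bilevel structure described above can be written in a standard form (the notation here is our own, not taken from the talk): the outer level tunes hyperparameters $\lambda$ on validation data, while the inner level fits the weights $w$ on training data for a fixed $\lambda$.

```latex
\min_{\lambda} \; \mathcal{L}_{\mathrm{val}}\bigl(w^{*}(\lambda), \lambda\bigr)
\quad \text{subject to} \quad
w^{*}(\lambda) \in \arg\min_{w} \; \mathcal{L}_{\mathrm{train}}(w, \lambda).
```

Here $\lambda$ encodes the design choices (architecture parameters or loss-function parameters), and differentiability of the map $\lambda \mapsto \mathcal{L}_{\mathrm{val}}(w^{*}(\lambda), \lambda)$ is what makes gradient-based hyperparameter optimization possible.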
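As a rough illustration of the parametric cross-entropy family, the sketch below applies per-class multiplicative weights `w` and additive offsets `b` to the logits before the softmax; setting `w = 1` and `b = 0` recovers the standard cross-entropy. The function name and the exact parameterization are illustrative assumptions, not the talk's definitive formulation.

```python
import math

def adjusted_cross_entropy(logits, labels, w, b):
    """Cross-entropy with per-class logit adjustments (illustrative sketch).

    logits: list of per-example logit lists, each of length k
    labels: list of integer class labels in [0, k)
    w, b:   length-k multiplicative and additive adjustments;
            w = [1]*k, b = [0]*k gives standard cross-entropy.
    """
    total = 0.0
    for z_row, y in zip(logits, labels):
        # Apply the multiplicative and additive adjustments per class.
        z = [wk * zk + bk for wk, zk, bk in zip(w, z_row, b)]
        # Stable log-sum-exp for the softmax normalizer.
        m = max(z)
        log_norm = m + math.log(sum(math.exp(v - m) for v in z))
        total += -(z[y] - log_norm)
    return total / len(labels)
```

For example, choosing the additive term from class priors, e.g. `b[k] = tau * log(prior[k])`, down-weights majority classes, which is one concrete way such adjustments can promote balanced (fairer) error rates across classes; the talk's proposal is to tune parameters of this kind against a fairness-seeking objective on validation data.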