Machine learning models often include group attributes such as sex, age, and HIV status for the sake of personalization, i.e., to assign more accurate predictions to heterogeneous subpopulations. In this talk, I will describe how such practices can inadvertently lead to "worsenalization" by assigning unnecessarily inaccurate predictions to minority groups. I will discuss how these effects violate our basic expectations of personalization in applications like clinical decision support, and describe how they arise from standard practices in algorithm development. I will end by highlighting work on how to address these issues in practice: first, by setting "personalization budgets" to test for worsenalization; second, by developing "participatory prediction systems" in which individuals can consent to personalization at prediction time.
This is based on joint work with Flavio Calmon, Katherine Heller, Marzyeh Ghassemi, Hailey James, Carol Long, Lucas Monteiro Paes, and Vinith Suriyakumar.