Searching for local associations while controlling the false discovery rate

Date
Tue September 30th 2025, 4:00pm
Location
CoDa E160
Speaker
Chiara Sabatti, Stanford Statistics and BDS

In this talk I will describe local conditional hypotheses that express how the relation between explanatory variables and outcomes changes across different contexts, described by covariates. I will then introduce efficient testing strategies for these hypotheses.

The motivation for this work comes from genetics and genomics. For example, as the evidence obtained from genome-wide association studies accumulates, it has become apparent that some genetic variants carry information on phenotypes in some populations and not in others. Several explanations may contribute to this phenomenon. One possibility is that some genetic variants are relevant for the trait of interest only under specific environmental conditions, exposure to which varies across human populations.

To identify the combinations of explanatory variables and covariates that influence an outcome, we build upon the knockoff framework for false discovery rate (FDR) control and powerful pre-screening strategies. Specifically, the method we propose can leverage any model to identify data-driven hypotheses pertaining to different contexts. It then rigorously tests these hypotheses without succumbing to selection bias. The approach is efficient and does not require sample splitting. We demonstrate the effectiveness of our method through numerical experiments and by studying the genetic architecture of waist/hip ratio across sexes in the UK Biobank.

This work is in collaboration with Matteo Sesia and Paula Gablenz.