Brian Trippe

Assistant Professor of Statistics
Joint Appointment or Affiliation: Data Science

I develop statistical machine learning methods to solve challenges that arise in biotechnology and medicine. For example, a method I have developed can generate new molecular structures that could be used in a vaccine in response to an emerging pathogen. This task is analogous to generating new images or answers to questions in response to a written prompt, tasks at which machine learning can excel. But machine learning methods for text and images fall short in biotechnology applications. For one, biotechnology applications involve distinct types of data, such as three-dimensional molecular structures and DNA sequences, for which internet-scale datasets are not available. Additionally, biotechnology applications demand greater reliability; when designing molecular structures, for example, hard physical constraints must be satisfied. My research develops and applies machine learning methods to address these challenges.

The methods I develop are probabilistic, which makes them data-efficient: they incorporate prior beliefs and share information across related datasets. But taking a probabilistic approach introduces methodological choices across modeling and inference, so I develop methods that come with theoretical guarantees to guide their use and assure their reliability in practice. Through my work on computational protein design over the past two years, my collaborators and I have synthesized hundreds of new molecules that we have subsequently validated in laboratory experiments. This work builds on expertise I acquired while developing probabilistic methodology for high-dimensional problems that arise in genomics.

My future work will develop statistical machine learning methods to enable biotechnology solutions to challenges ranging from disease eradication to climate-change-robust agriculture. While such solutions may include engineered proteins as building blocks, new methods are necessary to reliably combine these building blocks. I plan to pursue two directions: first, extending my confidence-value method to provide more broadly applicable assessments of the accuracy of inferences from probabilistic models; second, improving protein engineering by bridging statistical, evolutionary, and thermodynamic models. Both support my long-term goal of building statistical foundations for data-driven genetic engineering.

Related News

A new appointee will join our department on July 1 (jointly supported by Stanford Data Science). Brian comes to us from a postdoctoral position with the Columbia University Department of Statistics, as well as a visiting researcher post with the University of Washington's Institute for Protein Design in Seattle.