Students will engage in interdisciplinary research using statistical methods for data mining, causal inference, and machine learning within the following disciplines: biological sciences, computational statistics, or statistical learning/optimization.
Application Deadline: February 21, 2020 (Friday)
Project: We will do some reading and research about multivariate data, looking at visualization and estimation problems. We will also write a research paper about a related problem, to which the student will contribute, e.g., by writing some code.
Project: We are interested in developing a framework for measuring how a behavioral intervention changes how people think. Probably the most natural way to do this is to ask people to respond to a situation, having them talk about it and think aloud about what they would do. Our group is developing new methods for taking in free text and doing rigorous causal inference (that is, figuring out how much something changed due to an intervention). A good candidate will know something about methods such as regression, but a great candidate will want to think very carefully about how behavior changes and how to measure that change. We work on educational interventions (e.g., reducing imposter syndrome) and violence prevention programs (both here at Stanford and in the slums of Nairobi, Kenya).
Project: This project will analyze ranked choice voting through the lens of recent efforts to model contextual decision making. How do the candidates on a ballot potentially create a “context” within which citizens make a sequence of ranked choices? How do different ways to model context differ in their ability to capture behavior in empirical data? A good candidate will have taken coursework in machine learning and optimization, be proficient in data wrangling, and have an interest in behavioral models of discrete choice and/or ranking.
Project: We will explore different model-based optimization algorithms for summarizing posterior distributions. We will do some reading and research on estimation of distribution algorithms and apply them to problems in cancer and antibody repertoire evolution. We will write a research paper with members of the lab.
Lead Researcher: Shinnosuke Nakayama
Project: Illegal, unreported, and unregulated (IUU) fishing accounts for 10-30% of seafood on the market, jeopardizing the livelihoods of 3 billion people who rely on fisheries while aggravating modern slavery problems. We have begun to understand fishing activity through the Automatic Identification System (AIS), which reports the locations of fishing vessels at high frequency. However, many fishing vessels are undetectable: they can “go dark” by turning off the AIS device, and small fishing boats are not required to carry the device. Toward painting a comprehensive picture of the IUU landscape, we aim to characterize the activities of fishing vessels that are off the radar using satellite imagery. The project involves analysis of port usage by small vessels and characterization of dark-vessel behavior through image analysis in combination with AIS data. See project details
Project: This project involves statistical analysis of millions of single-cell RNA sequencing profiles of human and mouse lemur cells. Goals include uncovering evolutionary divergence and conservation of tissue function and regulation across primates, and identifying disease-relevant pathways.
First Project: Parameter estimation in nonlinear fishery models
We are looking for a highly motivated student to work on parameter estimation for nonlinear fishery models. Specifically, we have gathered landing catch data from the abalone fishery in Isla Natividad for 6 fishing zones over 17 years and developed a size-structured integral projection model (IPM), programmed in R. We now need to estimate unknown parameters such as the strength of density dependence and catchability. The student will develop and run scripts implementing a number of estimators using (i) classic maximum likelihood, (ii) a Bayesian approach (possibly with Stan), and (iii) particle filtering (pomp package).
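As a rough illustration of approach (i), the sketch below fits a single catchability parameter by maximum likelihood. It is not the project's IPM: it assumes a deliberately simplified toy model in which each observed catch is Poisson with mean q times a known "exposure" (effort times abundance). The numbers, the Poisson assumption, and the names `neg_log_lik` and `golden_section_min` are all hypothetical and chosen only for illustration.

```python
import math

def neg_log_lik(q, catches, exposure):
    """Poisson negative log-likelihood in q, dropping terms that do not depend on q."""
    return sum(q * x - c * math.log(q * x) for c, x in zip(catches, exposure))

def golden_section_min(f, lo, hi, tol=1e-10):
    """Minimise a unimodal function f on the interval [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) < f(d):
            b = d  # the minimum lies in [a, d]
        else:
            a = c  # the minimum lies in [c, b]
    return (a + b) / 2

# Made-up data for illustration only (not the Isla Natividad data).
catches = [12, 30, 21, 45]
exposure = [100.0, 250.0, 180.0, 400.0]  # hypothetical effort x abundance

# Numerical maximum-likelihood estimate of catchability q.
q_hat = golden_section_min(lambda q: neg_log_lik(q, catches, exposure), 1e-6, 1.0)

# For this toy model the estimate is also available in closed form.
q_closed = sum(catches) / sum(exposure)
```

In this toy case the numerical and closed-form estimates should agree closely. A realistic model such as the IPM would have several parameters and no closed form, so a general-purpose optimizer would replace the one-dimensional search.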
Lead Researcher: Richard Grewelle, 4th year Ph.D. student in Biology
Second Project: A theoretical approach to estimating the number of genes in a polygenic trait
Many genetic traits are regulated by multiple genes. There is a continuum from single-gene traits to quantitative traits, where many genes or non-gene elements each contribute infinitesimally to the resulting phenotype (trait). It is of great interest to many biologists to determine, through experiment, the number of separate genetic elements contributing to a phenotype in an organism or population. Diseases, too, are often regulated by multiple genes. One statistical approach for determining this number was developed decades ago and is still in use, but it requires great experimental effort. A PhD candidate, Richard Grewelle, has developed an alternative approach that requires less experimental effort. The statistical framework needs further development before it can be broadly applied. A prospective summer intern should have an interest in developing mathematical or statistical approaches and an upper-level undergraduate to graduate-level understanding of statistics or mathematics. Some computation is required, but most of the work will involve theory. Programming proficiency is a bonus but not necessary.
Project: Our research team has multiple projects involving population-level and single-cell datasets (transcriptomic, epigenetic, and proteomic) on stem and progenitor cell function in health and musculoskeletal diseases. The aims of the potential summer projects are to (a) develop statistical approaches to discern specific population subsets from the bulk population data in order to analyze how different cell populations change during disease pathogenesis, (b) optimize tools to stratify patients, and (c) correlate data from multiple tissues and develop predictive models. Please feel free to reach out to discuss in detail.
Project: Do sharks have friends? Using Social Network Graphs to Identify Patterns in Shark Aggregations
Recent advances in animal tagging and marine fish observation have resulted in new efforts to study the social behavior of sharks. New observations have shown that many species of sharks can form large aggregations at different times during their life history. For some species these aggregations are temporary, but for others they are more persistent. One population for which aggregations appear to persist is the Sand Tiger sharks (Carcharias taurus) along the Eastern Coast of the US. We have two datasets containing information about potential aggregations and interactions between individual sharks, which can be explored more thoroughly to identify patterns in the networks of interactions. For a student with an interest in animal behavior and/or network graphs and analysis, there is much that could be done using simulations to identify non-random associations between sharks and patterns in aggregations during their annual migration. The appropriate student would ideally be interested in social network analysis and computer simulations (although this is not necessary), and have some coding experience or a willingness to learn.
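One common way to use simulation to look for non-random associations is a permutation test, sketched below under stated assumptions. The data layout (a list of `(shark_id, event_id)` detection records), the test statistic, and all function names are hypothetical and are not taken from the project's datasets. The null model simply shuffles shark identities across records. A shuffle can place the same shark in an event twice; the set-based membership collapses such repeats, which is a simplification.

```python
import random
from collections import defaultdict
from itertools import combinations

def pair_counts(records):
    """Count how often each pair of sharks is detected in the same event."""
    members = defaultdict(set)
    for shark, event in records:
        members[event].add(shark)
    counts = defaultdict(int)
    for sharks in members.values():
        for a, b in combinations(sorted(sharks), 2):
            counts[(a, b)] += 1
    return counts

def association_unevenness(records):
    """Sum of squared pair counts: large when a few pairs co-occur repeatedly."""
    return sum(c * c for c in pair_counts(records).values())

def permutation_p_value(records, n_perm=1000, seed=0):
    """Share of shuffled datasets at least as uneven as the observed one."""
    rng = random.Random(seed)
    observed = association_unevenness(records)
    sharks = [s for s, _ in records]
    events = [e for _, e in records]
    hits = 0
    for _ in range(n_perm):
        shuffled = sharks[:]
        rng.shuffle(shuffled)  # break any real link between sharks and events
        if association_unevenness(list(zip(shuffled, events))) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)
```

A small p-value from `permutation_p_value(records)` would suggest that some pairs co-occur more often than chance under this simple null model. A real analysis would likely need a null model that also respects when and where each shark could have been detected.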
This program runs for 8 weeks starting in June 2020.
This research opportunity is for Stanford University undergraduate students only. Learn more about student eligibility.
Students accepted into the program receive a lump sum of $6,500 and are responsible for finding their own housing during the summer. Preference is given to Mathematical and Computational Science majors; however, any Stanford undergraduate who meets our prerequisites may apply.
Contact firstname.lastname@example.org if you have any questions.