Statistics Data Science Curriculum
This focused MS track is developed within the structure of the current MS in Statistics and reflects new trends in data science and analytics. Upon successful completion of the Data Science MS degree, students will be prepared to continue on to a related doctoral program or to work as data science professionals in industry. Completing the MS degree is not a direct path for admission to the PhD program in Statistics.
This program is not an online degree program.
Coursework
The Data Science track develops strong mathematical, statistical, computational, and programming skills, and provides a fundamental data science education through general and focused elective requirements drawn from courses in data science and other areas of interest.
As defined in the general Graduate Student Requirements, students must maintain a grade point average (GPA) of 3.0 or better, and classes must be taken at the 200 level or higher. Students satisfying the course requirements of the Data Science track do not satisfy the other course requirements for the M.S. in Statistics.
The total number of units in the degree is 45, 36 of which must be taken for a letter grade.
Students must submit an approved Master's Program Proposal, signed by the master's advisor, to the student services officer by the end of the first quarter of the master's degree program. A revised program proposal must be filed whenever a student's previously approved program proposal changes.
There is no thesis requirement.
Data Science Proposal Forms
Students must demonstrate breadth of knowledge in the field by completing courses in these core areas.
- Mathematical & Statistical Foundations (15 units)
- Experimentation (3 units)
- Scientific Computing (includes software development & large-scale computing) (6 units minimum)
- Machine Learning Methods & Applications (6 units minimum)
- Practical Component (3 units)
- Elective course in the data sciences (remainder of 45 units)
Mathematical and Statistical Foundations (15 units)
Students must demonstrate foundational knowledge in the field by completing the following courses. Courses in this area must be taken for letter grades.
Introduction to Statistical Inference (STATS 200)
OR
Theory of Statistics I (STATS 300A)
Finite sample optimality of statistical procedures; Decision theory: loss, risk, admissibility; Principles of data reduction: sufficiency, ancillarity, completeness; Statistical models: exponential families, group families, nonparametric families; Point estimation: optimal unbiased and equivariant estimation, Bayes estimation, minimax estimation; Hypothesis testing and confidence intervals: uniformly most powerful tests, uniformly most accurate confidence intervals, optimal unbiased and invariant tests.
Prerequisites: Real analysis, introductory probability (at the level of STATS 116), and introductory statistics.
Introduction to Regression Models and Analysis of Variance (STATS 203)
Modeling and interpretation of observational and experimental data using linear and nonlinear regression methods. Model building and selection methods. Multivariable analysis. Fixed and random effects models. Experimental design. Prerequisites: A post-calculus introductory probability course, e.g. STATS 116, basic computer programming knowledge, some familiarity with matrix algebra, and a pre- or co-requisite post-calculus mathematical statistics course, e.g. STATS 200.
Or STATS 203V (Su)
OR
Applied Statistics I (STATS 305A)
Modern Applied Statistics: Learning (STATS 315A)
OR
Machine Learning (STATS/CS 229)
Numerical Linear Algebra (CME 302)
Stochastic Methods in Engineering (CME 308)
Experimentation Elective (3 units)
Experimental methods and causal considerations are fundamental to data science. The course chosen from this area must be taken for a letter grade.
Introduction to Causal Inference (STATS 209)
Design of Experiments (STATS 263/363)
Software Development and Scientific Computing (6 units minimum)
2022-2023 - CME 212 will not be offered. See instructions below.
To ensure that students have a strong foundation in programming, they must take 3 units of software development (CME 212) and a minimum of 3 units of scientific computing.
- Students who do not start the program with a strong computational and/or programming background will take an extra 3 units to prepare themselves by taking CME 211* Programming in C/C++ for Scientists and Engineers, or an equivalent course with the advisor's approval.
- A summer placement exam for CME 212 will be sent to matriculating students in July. Students who pass this placement exam are not required to take CME 211 and may replace the class with an elective.
Courses in this area must be taken for letter grades.
Software Development: (3 units)
In lieu of CME 212 (not offered 2022-23), students may take an additional 3 units from the Scientific Computing list.
Advanced Software Development for Scientists and Engineers (CME 212)
Scientific Computing: (3–6 units)
Introduction to Parallel Computing Using MPI, OpenMP, and CUDA (CME 213)
Discrete Mathematics and Algorithms (CME 305)
Optimization (CME 307)
Distributed Algorithms and Optimization (CME 323)
Convex Optimization I (CME 364A)
Mining Massive Data Sets (CS 246)
Principles of Data-Intensive Systems (CS 245)
Machine Learning Methods & Applications (6–9 units minimum)
Courses in this area must be taken for letter grades. Courses outside this list are subject to approval.
Modern Applied Statistics: Data Mining (STATS 315B)
Artificial Intelligence: Principles and Techniques (CS 221)
Natural Language Processing with Deep Learning (CS 224N)
Deep Learning (CS 230)
Deep Learning for Computer Vision (CS 231N)
Reinforcement Learning (CS 234)
Deep Generative Models (CS 236)
Practical Component or Capstone Project
Students are required to take a minimum of 3 units of practical component, which may include any combination of the following:
- A capstone project, supervised by a faculty member and approved by the student's advisor. The capstone project should be computational in nature. Students should submit a one-page proposal, supported by the faculty member, to their Data Science advisor for approval at least one quarter prior to the start of the project.
- Master's Research: STATS 299 Independent Study (repeatable). Independent study or directed reading, arranged in consultation with your advisor and with permission of Statistics faculty.
- BIODS 232: Consulting Workshop on Biomedical Data Science. Units: 1–2
- Gain practical industry experience and exposure to the organization, its industry, and the space in which it operates; build relationships in the organization and industry; and gain an understanding of related career paths.
- ALP 301 Data-Driven Impact
- AI for Health Care bootcamp
- Two-quarter commitment (research credit)
- Students with a background in artificial intelligence, software engineering or medicine are encouraged to apply. The bootcamp is suited for students who have taken machine learning and software engineering courses.
- AI for Climate Change bootcamp
- Two-quarter commitment (research credit)
- Students with a background in artificial intelligence, software engineering or medicine are encouraged to apply. This role is suited for students who have taken machine learning and software engineering courses.
- Analytics Accelerator (CME 217). Units: 3 | Repeatable 2 times (up to 6 units total). Replaced by Xplore Projects with ICME (CME 291).
- A multidisciplinary graduate-level course offering real-world, project-based research. Students work in dynamic teams, with the support of course faculty and outside analytics experts, to scope and research projects, apply a computational and data analytics lens, and follow design thinking methodology.
- Enrollment by application only.
- Data Driven Impact (ALP 301) [when offered]
- Xplore Projects (CME 291) Units: 3 | Repeatable 2 times (up to 6 units total)
- Autumn projects include:
- IDM Gates Foundation Disease Eradication
- Lawrence Berkeley
- Mathworks Speech/Human Motion
- Multimodal Emotion Classification
- Sandia Global Climate
- Stanford ML in Genomics
- Stanford COVID Lung Imaging
- The Ocean Cleanup Beach Analysis
- World Bank Health Systems
- Other courses that have a strong hands-on and practical component, such as STATS 390 Consulting Workshop (repeatable).
- This class requires mastery of Statistics at the (graduate) level necessary to provide consultation to fellow members of the university.
- Students attend weekly lectures on Friday to discuss consulting cases and various statistical techniques that arise frequently in consulting.
Data Science Electives (6–9 units)
In consultation with the student's program advisor, the student selects courses within the realm of data science to fulfill the remaining coursework required for the degree.
Minimum 6 units of elective coursework.
The following courses may also be taken for elective credit: