# Statistics Data Science Curriculum

This focused MS track is developed within the structure of the current MS in Statistics and new trends in data science and analytics. Upon successful completion of the Data Science MS degree, students will be prepared to continue on to a related doctoral program or to work as data science professionals in industry. Completing the MS degree is not a direct path for admission to the PhD program in Statistics.

This program is not an online degree program.

### Coursework

The Data Science track develops strong mathematical, statistical, computational, and programming skills, in addition to providing fundamental data science education through general and focused elective requirements drawn from courses in data science and other areas of interest.

As defined in the general Graduate Student Requirements, students have to maintain a grade point average (GPA) of 3.0 or better and classes must be taken at the 200 level or higher. Students satisfying the course requirements of the Data Science track do not satisfy the other course requirements for the M.S. in Statistics.

The total number of units in the degree is 45, 36 of which must be taken for a letter grade.

Students must submit an approved Master's Program Proposal, signed by the master's advisor, to the student services officer by the end of the first quarter of the master's degree program. A revised program proposal must be filed whenever there are changes to a student's previously approved program proposal.

There is no thesis requirement.

### Data Science Proposal Forms

Students must demonstrate breadth of knowledge in the field by completing courses in these core areas.

- Mathematical & Statistical Foundations (15 units)
- Experimentation (3 units)
- Scientific Computing (includes software development & large-scale computing) (6 units minimum)
- Machine Learning Methods & Applications (6 units minimum)
- Practical Component (3 units)
- Elective course in the data sciences (remainder of 45 units)
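
The unit arithmetic behind this list can be sketched in a few lines of Python. This is only an illustrative check, not an official degree audit: the dictionary values are copied from the list above, while the names `CORE_AREA_MINIMUMS`, `TOTAL_DEGREE_UNITS`, and `elective_units_remaining` are hypothetical and introduced here for the example.

```python
# Units per core area, copied from the list above (hypothetical names).
CORE_AREA_MINIMUMS = {
    "Mathematical & Statistical Foundations": 15,
    "Experimentation": 3,
    "Scientific Computing": 6,
    "Machine Learning Methods & Applications": 6,
    "Practical Component": 3,
}
TOTAL_DEGREE_UNITS = 45  # total units in the degree


def elective_units_remaining(units_by_area):
    """Return the units left for electives after the core areas.

    Raises ValueError if any core area falls below its listed units.
    """
    for area, minimum in CORE_AREA_MINIMUMS.items():
        taken = units_by_area.get(area, 0)
        if taken < minimum:
            raise ValueError(
                f"{area}: {taken} units is below the listed {minimum}"
            )
    core_total = sum(units_by_area.get(area, 0) for area in CORE_AREA_MINIMUMS)
    # Electives fill whatever remains of the 45 units (never negative).
    return max(TOTAL_DEGREE_UNITS - core_total, 0)


# Taking exactly the listed units in every core area:
print(elective_units_remaining(CORE_AREA_MINIMUMS))  # 45 - 33 = 12
```

Taking more than the listed units in any core area reduces the elective remainder by the same amount; the actual plan is the one approved on the program proposal.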

### Mathematical and Statistical Foundations (15 units)

Students must demonstrate foundational knowledge in the field by completing the following courses. Courses in this area must be taken for letter grades.

## Introduction to Statistical Inference (STATS 200)

**Prerequisite: STATS 116.**

OR

## Theory of Statistics I (STATS 300A)

Finite sample optimality of statistical procedures; Decision theory: loss, risk, admissibility; Principles of data reduction: sufficiency, ancillarity, completeness; Statistical models: exponential families, group families, nonparametric families; Point estimation: optimal unbiased and equivariant estimation, Bayes estimation, minimax estimation; Hypothesis testing and confidence intervals: uniformly most powerful tests, uniformly most accurate confidence intervals, optimal unbiased and invariant tests.

**Prerequisites: Real analysis, introductory probability (at the level of STATS 116), and introductory statistics.**

## Introduction to Regression Models and Analysis of Variance (STATS 203)

Modeling and interpretation of observational and experimental data using linear and nonlinear regression methods. Model building and selection methods. Multivariable analysis. Fixed and random effects models. Experimental design.

**Prerequisites: A post-calculus introductory probability course (e.g., STATS 116), basic computer programming knowledge, some familiarity with matrix algebra, and a pre- or co-requisite post-calculus mathematical statistics course (e.g., STATS 200).**

Or STATS 203V (Su)

OR

## Applied Statistics I (STATS 305A)

**Terms: Aut | Units: 3**

## Modern Applied Statistics: Learning (STATS 315A)

**Prerequisites: STATS 305A, 305B, 305C or consent of instructor.**

**Terms: Win | Units: 3**

## Numerical Linear Algebra (CME 302)

**Terms: Aut | Units: 3**

## Stochastic Methods in Engineering (CME 308)

**Prerequisites: exposure to probability and background in analysis.**

**Terms: Spr | Units: 3**

### Experimentation Elective (3 units)

Courses in this area must be taken for letter grades.

## Introduction to Causal Inference (STATS 209)

**Prerequisites: basic probability and statistics, familiarity with R.**

**Terms: Aut | Units: 3**

## Design of Experiments (STATS 263/363)

**Prerequisites: probability at STATS 116 level or higher, and at least one course in linear models.**

## Intermediate Econometrics II (ECON 271)

**Prerequisites: ECON 270/MGTECON 603 or equivalent. The course assumes working knowledge of probability theory and statistics as covered in ECON 270/MGTECON 603.**

**Terms: Win | Units: 3-5**

### Software Development and Scientific Computing (6 units minimum)

To ensure that students have a strong foundation in programming, they must take 3 units of software development (CME 212) and a minimum of 3 units of scientific computing.

- Students who do not start the program with a strong computational and/or programming background will take an extra 3 units to prepare themselves by taking **CME 211\* Programming in C/C++ for Scientists and Engineers**, or an equivalent course with advisor's approval.
- A summer placement exam for CME 212 will be sent to matriculating students in July. Students who pass this placement test are not required to take CME 211, and may replace the class with an elective.

Courses in this area must be taken for letter grades.

**Software Development: (3 units)**

## Advanced Software Development for Scientists and Engineers (CME 212; prerequisite: CME 211\*)

**Terms: Win | Units: 3**

**Scientific Computing: (3–6 units)**

## Introduction to Parallel Computing Using MPI, OpenMP, and CUDA (CME 213)

**Prerequisites: C++, templates, debugging, UNIX, makefile, numerical algorithms (differential equations, linear algebra).**

**Terms: Spr | Units: 3**

## Discrete Mathematics and Algorithms (CME 305)

**Prerequisites: CS 261 is highly recommended, although not required.**

**Terms: Win | Units: 3**

## Optimization (CME 307)

**Prerequisites: MATH 113, 115, or equivalent.**

**Terms: Win | Units: 3**

## Distributed Algorithms and Optimization (CME 323)

**Recommended prerequisites: Discrete math at the level of CS 161 and programming at the level of CS 106A.**

**Terms: Spr | Units: 3**

## Convex Optimization I (CME 346A)

**Prerequisite: linear algebra such as EE263, basic probability.**

**Terms: Win, Sum | Units: 3**

## Mining Massive Data Sets (CS 246)

**Prerequisites: At least one of CS107 or CS145.**

**Terms: Win | Units: 3-4**

### Machine Learning Methods & Applications (6–9 units minimum)

Courses in this area must be taken for letter grades. *Courses outside this list are subject to approval.*

## Modern Applied Statistics: Data Mining (STATS 315B)

**Terms: Spr | Units: 3**

## Artificial Intelligence: Principles and Techniques (CS 221)

**Prerequisites: CS 103 or CS 103B/X, CS 106B or CS 106X, CS 109, and CS 161 (algorithms, probability, and object-oriented programming in Python). We highly recommend comfort with these concepts before taking the course, as we will be building on them with little review.**

**Terms: Aut, Spr | Units: 3-4**

## Natural Language Processing with Deep Learning (CS 224N)

**Terms: Win | Units: 3-4**

## Deep Learning (CS 230)

**Prerequisites: Familiarity with programming in Python and Linear Algebra (matrix / vector multiplications). CS 229 may be taken concurrently.**

**Terms: Aut, Spr | Units: 3-4**

## Deep Learning for Computer Vision (CS 231N)

**Prerequisites: Proficiency in Python; CS131 and CS229 or equivalents; MATH21 or equivalent, linear algebra.**

**Terms: Spr | Units: 3-4**

## Reinforcement Learning (CS 234)

**Prerequisites: proficiency in Python; CS 229 or equivalent, or permission of the instructor; linear algebra, basic probability.**

**Terms: Win | Units: 3**

## Deep Generative Models (CS 236)

**Prerequisites: Basic knowledge about machine learning from at least one of CS 221, 228, 229 or 230. Students will work with computational and mathematical models and should have a basic knowledge of probabilities and calculus. Proficiency in some programming language, preferably Python, required.**

**Terms: Aut | Units: 3**

### Practical Component or Capstone Project

Students are required to take a minimum of 3 units of practical component, which may include any combination of:

- A capstone project, supervised by a faculty member and approved by the student's advisor. The capstone project should be computational in nature. Students should submit a one-page proposal, supported by the faculty member and sent to the student's Data Science advisor for approval (at least one quarter prior to the start of the project).

- Master's Research: STATS 299 Independent Study. In consultation with your advisor, independent study/directed reading with permission of statistics faculty (repeatable).
- **BIODS 232: Consulting Workshop on Biomedical Data Science** (1–2 units)
  - Gain practical industry experience and exposure to the organization, its industry, and the space in which it operates; build relationships in the organization and industry; and gain an understanding of related career paths.
- **ALP 301 Data-Driven Impact**
- AI for Health Care bootcamp
  - Two-quarter commitment (research credit)
  - Students with a background in artificial intelligence, software engineering or medicine are encouraged to apply. The bootcamp is suited for students who have taken machine learning and software engineering courses.
- AI for Climate Change bootcamp
  - Two-quarter commitment (research credit)
  - Students with a background in artificial intelligence, software engineering or medicine are encouraged to apply. This role is suited for students who have taken machine learning and software engineering courses.
- Analytics Accelerator (CME 217) Units: 3 | Repeatable 2 times (up to 6 units total)
  - A multidisciplinary graduate-level course offering real-world project-based research. Students work in dynamic teams with the support of course faculty and outside analytics experts to scope and research projects, apply a computational and data analytics lens, and follow design thinking methodology.
  - Enrollment by application only.
- Data Driven Impact (ALP 301) [when offered]
- Other courses that have a strong hands-on and practical component, such as **STATS 390 Consulting Workshop** (repeatable).
  - This class requires mastery of Statistics at the (graduate) level necessary to provide consultation to fellow members of the university.
  - Students attend weekly lectures on Friday to discuss consulting cases and various statistical techniques that arise frequently in consulting.

### Data Science Electives (6–9 units)

In consultation with the student's program advisor, the student selects courses in a scientific or engineering application area of interest, i.e., courses numbered 200 or above in STATS or CME.

**Minimum 6 units of elective coursework.**