The increasing importance of big data in engineering and the applied sciences motivates the Department of Statistics and ICME (Institute for Computational and Mathematical Engineering) to collaboratively offer an M.S. track that trains students in data science with a computational focus.
This focused M.S. track is developed within the structure of the current M.S. in Statistics and the M.S. program in ICME.
ADMISSIONS QUESTIONS SHOULD BE ADDRESSED TO stat-admissions-msATlistsDOT>>>
Upon successful completion of the Data Science M.S. degree, students will be prepared to continue on to a doctoral program in Statistics, ICME, MS&E, or Computer Science, or to work as a data science professional in industry. Completing the M.S. degree gives no guarantee of, or preference for, admission to the Ph.D. program.
Coursework
The Data Science track develops strong mathematical, statistical, computational, and programming skills through the general master's core and programming requirements. It also provides a fundamental data science education through general and focused elective requirements, drawn from courses in data science and related areas.
As defined in the general Graduate Student Requirements, students must maintain a grade point average (GPA) of 3.0 or better, and classes must be taken at the 200 level or higher. Students satisfying the course requirements of the Data Science track do not have to satisfy the other course requirements for the M.S. in Statistics.
The total number of units in the degree is 45, 36 of which must be taken for a letter grade.
Students must submit an approved Master's Program Proposal, signed by the master's adviser, to the student services officer by the end of the first quarter of the master's degree program. A revised program proposal must be filed whenever a student's previously approved program proposal changes.
There is no thesis requirement.
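The numeric requirements above (45 total units, 36 letter-graded units, a GPA of 3.0 or better, and courses at the 200 level or higher) can be checked mechanically. The following is a minimal sketch, not an official audit tool: the `Course` class, the `check_plan` function, the course codes, the 4.0 grade-point scale, and the unit-weighted GPA formula are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

# Thresholds taken from the requirements stated above.
TOTAL_UNITS_REQUIRED = 45
LETTER_GRADED_UNITS_REQUIRED = 36
MIN_GPA = 3.0
MIN_COURSE_LEVEL = 200


@dataclass
class Course:
    code: str                             # hypothetical code, e.g. "X 302"
    level: int                            # numeric course level, e.g. 302
    units: int
    grade_points: Optional[float] = None  # None: not taken for a letter grade


def check_plan(courses: List[Course]) -> List[str]:
    """Return a list of unmet requirements (empty if the plan passes)."""
    problems = []

    total_units = sum(c.units for c in courses)
    if total_units < TOTAL_UNITS_REQUIRED:
        problems.append(f"total units {total_units} < {TOTAL_UNITS_REQUIRED}")

    graded = [c for c in courses if c.grade_points is not None]
    graded_units = sum(c.units for c in graded)
    if graded_units < LETTER_GRADED_UNITS_REQUIRED:
        problems.append(
            f"letter-graded units {graded_units} < {LETTER_GRADED_UNITS_REQUIRED}"
        )

    # Assumption: GPA is the unit-weighted mean of grade points (4.0 scale).
    if graded_units > 0:
        gpa = sum(c.units * c.grade_points for c in graded) / graded_units
        if gpa < MIN_GPA:
            problems.append(f"GPA {gpa:.2f} < {MIN_GPA}")

    for c in courses:
        if c.level < MIN_COURSE_LEVEL:
            problems.append(f"{c.code} is below the {MIN_COURSE_LEVEL} level")

    return problems


# Usage: fifteen hypothetical 3-unit courses, twelve taken for a letter grade.
plan = [
    Course(f"X {300 + i}", 300 + i, 3, 3.5 if i < 12 else None)
    for i in range(15)
]
print(check_plan(plan))
```

The sketch only covers the totals; it does not check which courses count toward the core, elective, programming, or practical areas, since that depends on the approved program proposal.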
Data Science Program Proposal Form [2016-2017] (PDF) (XLSX)
Data Science Program Proposal Form [2017-2018] (PDF) (XLSX)
Students must demonstrate breadth of knowledge in the field by completing five core areas.
Students must demonstrate foundational knowledge in the field by completing the following core courses. Courses in this area must be taken for letter grades.
Course Name & Number | Course Title | Units |
---|---|---|
CME 302 | Numerical Linear Algebra | 3 |
CME 305 | Discrete Mathematics and Algorithms | 3 |
CME 307 | Optimization | 3 |
CME 308 | Stochastic Methods in Engineering | 3 |
or | | |
 | Randomized Algorithms and Probabilistic Analysis | 3 |
STATS 310A | Theory of Probability | 3 |
Data Science electives should demonstrate breadth of knowledge in the technical area. The elective course list is defined below. Courses outside this list are subject to approval. Courses in this area must be taken for letter grades.
Course Name & Number | Course Title | Units |
---|---|---|
STATS 200 | Introduction to Statistical Inference | 3 |
or STATS 300A | Theory of Statistics I | 3 |
STATS 203 | Introduction to Regression Models and Analysis of Variance | 3 |
or STATS 305A | Introduction to Statistical Modeling | |
STATS 315A | Modern Applied Statistics: Learning | 2-3 |
STATS 315B | Modern Applied Statistics: Data Mining | 2-3 |
or equivalent courses as approved by the adviser. |
To ensure that students have a strong foundation in programming, 3 units of advanced scientific programming at the level of CME212 and 3 units of parallel computing are required, both taken for a letter grade.
Note: Programming proficiency at the level of CME211 is a hard prerequisite for CME212 (students may place out of CME211 ONLY with prior written approval*). CME211 can be applied towards the elective requirement.
Course Name & Number | Course Title | Units |
---|---|---|
Advanced Scientific Programming; take 3 units | ||
CME 211 | Software Development for Scientists and Engineers (can only be used as an elective) | 3 |
CME 212 | Advanced Software Development for Scientists and Engineers | 3 |
Parallel Computing/HPC courses: (3 units) | ||
CME 213 | Introduction to Parallel Computing Using MPI, OpenMP, and CUDA | 3 |
CME 323 | Distributed Algorithms and Optimization | 3 |
CME 342 | Parallel Methods in Numerical Analysis | 3 |
CS 149 | Parallel Computing | 3-4 |
CS 316 | Advanced Multi-Core Systems | 3 |
CS 344C, offered in previous years, may also be counted |
Students who do not start the program with a strong computational and/or programming background will take an extra 3 units to prepare themselves by, for example, taking CME211 Programming in C/C++ for Scientists and Engineers or an equivalent course* with the adviser's approval.
Choose three courses in specialized areas from the following list. Courses outside this list are subject to approval.
Course Name & Number | Course Title | Units |
---|---|---|
BIOE 214 | Representations and Algorithms for Computational Molecular Biology | 3-4 |
BIOMEDIN 215 | Data Driven Medicine | 3 |
BIOS 221/STATS 366 | Modern Statistics for Modern Biology | 3 |
CS 224W | Social and Information Network Analysis | 3-4 |
CS 229 | Machine Learning | 3-4 |
CS 231N | Convolutional Neural Networks for Visual Recognition | 3-4 |
CS 246 | Mining Massive Data Sets | 3-4 |
CS 448 | Topics in Computer Graphics | 3-4 |
ECON 293 | Machine Learning and Causal Inference | 3 |
ENERGY 240 | Geostatistics | 3 |
OIT 367 | Business Intelligence from Big Data | 3 |
PSYCH 204A | Human Neuroimaging Methods | 3 |
STATS 290 | Computing for Data Science | 3 |
Students are required to take 6 units of a practical component, which may include any combination of:
A capstone project, supervised by a faculty member and approved by the student's adviser. The capstone project should be computational in nature. Students should submit a one-page proposal, supported by the faculty member and sent to the student's Data Science adviser for approval (at least one quarter prior to start of project).
Master's Research: STATS 299 Independent Study.
Project labs offered by Stanford Data Lab: ENGR 250 Data Challenge Lab, and ENGR 350 Data Impact Lab.
Other courses that have a strong hands-on and practical component, such as STATS 390 Consulting Workshop (up to 1 unit).
Data Science Sample Schedules
The Data Science track schedule typically spans 5 quarters.
5 quarter schedule for most students:
Year 1:
Aut: CME200, CME211, STATS200
Wtr: CME212, CME307, STATS200 or 203
Spr: STATS315B, CME308, elective
Year 2:
Aut: CME302, STATS305A, HPC course (or take CME213 in spring), practical
Wtr: CME305, STATS315A, practical, elective
5 quarter schedule for students who are well prepared:
The student must have taken the equivalent of CME200 and STATS200 before starting the program.
Year 1:
Aut: CME211, STATS305A, elective
Wtr: CME212, CME307, STATS203
Spr: CME213, STATS315B, CME308
Year 2:
Aut: CME302, practical, elective
Wtr: CME305, STATS263, STATS315A, elective
4 quarter schedule:
This schedule is very demanding and students typically prefer the experience gained with a 5 quarter schedule.
The student must have taken the equivalent of CME200 and STATS200 (Aut or Wtr) before starting the program.
Year 1:
Aut: CME211, STATS305A, elective
Wtr: CME212, CME305, CME307, STATS315A
Spr: CME213, STATS315B, CME308, practical
Year 2:
Aut: CME302, elective (2), practical
Notes: