# Short course: Statistical Learning and Data Science, Palo Alto, Apr 18-19

This *new* two-day course gives a detailed and modern overview of statistical models used by data scientists for prediction and inference, with emphasis on tools useful for tackling modern-day data analysis problems.

**Statistical Learning and Data Science**

by Trevor Hastie and Robert Tibshirani, Stanford University

Sheraton Hotel, Palo Alto, CA

April 18-19, 2016

This new two-day course gives a detailed and modern overview of statistical models used by data scientists for prediction and inference. With the rapid developments in internet technology, genomics, financial risk modeling, and other high-tech industries, we rely increasingly on data analysis and statistical models to exploit the vast amounts of data at our fingertips.

In this course we emphasize the tools useful for tackling modern-day data analysis problems. Many of these are essential building blocks, but we also include cutting-edge techniques for handling big-data problems. From the vast array of tools available, we have selected those we consider the most relevant and exciting. Our list of topics includes:

- Linear methods: regression, logistic regression (binary and multiclass), Cox model.
- Bootstrap, cross-validation, and permutation methods.
- Regularized linear models: ridge, lasso, elastic net. Post-selection inference. The glmnet package in R, and other software.
- Trees, random forests, and boosting.
- Unsupervised methods: clustering (prototype, hierarchical, spectral, ...), principal components and other low-rank methods, sparse decompositions.
- Support-vector machines and kernel methods.
- Deep learning and neural networks.

Our earlier courses are not a prerequisite for this new course. Although there is overlap with past courses, our new course contains topics not covered by us before. We illustrate many of the methods using examples developed in R.

The material is based on recent papers by the authors and other researchers, as well as our best-selling books:

- An Introduction to Statistical Learning, with applications in R (with Gareth James and Daniela Witten, Springer-Verlag, 2013).
- Statistical Learning with Sparsity: the Lasso and Generalizations (with Martin Wainwright, Chapman and Hall, 2015).
- The Elements of Statistical Learning: data mining, inference and prediction (2nd Edition) (with Jerome Friedman, Springer-Verlag, 2009).

A copy of the first two books will be given to all attendees.

The lectures will consist of video-projected presentations and discussion.

Go to www-stat.stanford.edu/~hastie/sldm.html for more information and online registration.