Short course: Statistical Learning and Data Mining IV, NYC, Nov 2-3
State-of-the-Art Statistical Methods for Data Science, including sparse models and deep learning
by Trevor Hastie and Robert Tibshirani,
Stanford University
Executive Conference Center, New York, NY
Nov 2-3, 2017
This new two-day course gives a detailed and modern overview of statistical models used by data scientists for prediction and inference. With the rapid developments in internet technology, genomics, financial risk modeling, and other high-tech industries, we rely increasingly on data analysis and statistical models to exploit the vast amounts of data at our fingertips.
In this course we emphasize the tools useful for tackling modern-day data analysis problems. Many of these are essential building blocks, but we also include techniques at the cutting edge for handling big-data problems. From the vast array of tools available, we have selected those we consider the most relevant and exciting. Our list of topics includes:
- Linear methods: regression, logistic regression (binary and multiclass), Cox model.
- Bootstrap, cross-validation, and permutation methods.
- Regularized linear models: ridge, lasso, elastic net. Post-selection inference. The glmnet package in R, and other software (see the brief sketch after this list).
- Trees, random forests, and boosting.
- Unsupervised methods: clustering (prototype, hierarchical, spectral,...), principal components and other low-rank methods, sparse decompositions.
- Support-vector machines and kernel methods.
- Deep learning and neural networks.
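As a taste of the regularized-models topic, here is a minimal sketch of fitting a lasso with cross-validation using the glmnet package in R. The simulated data and variable names are purely illustrative assumptions, not course material.

```r
# Minimal illustrative sketch: lasso with cross-validation via glmnet.
# The simulated data and variable names below are assumptions for illustration.
library(glmnet)

set.seed(1)
n <- 100; p <- 20
x <- matrix(rnorm(n * p), n, p)          # predictor matrix
beta <- c(rep(2, 5), rep(0, p - 5))      # sparse true coefficient vector
y <- drop(x %*% beta) + rnorm(n)         # response with noise

cvfit <- cv.glmnet(x, y, alpha = 1)      # alpha = 1 selects the lasso penalty
plot(cvfit)                              # cross-validation error versus lambda
coef(cvfit, s = "lambda.min")            # coefficients at the CV-optimal lambda
```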
The material is based on recent papers by the authors and other researchers, as well as our best-selling book:
The Elements of Statistical Learning: Data Mining, Inference, and Prediction (2nd Edition) (with J. Friedman, Springer-Verlag, 2009).
The lectures will consist of high-quality projected presentations and discussion. All attendees will receive a copy of The Elements of Statistical Learning, as well as a color booklet containing the course slides in a convenient two-up, double-sided format.
The authors have two other popular books that are also relevant to this course:
- An Introduction to Statistical Learning: with Applications in R (with Gareth James and Daniela Witten, Springer-Verlag, 2013).
- Statistical Learning with Sparsity: The Lasso and Generalizations (with Martin Wainwright, Chapman and Hall, 2015).
Go to www-stat.stanford.edu/~hastie/sldm.html for more information and online registration.