State-of-the-Art Statistical Methods for Data Analysis:
Ten Hot Ideas for Learning from Data
Trevor Hastie and Robert Tibshirani, Stanford University
Both Trevor Hastie and Rob Tibshirani were quoted by the New York Times in stories on Big Data.
Executive Conference Center, New York - Sep 18-19, 2012
This two-day course gives a detailed overview of statistical models for data mining, inference and prediction. With the rapid developments in internet technology, genomics, financial risk modeling, and other high-tech industries, we rely increasingly on data analysis and statistical models to exploit the vast amounts of data at our fingertips.
In this course we emphasize the tools useful for tackling modern-day data analysis problems. From the vast array of tools available, we have selected those we consider the most relevant and exciting. Our top-ten list of topics is:
- Regression and Logistic Regression (two golden oldies),
- Lasso and Related Methods,
- Support Vector and Kernel Methodology,
- Principal Components (SVD) and Variations: sparse SVD, supervised PCA,
- Multidimensional Scaling and Isomap, Nonnegative Matrix Factorization, and Locally Linear Embedding,
- Boosting, Random Forests and Ensemble Methods,
- Rule-based methods (PRIM),
- Graphical Models,
- Cross-Validation,
- Bootstrap,
- Feature Selection, False Discovery Rates and Permutation Tests.
The material is based on recent papers by the authors and other researchers, as well as the new second edition of our best-selling book:
The Elements of Statistical Learning: Data Mining, Inference, and Prediction
Hastie, Tibshirani & Friedman, Springer-Verlag, 2008 (2nd edition)
Go to www-stat.stanford.edu/~hastie/sldm.html for more information.