Statistical Learning and Data Mining III: 10 Hot Ideas for Learning from Data, March 19-20, Palo Alto
Taught by top Stanford professors and leading statisticians Trevor Hastie and Robert Tibshirani, this course presents 10 hot ideas for learning from data, and gives a detailed overview of statistical models for data mining, inference and prediction.
State-of-the-Art Statistical Methods for Data Analysis:
Ten Hot Ideas for Learning from Data
Trevor Hastie and Robert Tibshirani, Stanford University
Sheraton Hotel, Palo Alto, California, March 19-20, 2015
This two-day course gives a detailed overview of statistical models for data mining, inference and prediction. With the rapid developments in internet technology, genomics, financial risk modeling, and other high-tech industries, we rely increasingly on data analysis and statistical models to exploit the vast amounts of data at our fingertips.
This course is the third in a series, following our popular past offerings "Modern Regression and Classification" and "Statistical Learning and Data Mining". The two earlier courses are not a prerequisite for this new course.
In this course we emphasize the tools useful for tackling modern-day data analysis problems. These include gradient boosting, SVMs and kernel methods, random forests, lasso and LARS, ridge regression and GAMs, supervised principal components, and cross-validation. We also present some interesting case studies in a variety of application areas.
This course focuses on both "tall" data (N > p, where N = #cases and p = #features) and "wide" data (p > N). Typical examples of tall data are credit risk and churn prediction, and email spam filtering. Topics include linear and ridge regression, lasso and LARS, support vector machines, random forests, and boosting. We give an in-depth discussion of validation, cross-validation, and test set issues.
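As a small illustration of two of the tall-data topics above (the lasso and cross-validation), here is a minimal sketch, not taken from the course materials, using scikit-learn's `LassoCV` on synthetic data with N > p. The data-generating setup (3 informative features out of 20) is an assumption for the example.

```python
# Sketch: cross-validated lasso on synthetic "tall" data (N > p).
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(0)
N, p = 500, 20                       # tall: many more cases than features
X = rng.normal(size=(N, p))
beta = np.zeros(p)
beta[:3] = [2.0, -1.5, 1.0]          # only 3 features truly matter
y = X @ beta + rng.normal(scale=0.5, size=N)

model = LassoCV(cv=5).fit(X, y)      # penalty chosen by 5-fold cross-validation
nonzero = int(np.sum(model.coef_ != 0))
print(model.alpha_, nonzero)         # selected penalty; lasso zeroes out most noise features
```

The lasso's L1 penalty drives irrelevant coefficients exactly to zero, and cross-validation picks how much shrinkage to apply, which is the interplay the course covers in depth.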
For wide data, typical examples are gene expression and protein mass spectrometry data, and data from signals and images. Topics include clustering and data visualization, false discovery rates and SAM, regularized logistic regression and discriminant analysis, supervised and unsupervised principal components, support vector machines and the kernel trick, and the careful use of model selection strategies.
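For the wide-data setting, a minimal sketch (again, not course material) of one listed topic, regularized logistic regression, on synthetic data with p > N; the gene-expression-like setup with two signal features is an assumption for the example.

```python
# Sketch: L1-regularized logistic regression on synthetic "wide" data (p > N).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
N, p = 50, 500                       # wide: far more features than cases
X = rng.normal(size=(N, p))
y = (X[:, 0] - X[:, 1] > 0).astype(int)   # only 2 features carry signal

# Without regularization the model is hopelessly underdetermined (p >> N);
# the L1 penalty keeps the fit sparse and interpretable.
clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.5).fit(X, y)
kept = int(np.sum(clf.coef_ != 0))
print(kept)                          # most of the 500 coefficients are exactly zero
```

Regularization of this kind is what makes classification feasible when features vastly outnumber cases, a recurring theme in the wide-data half of the course.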
The material is based on recent papers by the authors and other researchers, as well as the best-selling book:
The Elements of Statistical Learning: Data Mining, Inference and Prediction, Hastie, Tibshirani & Friedman, Springer-Verlag, 2008
A copy of this book will be given to all attendees.
The lectures will consist of video-projected presentations and discussion. Go to the site
www-stat.stanford.edu/~hastie/sldm.html for more information and online registration.