Pedro Domingos is a professor at the University of Washington, a leading researcher in machine learning and data mining, and a winner of many awards, including two best-paper awards at KDD (Knowledge Discovery and Data Mining) conferences.
Introduction
... A recent report from the McKinsey Global Institute asserts that machine learning (a.k.a. data mining or predictive analytics) will be the driver of the next big wave of innovation [15]. Several fine textbooks are available to interested practitioners and researchers (e.g., [16, 24]). However, much of the "folk knowledge" that is needed to successfully develop machine learning applications is not readily available in them. As a result, many machine learning projects take much longer than necessary or wind up producing less-than-ideal results. Yet much of this folk knowledge is fairly easy to communicate. This is the purpose of this article.
Pedro Domingos, "A Few Useful Things to Know About Machine Learning," Communications of the ACM, Vol. 55 No. 10, Pages 78-87, 2012.
Here is a free version.
Contents:
- Learning = Representation + Evaluation + Optimization
- It's Generalization That Counts
- Data Alone Is Not Enough
- Overfitting Has Many Faces
- Intuition Fails In High Dimensions
- Theoretical Guarantees Are Not What They Seem
- Feature Engineering Is The Key
- More Data Beats A Cleverer Algorithm
- Learn Many Models, Not Just One
- Simplicity Does Not Imply Accuracy
- Representable Does Not Imply Learnable
- Correlation Does Not Imply Causation