Pedro Domingos: A few useful things to know about Machine Learning

A leading researcher explains the "folk knowledge" needed to successfully develop machine learning applications. This knowledge is not easily available in textbooks, and lacking it will impair many machine learning and data mining projects.

Pedro Domingos is a professor at the University of Washington, a leading researcher in machine learning and data mining, and a winner of many awards, including two best-paper awards at KDD (Knowledge Discovery and Data Mining) conferences.

Introduction
... A recent report from the McKinsey Global Institute asserts that machine learning (a.k.a. data mining or predictive analytics) will be the driver of the next big wave of innovation [15]. Several fine textbooks are available to interested practitioners and researchers (e.g., [16, 24]). However, much of the "folk knowledge" that is needed to successfully develop machine learning applications is not readily available in them. As a result, many machine learning projects take much longer than necessary or wind up producing less-than-ideal results. Yet much of this folk knowledge is fairly easy to communicate. This is the purpose of this article.

A few useful things to know about Machine Learning, Communications of the ACM, Vol. 55 No. 10, Pages 78-87, 2012.

Here is a free version.

Bias vs Variance

Contents:

Learning = Representation + Evaluation + Optimization
It's Generalization That Counts
Data Alone Is Not Enough
Overfitting Has Many Faces
Intuition Fails In High Dimensions
Theoretical Guarantees Are Not What They Seem
Feature Engineering Is The Key
More Data Beats A Cleverer Algorithm
Learn Many Models, Not Just One
Simplicity Does Not Imply Accuracy
Representable Does Not Imply Learnable
Correlation Does Not Imply Causation