Interpreting Machine Learning Models: An Overview
This post summarizes the contents of a recent O'Reilly article outlining a number of methods for interpreting machine learning models, beyond the usual goto measures.
An article on machine learning interpretation appeared on O'Reilly's blog back in March, written by Patrick Hall, Wen Phan, and SriSatish Ambati, which outlined a number of methods beyond the usual goto measures. By chance I happened back upon the article again over the weekend, and with a fresh read decided to share some of the ideas contained within. The article is a great (if lengthy) read, and recommend it to anyone who has the time.
The article is organized as follows:
 Overview of the differing complexities of (machine learning) functions to be explained
 Overview of the scope of interpretability, local (small regions of conditional distributions) vs. global (entire conditional distributions)
 A discussion of understanding and trust, and how traditional measures of understanding  such as crossvalidation and assessment plots  often don't do enough to inspire trust in a model
 A breakdown into 3 sections of interpretation techniques (the real meat of the article)
Part 1 includes approaches for seeing and understanding your data in the context of training and interpreting machine learning algorithms, Part 2 introduces techniques for combining linear models and machine learning algorithms for situations where interpretability is of paramount importance, and Part 3 describes approaches for understanding and validating the most complex types of predictive models.
The deconstruction of the interpretability of each technique and group of techniques is the focus of the article, while this post is a summary of the techniques.
Part 1: Seeing your data
This section starts the article off slowly, and points out some methods to perform visual data exploration beyond the more traditional.
[T]here are many, many ways to visualize data sets. Most of the techniques highlighted below help illustrate all of a data set in just two dimensions, not just univariate or bivariate slices of a data set (meaning one or two variables at a time). This is important in machine learning because most machine learning algorithms automatically model highdegree interactions between variables (meaning the effect of combining many i.e., more than two or three variables together).
Visualization techniques presented in this section include:
 Glyphs
 Correlation graphs
 2D projections, such as PCA, MDS, and tSNE
 Partial dependence plots
 Residual analysis
A correlation graph representing loans made by a large financial firm. Figure courtesy of Patrick Hall and the H2O.ai team.
Recommended questions to be asked to help determine the value of these visualization techniques (which are similarly asked of techniques in subsequent parts) include:
 What complexity of functions can visualizations help interpret?
 How do visualizations enhance understanding?
 How do visualizations enhance trust?
Part 2: Using machine learning in regulated industry
Things get a bit more interesting here.
The techniques presented in this section are newer types of linear models or models that use machine learning to augment traditional, linear modeling methods. These techniques are meant for practitioners who just can't use machine learning algorithms to build predictive models because of interpretability concerns.
Techniques outlined in this section include:
 Generalized Additive Models (GAMs)
 Quantile regression
 Building toward machine learning model benchmarks  that is, employ a deliberate process when moving from traditional linear models toward machine learning algorithms, taking baby steps, and comparing performance and outcomes along the way, as opposed to jumping from a simple regression model into the deep end with black boxes
 Machine learning in traditional analytics processes  this is the suggestion to use ML algorithms to augment analytical lifecycle processes for more accurate predictions, such as predicting linear model degredation
 Small, interpretable ensembles  it's a foregone conclusion at this point that there is massive value in ensemble methods, in general, but employing simple such methods can potentially help boost both accuracy as well as interpretability.
 Monotonicity constraints  such constraints can possibly transform complex models into interpretable, nonlinear, montonic models
Monotonicity is very important for at least two reasons:
 Monotonicity is often expected by regulators
 Monotonicity enables consistent reason code generation
A diagram of a small, stacked ensemble. Figure courtesy of Vinod Iyengar and the H2O.ai team.
Part 3: Understanding complex machine learning models
In my opinion, this is where things get especially interesting. I approach complex machine learning model interpretability as an advocate of automated machine learning, since I feel the two techniques are flipsides of the same coin: if we are going to be using automated techniques to generate models on the frontend, then devising and employing appropriate ways to simplify and understand these models on the backend becomes of utmost importance.
Here are the approaches outlined for helping understand complex ML models within this article.
Surrogate models  simply, a surrogate is a simple model which can be used to explain a more complex model. If the surrogate model is created by training, say, a simple linear regression or a decision tree with original input data and predictions from the more complex model, the characteristics of the simple model can then be assumed to be an accurately descriptive standin of the more complex model. And it may not not be accurate at all.
Why, then, employ a surrogate model?
Surrogate models enhance trust when their coefficients, variable importance, trends, and interactions are in line with human domain knowledge and reasonable expectations of modeled phenomena. Surrogate models can increase trust when used in conjunction with sensitivity analysis to test that explanations remain stable and in line with human domain knowledge, and reasonable expectations when data is lightly and purposefully perturbed, when interesting scenarios are simulated, or as data changes over time.
Local Interpretable Modelagnostic Explanations (LIME)  LIME is for building surrogate models based on a single observation
An implementation of LIME may proceed as follows. First, the set of explainable records are scored using the complex model. Then, to interpret a decision about another record, the explanatory records are weighted by their closeness to that record, and an L1 regularized linear model is trained on this weighted explanatory set. The parameters of the linear model then help explain the prediction for the selected record.
Maximum activation analysis  a technique which looks to isolate particular instances which elicit a maximum response of some model hyperparameter
In maximum activation analysis, examples are found or simulated that maximally activate certain neurons, layers, or filters in a neural network or certain trees in decision tree ensembles. For the purposes of maximum activation analysis, low residuals for a certain tree are analogous to highmagnitude neuron output in a neural network.
So... LIME, or maximum activation analysis, or both?
LIME, discussed above, helps explain the predictions of a model in local regions of the conditional distribution. Maximum activation analysis helps enhance trust in localized, internal mechanisms of a model. The two make a great pair for creating detailed, local explanations of complex response functions.
Sensitivity analysis  this technique helps to determine whether intentionally perturbed data, or similar data changes, modify model behavior and destabilizes the outputs; it is also useful for investigating model behavior for particular scenarios of interest or corner cases
Global variable importance measures  typically the domain of treebased models; a heuristic for determining varibale importance relates to the depth and frequency of a given variable's split point; higher and more frequent variables are decidedly more important
For example:
For a single decision tree, a variable's importance is quantitatively determined by the cumulative change in the splitting criterion for every node in which that variable was chosen as the best splitting candidate.
An illustration of variable importance in a decision tree ensemble model. Figure courtesy of Patrick Hall and the H2O.ai team.
LeaveOneCovariateOut (LOCO)  "modelagnostic generalization of mean accuracy decrease variable importance measures"; originally developed for regression models, but applicable more generally; the technique's goal is to determine the variable with the largest absolute impact on a given row  by iteratively zeroing out variable values  which is determined to be the most important for that row's prediction
How do variable importance measures enhance understanding?
Variable importance measures increase understanding because they tell us the most influential variables in a model and their relative rank.
Treeinterpreter  strictly a treebased model (decision trees, random forest, etc.) interpretation approach
Treeinterpreter simply outputs a list of the bias and individual contributions for a variable in a given model or the contributions the input variables in a single record make to a single prediction.
I am currently experimenting with Treeinterpreter and hope to soon share my experience.
To reiterate, this O'Reilly article which I have superficially summarized merits a full reading if you have time, wherein it fleshes out the techniques of which I have only scratched the surface, and does a much better job than did I of weaving together how useful each technique is and what one should expect as a resultof its utilization.
A thanks to Patrick Hall, Wen Phan, and SriSatish Ambati, the authors of this informative article.
Related:
Top Stories Past 30 Days

