Peering into the Black Box and Explainability

In many domains where data science can be a game changer, the biggest hurdle is not collecting data or building the models – it is understanding what they mean.

By Gurjeet Singh, Ayasdi.


This is a challenge that many data scientists struggle with when communicating with the business. In all fairness, there are plenty of business situations that require models to be transparent, such as:

  1. Regulations – in many industries, such as financial services and healthcare, the models used to make predictions are required by law to be auditable by regulators. Requiring a regulator to learn about some fancy machine learning method is a non-starter.
  2. Operations – in cases where human beings need to act on the outcome of the model. Imagine a denials management solution in healthcare – a claims processor often needs to call a hospital system to discuss a disputed claim. If that claim was marked as fraudulent by the insurer, it is so much better to arm the claims processor with a quick summary of why the model flagged it as fraudulent in the first place.
  3. Automated systems – another situation where models are used extensively is when the volume of transactions is overwhelming or the outcome is fleeting. Imagine an automated high frequency trading system. A portfolio manager needs to understand the models behind their trading strategy in order to justify the investment.

In this blog post, I want to discuss a solution to this problem: separate the model and the explanation.

Let’s begin by asking: what is a black box, anyway? A black-box model is any model that a human reviewer cannot reason about. Generally speaking, the only ‘white-box’ models that escape this description are shallow decision trees and sparse linear models. Both tend to be simple enough (by definition) that you can explain them to an eighth grader.

Do we really need models that are much more complicated?

Well, if you look at the cutting edge of any field – computer vision, natural language processing, and so on – the best-performing models tend to be either extremely complicated or to require billions of parameters (especially in deep learning). Suffice it to say that whenever high accuracy is a concern, model complexity tends to be high.

We do not have to choose between high accuracy and high explainability, though: the model and its explanation can be treated as separate problems. There are many ways of doing this, but here is one simple example.

Imagine that we are given a dataset X which contains N points, where each point is a d-dimensional vector. In addition, we are given the output of the classifier for each point as an N-dimensional binary vector f. Let’s say that X0 is the subset of X where f is 0 and X1 is the subset of X where f is 1. Here’s a simple algorithm:

  1. For each of the d dimensions of X, find its mean and standard deviation; call them m_i and σ_i for the i-th dimension, respectively.
  2. For each of the d dimensions of X0, find its mean. Let’s call it u_i for the i-th dimension.
  3. For each of the d dimensions of X1, find its mean. Let’s call it v_i for the i-th dimension.
  4. Now, sort the list of columns by |u_i − v_i| / σ_i, in decreasing order.

This algorithm returns the list of columns, sorted by their ‘contribution’ to the final classifier output.
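The four steps above can be sketched in a few lines of Python. This is a minimal, standard-library-only illustration (the function name `rank_dimensions` and the toy data are my own, not from the original):

```python
from math import sqrt

def mean(xs):
    return sum(xs) / len(xs)

def std(xs):
    m = mean(xs)
    return sqrt(sum((x - m) ** 2 for x in xs) / len(xs))

def rank_dimensions(X, f):
    """X: list of N points, each a list of d floats.
    f: binary classifier output, one 0/1 label per point.
    Returns column indices sorted by |u_i - v_i| / sigma_i, largest first."""
    d = len(X[0])
    X0 = [x for x, label in zip(X, f) if label == 0]
    X1 = [x for x, label in zip(X, f) if label == 1]
    scores = []
    for i in range(d):
        sigma_i = std([x[i] for x in X])   # step 1: std-dev over all of X
        u_i = mean([x[i] for x in X0])     # step 2: mean over X0
        v_i = mean([x[i] for x in X1])     # step 3: mean over X1
        z = abs(u_i - v_i) / sigma_i if sigma_i > 0 else 0.0
        scores.append((z, i))
    # step 4: sort columns by the z-score-like quantity, descending
    return [i for z, i in sorted(scores, reverse=True)]
```

On a toy dataset where the second column separates the two classes and the first is noise, the second column comes back ranked first – exactly the ‘contribution’ ordering described above.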

This is just one simple example with many possible extensions such as:

  1. Replace the z-score with a better statistical test, such as the Kolmogorov-Smirnov test.
  2. Extend the test to categorical dimensions.
  3. Use a multivariate score (such as GEMM).
  4. Use a shallow decision tree as a discriminator (!).
  5. Extend for use in clustering by treating different clusters as classes.
  6. Extend for use in dimensionality reduction by allowing the user to select regions of output.

Hopefully this brief example demonstrates that the complexity of a model need not be a barrier to understanding it.

One final point.

One of the reasons that machine intelligence is not yet widespread inside of enterprises is that it is not well understood. Trust is a function of understanding. Enterprises need to trust their models, and as our understanding of machine intelligence increases, our trust will increase.

Conversely, we may, as a society, choose to lower our requirements for trust if the outcomes are vastly superior. It is my personal belief that this is the path we will take, given the extraordinary gains presented by these new approaches.

Until then though, let’s use white box models to understand black box ones.

If you have questions about some of the mechanics outlined here drop me a note @singhgurjeet.