Explainable AI or Halting Faulty Models ahead of Disaster

A brief overview of a new method for explainable AI (XAI), called anchors, introduce its open-source implementation and show how to use it to explain models predicting the survival of Titanic passengers.

By Tobias Goerke

Equipping Titanic with Anchors: Halting Faulty Models ahead of Disaster

On Kaggle, a surprisingly large number of the Titanic challenge entries exhibit a score of 100%. Experienced machine learning experts will know about the challenge's complexity and rightfully question the results' validity. At the same time, submissions like this Notebook illustrate how the Titanic competition's leaderboard can be forged effortlessly; A top-performing model can be created by collecting and including the publicly accessible list of survivors. Clearly, such overfit models only work for one very specific use case and are virtually useless for predicting outcomes in any other situation (not to mention the ethics of cheating). So, how can we make sure we have trained or are provided with a model that we can actually use in production? How can machine learning systems be deployed without likely ensuing disaster?

Explainable artificial intelligence, or XAI, seeks to answer these questions. It aims to provide explanations which are understandable for humans to ultimately increase trust in produced models. Based on the idea that humans can superiorly generalize even in unknown situations, a user can then check this explanation and see if the model uses the right cues to infer its decision. If it does, it probably classifies similar, albeit unknown, situations correctly and creates trust.


A novel addition to this new field of research, called anchors, seems particularly promising (find the paper here). It was proposed by the inventor of the LIME approach and patches several shortcomings of its predecessors. Anchors provides local explanations to any black box classifier regardless of the underlying technology and algorithm. Each explanation is valid for a single selected prediction using IF-THEN rules, of which each predicate fixes one feature of the input instance. Therefore, each result provides clear coverage, i.e., states for which other instances it is valid and with which probability.

This article examines two different models that were trained on the Titanic training dataset. The anchors algorithm is then used to explain why and based on which features these black boxes predict a passenger's survival or death.

Implementation Process Overview

Before going into detail with the tutorial's implementation, anchors' mode of operations and components is briefly outlined: at its core, the algorithm deploys a perturbation-based strategy. That means that the observed or explained instance gets perturbed, i.e., its feature values changed to some application-specific policy. The resulting data instances resemble neighbors of the initial instance. This way, feature importances, and contributions can be determined systematically by evaluating them using the model.
Understandably, querying the model often is expensive. Also, there can be no exhaustive search in the case of continuous or sufficiently complex models. Reinforcement learning and its multi-armed bandits (MABs) provide a solution to this problem. They help to significantly reduce the number of samples required by using stochastic exploration approaches.

The above process, algorithmic improvements and several add-ons are implemented by our newly released anchorj Java application. anchorj was designed as a high-performance alternative to the initial author's proof-of-concept (see here) and can be found on GitHub. It constitutes the first open-source Java anchors implementation and is licensed under the BSD 3-Clause License. Thus, it can be used freely with minimal restrictions.

Open collaboration and discussions on this open-source GitHub project are more than welcome.

The Explained Instances

We produce explanations of two data instances from the training set that exhibit distinct attributes and whose explanations can well be comprehended by users.

  1. Lucy Noel Martha Dyer-Edwards, The Countess of Rothes, Age 33, First Class Passenger, Cabin B77, Fare 86.5$, Survived.
  2. Mr. Patrick Dooley, Age 32, Third Class Passenger, Cabin unknown, Fare 7.75$, Did not survive.

Humans can most probably make an educated guess about why the Countess of Rothes did survive the Titanic disaster, while third class passenger Mr. Dooley did not (see here for more info). Using this knowledge users can validate the models when given explanations about a model's functioning.

Preparation: Dataset and Models

The provided .csv files containing all data instances are easily loaded by using our implementation. Further, models are used that are able to predict outcomes for these instances. These will later be explained by using anchorj. One is a pre-trained GBM imported from h4O/R and the other is a random forest. However, it really does not matter which kind of model is used. Anchors is able to deal with any type of model, as long as its predict/classify function is accessible.

Creating Explanations

Using anchorj is as simple as defining the model and the instance to be explained. All other parameters can be configured optionally.

AnchorResult anchor = anchorTabular
    .createDefaultBuilder(h4oModel::predict, explainedInstance)
    // Many more options

The results are visualized below:

Countess of Rothes

Imported Model:

IF Sex = 'female' AND 
SibSp = '0' AND 
Pclass = '1' AND 
Age IN RANGE [28,38]

Random Forest:

IF Sex = 'female' AND 
Fare IN RANGE [53,512] AND 
Pclass = '1' 

Patrick Dooley

Imported Model:

IF Sex = 'male' AND 
Age IN RANGE [28,38] AND 
Fare IN RANGE [0,8] AND 
SibSp = '0' 

Random Forest:

IF Sex = 'male' AND 
Pclass = '3' AND 
Cabin = 'false' AND 
Parch = '0' 

Explaining the explanation: a result's first part consists of its predicates, i.e., conditions and prediction specifying for which instances it is valid. After this, the precision and coverage are stated. In our case, coverage refers to the share of entries the rule holds for. The last rule, for example, covers 33% of the instances, meaning 33% are male, third-class and so forth. In these cases, the result is 100% precise, meaning for these passengers the prediction is the same with a 100% probability.

These results show that both models have learned mostly correct associations. Both models come to their decisions by "thinking" that female first-class passengers likely survive, while male passengers do not. However, the imported model takes into account more specific features that are probably harder to generalize.
In combination with the low accuracy and coverage, this points to a faulty model which probably generalizes poorly. On the contrary, the random forest takes features into account that we would actually expect to be present in an explanation. Its explanations also exhibit a high coverage, indicating it has learned generally valid coherencies.
Nonetheless, both models miss taking the name of the passengers into account. We would expect longer, aristocratic, names to be an indicator for survival. The knowledge we gained out of these explanations could be used to amend the random forests model's training process until, ultimately, explanations are satisfactorily, and build trust.

Summary and Collaboration

Anchors and anchorj aim to make machine learning productively viable by providing the means to detect faulty models before they are deployed. They close the gap between opportunities created by machine learning technology and the associated risks. In our example, we have shown how a specific model can be validated or refuted by including humans into the loop.

Anchors' application is not limited to this type of problem. It can, for example, be used by global explainers to explain a larger part of the model. Such algorithms and various other features are included in our implementation. See you on GitHub!

Bio: Tobias Goerke is an IT-Consultant and XAI researcher at the viadee Consulting AG, Germany. He recently finished his M.Sc. in Information Systems.



No, thanks!