What is your AI telling you? The importance of explaining predictions

Machine learning has advanced rapidly in recent decades. On an almost daily basis, complex algorithms enter the public scene: today a deep learning technique beats expert radiologists in identifying pathological medical scans, tomorrow a novel ensemble method may predict local weather with hitherto inconceivable accuracy. While this is exciting news, it comes at a price.
The more non-linear patterns machine learning models are able to detect (not to mention the complex and varied interactions among variables), the more difficult it becomes to understand and explain their predictions. Why is this important? Wouldn’t it be enough to perform a prediction and enjoy the results?
Not really. One reason is accountability.
The US Congress recently introduced the Algorithmic Accountability Act [1]. Similar legislation is being proposed in other countries too. According to these new regulations, companies using AI tools for decision support are required to audit their machine learning models for bias and discrimination. It is therefore not surprising that the ability to explain the predictions of machine learning models is mandatory not only from an ethical perspective but, provided these regulations become law, from a legal perspective too. Moreover, the clarity of decisions taken by machine learning models is crucial in sensitive contexts such as healthcare, banking and industrial plants. These are all domains in which a machine learning model can, in fact, determine whether a patient is given a specific therapy, whether an individual receives credit from a bank, and whether industrial equipment is shut down to avoid catastrophic failures.
There is a tool designed for the express purpose of interpreting AI models. It is called SHAP.[2]

The Shapley additive explanations (SHAP) algorithm

According to its authors, the SHAP algorithm provides an interpretation of the predictions made by complex machine learning models in a way that matches human intuition. How is that possible?
In the case of simple models like linear regression, one can immediately grasp how each predictor influences the prediction of the response variable by looking at the regression coefficients. In contrast, in the case of nonlinear methods (e.g., deep neural networks, random forests), it is unclear to what extent increasing or decreasing the value of a certain input will impact the output. For this purpose, one should use an interpretable approximation of the original model, which we will refer to as an explanation model.
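To make the linear case concrete, here is a minimal sketch on synthetic data. The data, the coefficient values and the variable names are assumptions chosen purely for illustration. It fits an ordinary least squares model with NumPy and reads each predictor's influence straight off the fitted coefficients.

```python
import numpy as np

# Synthetic, noise-free data (illustrative assumption): two predictors and a
# response that is an exact linear function of them.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = 3.0 + 2.0 * X[:, 0] - 1.5 * X[:, 1]

# Add a column of ones so the intercept is estimated alongside the slopes.
design = np.column_stack([np.ones(len(X)), X])
coefficients, *_ = np.linalg.lstsq(design, y, rcond=None)

intercept, slope_1, slope_2 = coefficients
# Each slope is the change in the prediction for a one-unit increase of that
# predictor, holding the other one fixed.
print(f"intercept={intercept:.2f}, slope_1={slope_1:.2f}, slope_2={slope_2:.2f}")
```

No comparably direct reading exists for a nonlinear model, which is the gap an explanation model is meant to fill.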

Let’s see how we can build one.
Suppose one is given a dataset of $N$ data points $\{\mathbf{x}_i, y_i\}_{i=1}^N$, where each instance $\mathbf{x}_i$ is described by a certain number $M$ of features, and $y_i$ is the usual target variable (i.e., what one wants to predict, given the inputs). For simplicity, let’s think of the $\mathbf{x}_i$ as houses with certain characteristics such as number of bedrooms, GPS coordinates, area of the living room, etc., and of the $y_i$ as their prices.
If one denotes the original model as $f$ and the explanation model as $g$, the latter can be expressed as:

$$g(\mathbf{x}') = \phi_0 + \sum_{j=1}^M \phi_j x'_j$$

The above formula means that the explanation model first applies an effect $\phi_j$ to every feature, and then sums up the effects of all feature attributions, together with a baseline term $\phi_0$, to approximate the output of the original model. An important fact to note is that the explanation model does not use the original input data points $\mathbf{x}_i$, but a simplified version $\mathbf{x}'_i$. Such simplified observations are binary vectors in which each entry $j$ may indicate either (i) whether the $j$-th feature is present or absent when predicting the output for a certain observation, or (ii) whether feature $j$ takes its original value or is set to a reference value (e.g., its average or median). Correspondingly, $\phi_j$ represents the effect of including that feature in the prediction, and it is referred to as the SHAP value. To compute such an effect, a model is trained with that feature included, and another model is trained without it. The predictions of the two models for the same input are then compared, and the difference gives a measure of the impact of that variable on the target. Because this impact can depend on which other features are present, the comparison is repeated over the possible subsets of the remaining features and the differences are combined in a weighted average. In the house example, the $\phi_j$ associated with the number of bedrooms would measure the impact on the final price of increasing the number of bedrooms by a certain amount with respect to a reference value.
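The idea can be sketched in a few lines of plain Python. Note the assumptions. Instead of retraining a model for every subset of features, the sketch marks a feature as "absent" by setting it to its reference value (option ii above). It also enumerates all subsets exactly, which is only feasible for a handful of features. The function `shapley_values`, the toy model `toy_price` and its coefficients are hypothetical names and numbers used purely for illustration. They are not part of the shap library.

```python
from itertools import combinations
from math import factorial


def shapley_values(f, x, reference):
    """Exact Shapley values of f at x, relative to a reference input.

    A feature outside the subset S is "absent", i.e. set to its reference value.
    """
    M = len(x)

    def value(subset):
        z = [x[j] if j in subset else reference[j] for j in range(M)]
        return f(z)

    phi = [0.0] * M
    for j in range(M):
        others = [k for k in range(M) if k != j]
        for size in range(M):
            # Shapley weight for subsets of this size: |S|! (M - |S| - 1)! / M!
            weight = factorial(size) * factorial(M - size - 1) / factorial(M)
            for S in combinations(others, size):
                with_j = value(set(S) | {j})
                without_j = value(set(S))
                phi[j] += weight * (with_j - without_j)

    phi_0 = value(set())  # model output when every feature is at its reference
    return phi_0, phi


def toy_price(z):
    """Hypothetical house-price model with an interaction term (made-up numbers)."""
    bedrooms, area = z
    return 50_000 + 30_000 * bedrooms + 1_000 * area + 50 * bedrooms * area


house = [3, 40]      # 3 bedrooms, 40 square metres of living room
reference = [2, 30]  # reference house, e.g. the median of each feature

phi_0, phi = shapley_values(toy_price, house, reference)
print("baseline phi_0:", phi_0)
print("phi (bedrooms, area):", phi)
print("phi_0 + sum(phi):", phi_0 + sum(phi), "vs model output:", toy_price(house))
```

The last line checks the additive form of the explanation model: the baseline plus the sum of the per-feature effects reproduces the prediction of the original model for this house.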

Let’s see the SHAP algorithm in action in a practical Python example. In the following code snippet, we build a K-nearest-neighbor (KNN) classifier to predict the probability of default of credit card holders from an important bank in Taiwan (see Reference [3] for additional details about the dataset).

First, beyond the typical libraries for data processing and machine learning (pandas, numpy, scikit-learn), one needs to load the shap library too. In order to use it, one has to initialise the visualisation tool with the command shap.initjs() and instantiate the explainer with the command shap.KernelExplainer(). With this command, one is explicitly asking for an explanation model for the KNN classifier, using the median value of each feature as reference. By calling the method explainer.shap_values() on the test data points, it is possible to compute how each feature contributes to the predicted probability of default.

By executing the above code, the following figure will be generated:

Figure: Relative impact of the most important features on the predicted probability of default of credit card holders from a Taiwanese bank.

The figure above shows the impact (i.e., the SHAP values) of the most important features on the predicted probability of default in October 2005, given information up to September 2005. The SHAP values of every feature for every sample are plotted on the x axis and are centered around 0, which corresponds to the impact of a feature that is set to its reference value (the median in this case). The color represents the feature value (red for high, blue for low). To understand how to interpret this figure, let’s take a look at the variable called PAY_0. This feature indicates the number of months of delay in past payments as of September 2005 (e.g., PAY_0 = 2 means that in September 2005 a certain client had not yet paid a debt that was due two months earlier). Hence, it seems that substantial delays in payments in September 2005 (red dots on the far right) may increase the log odds of default in October 2005 by up to 0.3, compared to the typical payment delay of all credit card holders (i.e., the median delay, whose impact is taken to be 0). Makes sense, right? The more one struggles with past payments, the less likely one is to be able to pay one’s debt. For more examples of using the SHAP algorithm to interpret model predictions, the interested reader can refer to [4].


Algorithmic bias has caused problems in many domains and contexts: from Amazon’s internal hiring tool that penalised female candidates, to commercial facial analysis and recognition services found to be less accurate for darker-skinned women than for lighter-skinned men, just to name a few.
Techniques such as SHAP, which explain the predictions produced by complex machine learning models, have the potential to greatly reduce the risks associated with algorithmic bias while increasing fairness and transparency in the decisions taken by AI tools. This will be immensely important, especially in critical domains like healthcare, banking and social media.


[1] https://www.technologyreview.com/s/613310/congress-wants-to-protect-you-from-biased-algorithms-deepfakes-and-other-bad-ai/

[2] Scott M. Lundberg and Su-In Lee, “A Unified Approach to Interpreting Model Predictions”, Advances in Neural Information Processing Systems 30, pages 4765-4774, 2017.

[3] I-Cheng Yeh and Che-hui Lien, “The comparisons of data mining techniques for the predictive accuracy of probability of default of credit card clients”, Expert Systems with Applications, Volume 36, Issue 2, Part 1, pages 2473-2480, 2009.

[4] SHAP (SHapley Additive exPlanations) official GitHub repository, https://github.com/slundberg/shap
