In recent decades, machine learning has experienced tremendous success and rapid innovation. While this is exciting news, it comes at a price. The more non-linear patterns machine learning models are able to detect, the harder it becomes to understand and explain their predictions. Why is explaining predictions important in machine learning? Wouldn’t it be enough to just enjoy the results? Not really. One reason is accountability.
Reasons for explainable AI
The US Congress recently introduced the Algorithmic Accountability Act (see ). Other countries are considering similar laws. Under these regulations, companies using AI tools for decision support have to audit their machine learning models for bias and discrimination. Explaining how machine learning models make their predictions is therefore becoming mandatory. Moreover, some sensitive industries (healthcare, banking, industrial plants, etc.) need AI models that operate in a clear, well-defined way. In these domains a wrong decision could have a serious impact on people’s daily lives. Consider some examples. A healthcare algorithm could decide that a patient is not suitable for a certain therapy. A banking algorithm could deny credit to a person. An AI for industrial control may fail to notice that a plant is about to explode.
A useful tool called SHAP helps to interpret AI models.
The Shapley explanation algorithm (SHAP)
SHAP aims to explain the predictions made by machine learning models. How does it work? Consider the simplest of all models, linear regression. In this case, the coefficients tell you the influence of each feature on the final result. In contrast, for nonlinear methods (e.g., deep neural networks or random forests) the relative importance of each input is not clear. That’s why we need an interpretable approximation of the original model. We’ll call that approximate model an explanation model. SHAP builds such an explanation model.
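To make the linear regression case concrete, here is a minimal sketch. It assumes synthetic, noise-free data and illustrative names (`true_w`, `contributions`) that are not part of the original article. For a linear model, the contribution of each feature to one prediction, relative to an average input, is simply its coefficient times the feature's deviation from the mean.

```python
import numpy as np

# Synthetic data (assumption): three features and a noise-free linear target.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_w = np.array([3.0, -2.0, 0.5])
y = X @ true_w + 1.0

# Fit ordinary least squares with an intercept column.
A = np.column_stack([X, np.ones(len(X))])
solution, *_ = np.linalg.lstsq(A, y, rcond=None)
weights, intercept = solution[:3], solution[3]

def predict(x):
    """Prediction of the fitted linear model for one input vector."""
    return x @ weights + intercept

# Per-feature contribution for one instance, relative to the mean input.
x = X[0]
reference = X.mean(axis=0)
contributions = weights * (x - reference)

print("coefficients:", weights)
print("contributions:", contributions)
```

The contributions add up to the difference between the prediction for `x` and the prediction for the mean input, which is the kind of additive breakdown that is easy to read off a linear model and hard to obtain for a nonlinear one.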
Suppose you have a dataset of N data points $(x_i, y_i)$, with $i = 1, \dots, N$. Each instance $x_i$ has a certain number M of features, and $y_i$ is what you want to predict (the target variable).
For example, the $x_i$ are houses with certain features (number of bedrooms, GPS coordinates, area of the living room, etc.), and the $y_i$ are their listing prices.
Let’s call the original model f and the explanation model g. You can write the latter as:
$$g(z') = \phi_0 + \sum_{j=1}^{M} \phi_j z'_j$$
Here $\phi_0$ is the base value, i.e. the output when no feature is included. The explanation model first applies an effect $\phi_j$ to every feature, and then it sums up the effects of all feature attributions to approximate the output of the original model. Note that the explanation model does not use the original input data points $x$, but a simplified version $z'$. These simplified observations are binary vectors. Each entry $z'_j$ may indicate:
- the presence or absence of the j-th feature when predicting the output for a certain observation
- whether the j-th feature is set to a reference value (e.g., its average or median) or takes its original value
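The second reading of the simplified binary vector can be sketched in a few lines. This is an illustration only: the function name `to_original_space`, the three house features and their values are assumptions made for the example.

```python
import numpy as np

def to_original_space(z, x, reference):
    """Map a binary vector to an input: 1 keeps the original value, 0 uses the reference."""
    z = np.asarray(z, dtype=bool)
    return np.where(z, x, reference)

# Hypothetical house: bedrooms, total area, living room area.
x = np.array([4.0, 120.0, 35.0])
# Hypothetical reference values (e.g., medians over the dataset).
reference = np.array([3.0, 90.0, 30.0])

# Keep bedrooms and living room area, replace total area with its reference.
masked = to_original_space([1, 0, 1], x, reference)
print(masked)
```

Feeding such masked inputs to the original model is one way to see how the prediction changes when a feature is switched on or off.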
Correspondingly, $\phi_j$ (the SHAP value) represents the effect of including the j-th feature in the prediction. To compute this effect, we train one model with that feature included and another one without it. Then we compare the predictions of the two models given the same input. The difference between these predictions measures the impact of that feature on the target.
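Because the effect of a feature can depend on which other features are present, the SHAP paper by Lundberg et al. (listed in the references) averages this difference over every subset of the remaining features, using the classic Shapley value weights. As a sketch of that formula, with notation assumed here: F is the set of all M features, S is a subset that excludes feature j, and $f_S$ is the model restricted to the features in S.

```latex
\phi_j = \sum_{S \subseteq F \setminus \{j\}}
  \frac{|S|!\,(M - |S| - 1)!}{M!}
  \left[ f_{S \cup \{j\}}\!\left(x_{S \cup \{j\}}\right) - f_S\!\left(x_S\right) \right]
```

The weight in front of each difference counts how many orderings of the features place exactly the members of S before feature j.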
In the house example, the $\phi_j$ associated with the number of bedrooms would measure the impact on the final price of the house of increasing the number of bedrooms by a certain amount with respect to a reference value.
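The house example can be worked through by brute force. The sketch below is not the SHAP library's implementation: it assumes a made-up price function with two features, and it "removes" a feature by setting it to a reference value instead of retraining the model. All names and numbers are hypothetical.

```python
from itertools import combinations
from math import factorial

def shapley_values(model, x, reference):
    """Exact Shapley values by enumerating every subset of the other features."""
    m = len(x)

    def value(subset):
        # Features in `subset` keep their original value, the rest use the reference.
        masked = [x[k] if k in subset else reference[k] for k in range(m)]
        return model(masked)

    phis = []
    for j in range(m):
        others = [k for k in range(m) if k != j]
        phi = 0.0
        for size in range(m):
            weight = factorial(size) * factorial(m - size - 1) / factorial(m)
            for subset in combinations(others, size):
                with_j = value(set(subset) | {j})
                without_j = value(set(subset))
                phi += weight * (with_j - without_j)
        phis.append(phi)
    return phis

def house_price(features):
    """Hypothetical nonlinear price model: bedrooms and area interact."""
    bedrooms, area = features
    return 50_000 + 20_000 * bedrooms + 1_000 * area + 100 * bedrooms * area

x = [4, 120]          # the house to explain: 4 bedrooms, 120 square metres
reference = [3, 90]   # assumed reference house
phi = shapley_values(house_price, x, reference)
base_value = house_price(reference)

print("base value:", base_value)
print("effect of bedrooms:", phi[0])
print("effect of area:", phi[1])
```

The base value plus the two effects equals the model's price for the house being explained, so the effects split the gap between the reference house and the actual one. Enumerating all subsets grows exponentially with the number of features, which is why practical tools rely on approximations.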
SHAP in action
Let’s see the SHAP algorithm in action in a practical Python example. In the following code snippet, we build a K-nearest-neighbor (KNN) classifier to predict the probability of default of credit card holders from an important bank in Taiwan (see Yeh et al. in the references for additional details about the dataset).
A quick tour of the code. First you load the typical libraries for data processing and machine learning (scikit-learn). Then you have to load the shap library too. Before use, you must initialise the visualisation tool using shap.initjs(). Then you call the explainer with shap.KernelExplainer(). This command produces an explanation model for the KNN classifier. This explanation model uses the median value of each feature as reference. It’s possible to compute the relative weight of each feature on the prediction by calling the method explainer.shap_values(). Of course, you should do that on the testing dataset.
Results and their interpretation
Running the code produces a figure that shows the impact (i.e., the SHAP values) of the most important features on the predicted probability of default in October 2005, given information up to September 2005. The SHAP values of every feature for every sample are plotted on the x axis and are centered around 0, which represents the impact of that feature if it is set to its reference value (the median in this case). The color represents the feature value (red high, blue low).
For example, let’s take a look at the variable called PAY_0. This feature indicates the number of months of delay in past payments as of September 2005 (e.g., PAY_0 = 2 means that a certain client had not paid a debt that was due 2 months before). Hence, it seems that consistent delays in payments (red dots on the far right) may increase the log odds of default by up to 0.3, compared to the median delay (whose impact is considered 0). Makes sense, right? The more one struggles with past payments, the less likely one is to be able to pay one’s debt. For more examples of using the SHAP algorithm, refer to the official SHAP GitHub repository listed in the references.
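To get a feel for what a shift of 0.3 in log odds means, it can be converted back to a probability. The sketch below assumes a hypothetical baseline default probability of 20%. That figure is chosen only for illustration and is not taken from the dataset.

```python
import math

def shift_log_odds(probability, delta):
    """Add `delta` to the log odds of `probability` and convert back to a probability."""
    log_odds = math.log(probability / (1.0 - probability))
    return 1.0 / (1.0 + math.exp(-(log_odds + delta)))

baseline = 0.20  # hypothetical baseline probability of default
shifted = shift_log_odds(baseline, 0.3)

print(f"baseline: {baseline:.3f}, after +0.3 log odds: {shifted:.3f}")
```

Under this assumption, the shift raises the probability of default from 20% to roughly 25%. The same log-odds shift produces a different change in probability for a different baseline.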
Algorithmic bias has caused problems in many cases: Amazon’s internal hiring tool that penalised female candidates and facial recognition software found to be accurate only for fair-skinned men are just two examples.
Techniques such as SHAP, which explain the predictions produced by machine learning models, can greatly reduce the risks associated with algorithmic bias and increase fairness and transparency in decisions taken by AI tools. This will be immensely important, especially in critical domains like healthcare, banking and social media.
Scott M. Lundberg, et al., “A Unified Approach to Interpreting Model Predictions”, Advances in Neural Information Processing Systems 30, pages 4765-4774, 2017.
 I-Cheng Yeh, et al., “The comparisons of data mining techniques for the predictive accuracy of probability of default of credit card clients”, Expert Systems with Applications, Volume 36, Issue 2, Part 1, pages 2473-2480, 2009.
SHAP (SHapley Additive exPlanations) official GitHub repository: https://github.com/slundberg/shap