Data Science at Microsoft
[1] Introduction
Machine Learning (ML) models are used across a wide range of applications, including product analytics, building new capabilities, analyzing existing limitations, identifying anomalies, credit lending, motivating business actions, and more. In many applications, ML models are even being used for high-stakes decisions such as cancer detection, vaccine production, and industrial manufacturing. In these scenarios, understanding not only the outcome but also how the outcome is reached becomes important. In today’s data-driven world, ML models are sometimes viewed as something of an “oracle,” providing insights by correlating data points in ways that may otherwise seem hard to fathom. As these models have become a core component in motivating our prioritization and decision-making processes, there is a strong need to understand why a model makes its predictions.
The information about how a model learns the underlying patterns within the dataset to make accurate predictions is embedded within the “weights” of its hidden layers. These distributions of seemingly meaningless numbers can be quite complex to interpret. Often, even the model developers do not fully understand why a model makes certain predictions, assuming that if the overall accuracy is good, the model can be pushed to production. This knowledge gap poses additional challenges when accounting for edge cases or preventing adversaries from exploiting the model. Understanding the contribution of each weight toward the output prediction can give us valuable insights into root causes, which can then be used to make meaningful changes over time.
For example, doctors relying on a model predicting the severity of COVID-19 in a patient may want to know which of the factors influence the outcome so that they can preemptively act in a timely manner to prevent the high-risk, high-influence factors from elevating to dangerous levels. In this case, the reasoning behind the decision is more important than the outputs of the model itself.
On a similar note, banks using a model to determine the risk score associated with a user applying for a loan typically want to know why some users are tagged as being risky so that they do not lose out on potential customers. Furthermore, they may also want to understand why a user is tagged as being “safe” as an aid to critically analyzing the model’s reasoning to help mitigate a potential default or even a potentially fraudulent scenario.
[2] What is model explainability?
Model explainability is the process of analyzing and surfacing the inner workings of a Machine Learning model or other “black box” algorithms to make them more transparent. This is done by looking at the contribution of each of the data points (or features) toward the outputs (also known as feature importance) or by visualizing different views of the data or model weights for further review. Model explainability aims to answer the question “Why is the model making this prediction?”
[3] Why is model explainability important?
Developing models that are easily explainable can help in a variety of ways:
Determining whether there is unconscious bias in the model when it makes certain predictions. Identifying bias is crucial when developing solutions that are intended to be rolled out to millions of users all over the world. Among the many scenarios that can negatively affect the end user experience are a skewed composition of the data considered while training the model, the presence of heavy outliers distorting the outputs, the lack of sufficient representation of a cohort, and cherry-picking an optimal subset of training data.
Better explaining model results and building trust with stakeholders. Understanding which features contribute positively and negatively toward a prediction can help end users get better visibility and build trust in these “black box” algorithms. This also helps to surface limitations or assumptions made during the model build phase so that users aren’t potentially misled with false claims of a model’s prowess.
Helping identify unforeseen factors affecting the outputs of an ML model. In the case of large datasets, the model learns to correlate various data points and evaluate them against the loss function to make a prediction. There might exist certain correlations within the dataset that aren’t apparent to the developer, but which can be crucial for the application. Surfacing these during the model development phase enables better debugging capabilities and building fair solutions.
Helping validate a model and monitoring the impact of its decisions on humans. This is especially important during audits by regulatory authorities to determine whether a model is safe to be deployed across multiple users around the world.
In this two-part article series, we present a centralized, model-agnostic explainability framework supporting a Bring-Your-Own-Model (BYOM) architecture. We review some of the explainability techniques we’ve implemented, describe our approach to viewing important model metrics and exploring datasets, and cover the various adoption options available to those who want to leverage this work for their own use cases.
[4] A proposed explainability framework
In our first version of an explainability framework, we focus on explanation techniques for classification and regression models trained on tabular datasets. Our framework integrates various local and global explanation techniques as a one-stop shop experience where users can leverage outputs from multiple perspectives with minimal additional work. This eliminates the need to write specific code for each technique, as users can view outputs from different algorithms using our common interface. Our centralized model-agnostic architecture also allows us to easily scale to include new explanation techniques (such as CEM explainers) or explanations for models such as Deep Neural Networks (DNN) for Natural Language Processing (NLP) and forecasting applications in the future. Our framework enables:
A straightforward and efficient way to compute local and global feature importance using differing techniques.
The ability to visualize outputs in a consistent, understandable manner that helps make them simple to consume and helps drive actionable insights.
Ease of scaling to include new state-of-the-art techniques and a centralized source of maintenance.
Straightforward integration with existing production pipelines.
Ease of model adoption.
Figure 1 summarizes the two main explainability workflows and their respective outputs in our proposed framework.
Figure 1: Proposed Explainability Framework architecture.
[4.1] Framework inputs
Our framework takes in two inputs:
The trained model, usually in the form of a pickle file. This can be an off-the-shelf model (such as sklearn or lightGBM, among others) or a custom model built for specific applications.
The training dataset, which is the dataframe containing the training data in the form of a .parquet, .csv, or .tsv file.
These inputs can be sourced from a storage container or passed as arguments during runtime depending on the adoption method.
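For illustration, loading these two inputs might look like the following sketch; the file names and paths here are placeholders rather than the framework’s actual configuration.
import pickle
import pandas as pd

# Hypothetical locations; in practice these come from a storage container or
# are passed as runtime arguments, depending on the adoption method.
MODEL_PATH = "trained_model.pkl"
DATA_PATH = "training_data.parquet"

# Load the trained model from its pickle file.
with open(MODEL_PATH, "rb") as f:
    trained_model = pickle.load(f)

# Load the training dataset (.csv/.tsv files would use pd.read_csv instead).
training_df = pd.read_parquet(DATA_PATH)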
[4.2] Explanation techniques
Once the trained model and training dataset are available, users have the option to choose between two different explainability tracks:
Track 1 (shown in green): Utilizes state-of-the-art model-agnostic explainability techniques to create global and local explanations.
Track 2 (shown in blue): Leverages the Azure Interpret ML package to perform its analysis and output an interactive dashboard in addition to local and global explanation plots and values.
The main difference between these tracks is the “interactive dashboard” output, which allows users to perform “what if” analysis by changing various feature points and distributions and then observing the effect on the output variable.
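As a rough sketch of what Track 2 produces from a user’s point of view, the snippet below uses the interpret-community and raiwidgets packages; the exact packages, wrappers, and call signatures in our framework may differ, so treat these names as assumptions. It also assumes the trained_model loaded above and train/test splits X_train, X_test, and y_test derived from the training dataset.
from interpret_community import TabularExplainer
from raiwidgets import ExplanationDashboard

# Build global (and, implicitly, local) explanations for a trained model,
# using the training data to initialize the explainer.
explainer = TabularExplainer(trained_model, X_train)
global_explanation = explainer.explain_global(X_test)

# Launch the interactive dashboard used for "what if"-style exploration.
ExplanationDashboard(global_explanation, trained_model, dataset=X_test, true_y=y_test)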
[4.3] Framework outputs
We take a detailed look at each of the implemented explainability techniques and their outputs in the following section. The reference code contains pre-trained models and datasets that you can use for quick implementation.
[4.3.1] SHAP explanations
SHAP (SHapley Additive exPlanations) is an approach rooted in game theory for explaining the output of a Machine Learning model. It connects optimal credit allocation with local explanations, using the classic Shapley values from cooperative game theory and their related extensions to compute importance. SHAP assigns each feature an “importance” value for a particular prediction. The general idea is that the most relevant features are the ones that result in the largest changes to the model outputs when removed. We apply SHAP-based analysis from both local and global perspectives to provide a comprehensive view. Its main advantages include:
Identification of a new class of additive feature-importance measures.
Theoretical results showing that there is a unique solution in this class with a set of desirable properties.
Figure 2: Global explainability (beeswarm) visualization depicting the top contributors for the model.
The global SHAP analysis outputs are highlighted in Figure 2. This example provides an overview of the top contributors in the model’s prediction. In certain business scenarios we may be interested in understanding the actions to be taken for each data point. In this case, local SHAP analysis can be leveraged to explain the key features that decide each row’s prediction. This is shown in Figure 3.
Figure 3: Local explainability visualization depicting the importance of a specific feature.
The following example shows how the SHAP explainer works. Here we look at the SpeedDating dataset, in which participants were asked whether they would like to see their date again. They were also asked to rate their date on six attributes: Attractiveness, Sincerity, Intelligence, Fun, Ambition, and Shared Interests. The dataset also includes questionnaire data gathered from participants at different points in the process.
import flaml
from flaml.data import load_openml_dataset

# Retrieve openML speed dating dataset
X_train, X_test, y_train, y_test = load_openml_dataset(dataset_id=40536, data_dir='./')
This dataset contains 59 numeric features, 61 categorical features, and two classes: “1” and “0”. Let’s train a classification model to look at the explanation outputs.
from lightgbm import LGBMClassifier

# Train classification model
lgbm_model = LGBMClassifier()
lgbm_model.fit(X_train, y_train)
One of the challenges with SHAP lies in its inability to process categorical and text-based features. To use SHAP on categorical features, we need to encode them into a numerical format. We can leverage the preprocessing module of sklearn to ease this process.
from sklearn import preprocessing

def encode_features(Data):
    # Function to encode categorical features
    le = preprocessing.LabelEncoder()
    # Define list of categorical columns
    cat_list = [col for col in Data.columns.tolist() if Data[col].dtype == 'category']
    # Transform required columns
    for col in cat_list:
        Data[col] = le.fit_transform(Data[col])
    return Data

X_train = encode_features(X_train)
X_test = encode_features(X_test)
We initialize the SHAP explainer (tree explainer in this case) by providing the model as a parameter and later calculating the Shapley values by providing the training data.
import numpy as np
import shap

explainer = shap.TreeExplainer(lgbm_model)
shap_values = explainer.shap_values(X_train)
Finally, we can use the SHAP explainer as a local explainer to look at the features that contributed toward the prediction of a single datapoint. For example, let’s look at the first row of the dataset.
shap.force_plot(explainer.expected_value[0], shap_values[0][0], X_train.iloc[0,:])
Figure 4: Local explanations of the model’s prediction using SHAP explainer.
SHAP highlights the features that contributed to the data point’s prediction as shown in Figure 4. The explanation above shows features each contributing to push the model output from the base value (the average model output over the training dataset we passed) to the model output. Features pushing the prediction higher are shown in red, and those pushing the prediction lower are in blue.
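As a quick sanity check of this additive behavior, the base value plus the row’s SHAP values should approximately reconstruct the model’s raw (log-odds) output for that row. The following is a minimal sketch that assumes the list-of-arrays output shape used above, which can vary across SHAP and LightGBM versions.
import numpy as np

# Raw (log-odds) margin that LightGBM produces for the first training row.
raw_margin = lgbm_model.predict(X_train.iloc[[0]], raw_score=True)[0]

# Base value plus the sum of per-feature SHAP values for the same row and class.
reconstructed = explainer.expected_value[-1] + shap_values[-1][0].sum()

print(np.isclose(raw_margin, reconstructed, atol=1e-6))  # expected: True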
shap.summary_plot(shap_values)
SHAP also provides global importance of features as a summarized result. From the following plot in Figure 5, we can identify features such as “like”, “attractive_o”, and “attractive_partner” as important global features.
Figure 5: Summarized global explanations of the model’s prediction using SHAP explainer.
This tutorial provides additional examples on using SHAP on different kinds of datasets.
[4.3.2] FastSHAP explanations
One of the main challenges with using SHAP is its expensive computation. To address this issue, FastSHAP was built to accelerate execution by supporting parallel computing. FastSHAP is an amortized approach for calculating Shapley value estimates for large datasets: it involves training an explainer model to output Shapley value estimates in a single forward pass. By comparison, parallel computing is not enabled in the SHAP package, except when interpreting XGBoost, LightGBM, and CatBoost models, where the SHAP package internally leverages the TreeSHAP functions built into those packages. The FastSHAP outputs are identical to the outputs generated by the SHAP algorithm.
The following example shows how the FastSHAP explainer works. Here we look at the diabetes dataset, which is used to predict whether a woman has diabetes based on characteristic features.
import flaml
from flaml.data import load_openml_dataset

# Retrieve openML diabetes dataset
X_train, X_test, y_train, y_test = load_openml_dataset(dataset_id=37, data_dir='./')
This dataset contains eight numeric features and two classes: “tested_positive” and “tested_negative”. Let’s train a classification model to look at the explanation outputs.
from lightgbm import LGBMClassifier

# Train classification model
lgbm_model = LGBMClassifier()
lgbm_model.fit(X_train, y_train)
We initialize the FastSHAP explainer (tree explainer in this case) by providing the model and algorithm as parameters. The algorithm argument specifies the TreeSHAP algorithm used to run FastTreeSHAP. It can take values “v0”, “v1”, “v2”, or “auto”, and its default value is “auto”. We later calculate the fast Shapley values on the entire test dataset.
import fasttreeshap

explainer = fasttreeshap.TreeExplainer(lgbm_model, algorithm="auto", n_jobs=-1)
shap_values = explainer(X_test).values
To further extract fast Shapley values for individual rows we query “shap_values” by row index.
idx = 0
output = shap_values[idx]
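Beyond per-row values, a common way to summarize these local values into a global ranking is to average their absolute magnitudes per feature. The following is a minimal sketch, assuming the values array has shape (rows, features) or (rows, features, classes).
import numpy as np
import pandas as pd

# Keep the last class axis if one is present, then average absolute values per feature.
values_2d = shap_values if shap_values.ndim == 2 else shap_values[..., -1]
global_importance = pd.Series(np.abs(values_2d).mean(axis=0), index=X_test.columns)
print(global_importance.sort_values(ascending=False).head())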
This tutorial provides additional examples of using FastSHAP on different kinds of datasets.
[4.3.3] LIME explanations
LIME (Local Interpretable Model-agnostic Explanations) is a state-of-the-art local explainability method that identifies the importance of features for a class prediction using linear approximation. For each datapoint, it creates multiple test points around the original datapoint in the data space by perturbing it. It then fits a linear model (for example, a linear regression model) on these perturbed points and uses the feature importance of this linear model as the local explanation of the original datapoint. It is model agnostic because it requires only the model’s prediction function and not its internal details.
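To make this intuition concrete, the following is a simplified, conceptual sketch of the perturb-and-fit idea rather than the actual LIME implementation; the helper name lime_sketch and its parameters are illustrative, and it assumes a fitted classifier with a predict_proba method and an all-numeric feature DataFrame.
import numpy as np
from sklearn.linear_model import Ridge

def lime_sketch(model, X, row_idx, n_samples=500, kernel_width=1.0, seed=0):
    rng = np.random.default_rng(seed)
    x0 = X.iloc[row_idx].to_numpy(dtype=float)
    scale = X.std().to_numpy(dtype=float) + 1e-8
    # Perturb the row with Gaussian noise scaled by each feature's spread.
    perturbed = x0 + rng.normal(size=(n_samples, x0.shape[0])) * scale
    preds = model.predict_proba(perturbed)[:, 1]
    # Weight perturbed samples by their proximity to the original row.
    distances = np.linalg.norm((perturbed - x0) / scale, axis=1)
    weights = np.exp(-(distances ** 2) / (kernel_width ** 2))
    # Fit a weighted linear surrogate; its coefficients act as local importances.
    surrogate = Ridge(alpha=1.0).fit(perturbed, preds, sample_weight=weights)
    return dict(zip(X.columns, surrogate.coef_))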
The following example shows how the LIME explainer works. Here we look at the Credit-G dataset, which classifies people described by a set of features as good or bad credit risks.
import flaml
from flaml.data import load_openml_dataset

# Retrieve openML credit-g dataset
X_train, X_test, y_train, y_test = load_openml_dataset(dataset_id=31, data_dir='./')
This dataset contains seven numeric features, 14 categorical features, and two classes: “good” and “bad”. Let’s train a classification model to look at the explanation outputs.
from lightgbm import LGBMClassifier

# Train classification model
lgbm_model = LGBMClassifier()
lgbm_model.fit(X_train, y_train)
One of the challenges with LIME lies in its inability to process categorical and text-based features. To use LIME on categorical features, we need to encode them into a numerical format. We can leverage the preprocessing module of sklearn to ease this process.
from sklearn import preprocessing

def encode_features(Data):
    # Function to encode categorical features
    le = preprocessing.LabelEncoder()
    # Define list of categorical columns
    cat_list = [col for col in Data.columns.tolist() if Data[col].dtype == 'category']
    # Transform required columns
    for col in cat_list:
        Data[col] = le.fit_transform(Data[col])
    return Data

X_train = encode_features(X_train)
X_test = encode_features(X_test)
We initialize the LIME explainer (tabular explainer in this case) by providing the training data, feature names, class names, and the mode (either classification or regression).
import numpy as np
from lime import lime_tabular

lime_explainer = lime_tabular.LimeTabularExplainer(
    training_data=np.array(X_train),
    feature_names=X_train.columns,
    class_names=lgbm_model.classes_,
    mode='classification'
)
Finally, we can use the LIME explainer to look at the features that contributed toward the prediction of a single datapoint. For example, let’s look at row 76 of the test dataset. Note: predict_fn would expect the “model.predict” function instead of “model.predict_proba” in case of regression tasks.
lime_results = lime_explainer.explain_instance(
    data_row=X_test.iloc[75],
    predict_fn=lgbm_model.predict_proba
)
lime_results.show_in_notebook(show_table=True)
Figure 6: Local explanations of the model’s predictions using LIME explainer.
LIME highlights the features that contributed to the data point’s prediction as shown in Figure 6. The features around the customer’s checking status, strong credit, duration of engagement, employment, and so on contributed positively toward them being considered a good customer with respect to securing the loan. On the other hand, their savings history and installment commitment decreased this probability.
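If the explanation needs to be consumed programmatically rather than viewed in a notebook, the same Explanation object can also be read as plain feature/weight pairs, for example:
# Feature/weight pairs behind the plot in Figure 6, as plain Python data.
for feature_rule, weight in lime_results.as_list():
    print(f"{feature_rule}: {weight:+.4f}")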
This tutorial provides additional examples on using LIME on different kinds of datasets.
[4.3.4] Anchors explanations
Anchors is a high-precision, model-agnostic explanation technique. It identifies a local, sufficient set of features for a model prediction. In other words, the features identified as anchors are sufficient to guarantee the class prediction with high probability. A change in the anchor features may lead to a change in the class prediction, whereas changes to the remaining features do not matter and will not affect the class prediction. Anchor explanations offer higher human precision than linear explanations and require less effort to understand.
The following example provides an implementation overview of the Anchors method. Here we use the famous Adult dataset that classifies whether a person would make over $50,000 per year or not based on features around their age, education, work experience, relationship, and capital.
import flaml
from flaml.data import load_openml_dataset

# Retrieve openML adult dataset
X_train, X_test, y_train, y_test = load_openml_dataset(dataset_id=1590, data_dir='./')
This dataset contains six numeric and nine categorical features and two classes: “>50K” and “<=50K”.
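To illustrate how an Anchors explanation can be produced for this dataset, the following is a minimal sketch. It assumes the alibi package and reuses the encode_features helper shown earlier; the package choice, parameters, and variable names are illustrative assumptions rather than our framework’s exact implementation.
from lightgbm import LGBMClassifier
from sklearn import preprocessing
from alibi.explainers import AnchorTabular

# Encode categorical features and labels (reusing the encode_features helper
# defined earlier); the tabular explainer expects numeric inputs.
X_train = encode_features(X_train)
X_test = encode_features(X_test)
label_encoder = preprocessing.LabelEncoder()
y_train_enc = label_encoder.fit_transform(y_train)

# Train classification model
lgbm_model = LGBMClassifier()
lgbm_model.fit(X_train, y_train_enc)

# Initialize the Anchors explainer with the model's prediction function.
anchor_explainer = AnchorTabular(
    predictor=lgbm_model.predict,
    feature_names=X_train.columns.tolist(),
)
anchor_explainer.fit(X_train.values)

# Explain a single row: the anchor is a rule that pins down the predicted
# class with high precision for similar datapoints.
explanation = anchor_explainer.explain(X_test.values[0], threshold=0.95)
print("Anchor:", " AND ".join(explanation.anchor))
print("Precision:", explanation.precision)
print("Coverage:", explanation.coverage)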