
Recently, I came across the paper Robust and Stable Black Box Explanations, which discusses a nice framework for global model-agnostic explanations.

I was thinking of recreating the experiments performed in the paper, but, unfortunately, the authors haven't provided the code. A summary of the experiments is:

  1. Use LIME, SHAP, and MUSE as the baseline methods, and compute the fidelity score on test data. (All 3 datasets are used for classification problems.)

  2. Since LIME and SHAP give local explanations for a particular data point, the idea is to take K points from the training dataset and create K explanations using LIME, which returns a local linear explanation for each. Then, for a new test data point, find the nearest of those K points and use its explanation to classify the new point.

  3. Measure the performance using the fidelity score (the % of points for which $E(x) = B(x)$, where $E(x)$ is the explanation's prediction for the point and $B(x)$ is the classification of the point by the black box); see the sketch right after this list.
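To make the metric concrete, here is a tiny hypothetical sketch of the fidelity computation, where `explanation_preds` holds the labels $E(x)$ produced by the assigned local explanations and `black_box_preds` holds the labels $B(x)$ produced by the black box (both arrays are made up for illustration):

```python
import numpy as np

# Hypothetical predictions for five test points
explanation_preds = np.array([0, 1, 1, 0, 1])  # E(x): labels from the local explanations
black_box_preds = np.array([0, 1, 0, 0, 1])    # B(x): labels from the black box

# Fidelity = fraction of points where the explanation agrees with the black box
fidelity = np.mean(explanation_preds == black_box_preds)
print(f"fidelity = {fidelity:.2f}")  # 0.80 for this made-up example
```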

Now, the issue is that I am using the LIME and SHAP packages in Python to reproduce the results of the baseline models.

However, I am not sure how to get the linear explanation for a point (one from the set of K points) and then use it to classify a new test point in its neighborhood.

Every tutorial on YouTube and Medium discusses visualizing the explanation for a given point, but none of them talks about how to get the linear model itself and use it on new points.


1 Answer


For LIME, the local surrogate model that is trained can be found in lime.lime_base.LimeBase.explain_instance_with_data, under the variable name "easy_model".
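That method fits the surrogate (a ridge regressor by default), and its outputs are exposed on the Explanation object returned by explain_instance as exp.intercept[label] and exp.local_exp[label] (the intercept and the (feature, weight) pairs). Below is a minimal sketch, not the paper's code, of how one might use that to implement the procedure from the question: explain K training points, reconstruct each local linear model, and classify a new test point with the model of its nearest explained neighbour. The dataset, K, and the helper local_linear_predict are illustrative choices; discretize_continuous=False is assumed so the surrogate is linear in the (standardized) raw features rather than in discretized bins; and the snippet reads the explainer's internal scaler (explainer.scaler), a detail that may vary across lime versions.

```python
# A minimal sketch (not the paper's code).  Assumptions: the lime and
# scikit-learn packages, a random forest as the black box, K = 20 explained
# training points, and discretize_continuous=False.  `local_linear_predict`
# is an illustrative helper, not part of the LIME API.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from lime.lime_tabular import LimeTabularExplainer

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

black_box = RandomForestClassifier(random_state=0).fit(X_train, y_train)

explainer = LimeTabularExplainer(
    X_train,
    mode="classification",
    discretize_continuous=False,  # keep the surrogate linear in the raw features
)

K = 20
anchors = X_train[:K]  # the K points whose explanations will be reused
explanations = [
    explainer.explain_instance(
        x, black_box.predict_proba, num_features=X.shape[1], labels=(1,)
    )
    for x in anchors
]

def local_linear_predict(exp, x, label=1):
    """Evaluate the surrogate g(x) = intercept + sum_i w_i * x_i and threshold it.

    With discretize_continuous=False, LIME fits the surrogate on standardized
    features, so x is standardized with the explainer's scaler first."""
    z = (x - explainer.scaler.mean_) / explainer.scaler.scale_
    g = exp.intercept[label]
    for feature_idx, weight in exp.local_exp[label]:
        g += weight * z[feature_idx]
    return int(g >= 0.5)  # g approximates P(label | x) near the explained point

# E(x): classify each test point with the explanation of its nearest anchor
explanation_preds = np.array([
    local_linear_predict(
        explanations[np.argmin(np.linalg.norm(anchors - x, axis=1))], x
    )
    for x in X_test
])

# B(x): the black box's own predictions, and the resulting fidelity
black_box_preds = black_box.predict(X_test)
print("fidelity:", np.mean(explanation_preds == black_box_preds))
```

Note that each local model is only fitted to be faithful in the neighbourhood of its anchor point, so applying it to a farther-away test point can disagree with the black box; that gap is exactly what the fidelity score measures.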
