Model Explanation

Measures how important each variable is for your model predictions
Sample model explanations with contribution from each variable

What are we trying to explain?

This explanation evaluates the importance of each of the variables considered by the model.

Remember that EXPAI also allows you to explain how the model works on a limited and meaningful subgroup of your data.

Why is it useful?

Information from this explanation can be used for different purposes.

For business

  • Model validation by experts: knowing the most important variables for a prediction may help experts validate whether the model works as expected.

  • Knowledge generation: this information may help humans discover unknown relevant features in the process.

  • Process optimization: getting to know how your model works will help you identify useless variables and focus your resources on the most relevant features. This knowledge will no longer come from intuition but from data.

For developers

  • Feature selection: variables which aren't relevant for the prediction can be removed from the data to increase efficiency.

  • Model comparison: discovering how different models behave on the same data may help you understand which one gets closer to the expected behaviour.

How we do it

This method is described in detail by Fisher et al. (2019). This section summarizes it so that anyone can understand the idea behind our algorithms.

Plain English

The intuition is quite simple: how does performance change if a variable is removed from our data? To measure this effect, we randomly permute the values of the variable across samples so that they no longer match the samples they came from.

What we expect, as presented by Breiman (2001a), is that the performance of the model decreases after the values of this variable are permuted. The larger the decrease, the more important the variable is.

If performance does not change after permuting the values, the model does not rely on this variable, since its predictions remain just as accurate once the variable's information has been removed.
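To make the permutation step concrete, here is a minimal NumPy sketch. The toy matrix and the choice of column are made up for illustration; this is not EXPAI's own code. It shows that shuffling one column keeps that column's values but reassigns them to other rows, while the remaining columns stay as they were.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

# Hypothetical toy data: 5 samples, 2 variables.
X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 30.0],
              [4.0, 40.0],
              [5.0, 50.0]])

X_perm = X.copy()
X_perm[:, 1] = rng.permutation(X[:, 1])  # shuffle only the second variable

# The shuffled column still holds exactly the same set of values...
same_values = sorted(X_perm[:, 1]) == sorted(X[:, 1])
# ...and the other variable is left untouched.
other_unchanged = np.array_equal(X_perm[:, 0], X[:, 0])

print(X_perm)
print(same_values, other_unchanged)
```

Because the column keeps its original values, the model still sees realistic inputs; only the link between this variable and each sample's target is broken.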

More formally

Let:

  • $f(X)$ be the model we are trying to explain.

  • $X$ be the matrix containing input data for the model.

  • $Z$ be the variable whose impact we want to compute, located in the $z$-th column of $X$.

  • $L(X, y, \hat{y})$ be the loss function used to measure our model performance given $X$, the ground-truth target $y$ and the prediction $\hat{y}$ made by the model.

Procedure:

  1. Execute the model on dataset $X$ to obtain $\hat{y}$.

  2. Compute the loss $L^0 = L(X, y, \hat{y})$ for this prediction.

  3. Generate $X'$ by permuting the $z$-th column of $X$, which contains variable $Z$.

  4. Execute the model on dataset $X'$ to obtain $\hat{y}'$.

  5. Compute the loss $L' = L(X', y, \hat{y}')$ for the prediction after permutation.

  6. Measure the importance of variable $Z$ by computing how much the loss grows after permutation:

    • Difference: $L' - L^0$

    • Ratio: $L' / L^0$

  7. Once the importance of every variable has been computed, sum the values and divide each one by the total to obtain the relative importance (%) of each variable.
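As a worked illustration of the last step, take three variables with made-up importance values $I_1, I_2, I_3$ (hypothetical numbers, not results from any real model), and assume that relative importance is each value divided by the sum of all values:

```latex
I_1 = 0.30, \quad I_2 = 0.15, \quad I_3 = 0.05,
\qquad \sum_{k} I_k = 0.50

\frac{I_1}{\sum_k I_k} = \frac{0.30}{0.50} = 60\%, \qquad
\frac{I_2}{\sum_k I_k} = \frac{0.15}{0.50} = 30\%, \qquad
\frac{I_3}{\sum_k I_k} = \frac{0.05}{0.50} = 10\%
```

The percentages add up to 100%, so each one can be read as the share of the total measured importance attributed to that variable.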

Notice that since permutation is a random process, slightly different results might be obtained for each execution.

To ensure robust results, permutation is performed 10 times and results are averaged.
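The whole procedure, including the repeated permutations, can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not EXPAI's implementation: the function names, the mean-squared-error loss, the toy model and the toy data are all hypothetical. The loss here takes only $y$ and $\hat{y}$ as a simplification, and importance is measured as the loss after permutation minus the baseline loss, so larger positive values mean a more important variable.

```python
import numpy as np


def mean_squared_error(y, y_hat):
    """Hypothetical loss: lower is better."""
    return float(np.mean((y - y_hat) ** 2))


def permutation_importance(predict, X, y, loss=mean_squared_error,
                           n_repeats=10, seed=0):
    """Return (difference, relative %) importance for each column of X."""
    rng = np.random.default_rng(seed)
    base_loss = loss(y, predict(X))          # baseline loss on the original data
    n_vars = X.shape[1]
    diffs = np.zeros(n_vars)
    for z in range(n_vars):
        perm_losses = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            X_perm[:, z] = rng.permutation(X[:, z])  # shuffle column z only
            perm_losses.append(loss(y, predict(X_perm)))
        # Average over repeats, then compare with the baseline loss.
        diffs[z] = np.mean(perm_losses) - base_loss
    total = diffs.sum()
    # Assumption: relative importance is each value's share of the total.
    relative = 100.0 * diffs / total if total > 0 else np.zeros(n_vars)
    return diffs, relative


# Toy data: the target depends on the first variable only.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 2))
y = 3.0 * X[:, 0] + rng.normal(scale=0.1, size=200)


def toy_model(X):
    """Stand-in for a fitted model; it ignores the second variable."""
    return 3.0 * X[:, 0]


diffs, relative = permutation_importance(toy_model, X, y)
print(diffs)
print(relative)
```

In this toy setup the second variable is never used by the model, so permuting it leaves the loss unchanged and nearly all of the relative importance goes to the first variable. Small negative differences can appear for irrelevant variables on real models; how to treat them before normalising is a design choice this sketch leaves open.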

References

Fisher, Aaron, Cynthia Rudin, and Francesca Dominici. 2019. “All Models Are Wrong, but Many Are Useful: Learning a Variable’s Importance by Studying an Entire Class of Prediction Models Simultaneously.” Journal of Machine Learning Research 20 (177): 1–81. http://jmlr.org/papers/v20/18-760.html.

Breiman, Leo. 2001a. “Random Forests.” Machine Learning 45 (1): 5–32. https://doi.org/10.1023/a:1010933404324.