knitr::opts_chunk$set( collapse = FALSE, comment = "#>", fig.width = 7, fig.height = 3.5, warning = FALSE, message = FALSE )
We adress the problem of insuficient interpretability of explanations for domain experts. We solve this issue by introducing
describe() function, which automaticly generates natural language descriptions of explanations generated with
iBreakDown package allows for generating feature attribution explanations. Feature attribution explanations justify a model's prediction by showing which of the model's variables affect the prediction and to what extent. It is done by attaching to each variable an importance coefficient, which sum should approximate the model's prediction.
There are two methods used by
shap() generates a SHAP explanation, that is the function assigns Shapley values to each variable. Function
break_down uses break_down algorithm to generate an efficient approximation of the Shapley values. We show how to generate both explanations on an easy example using titanic data set and explainers from
First, we load the data set and build a random forest model classifying, which of the passangers survived sinking of the titanic. Then, using
DALEX package, we generate an explainer of the model. Lastly we select a random passanger, which prediction's should be explained.
library("DALEX") library("iBreakDown") library("randomForest") titanic <- na.omit(titanic) model_titanic_rf <- randomForest(survived == "yes" ~ ., data = titanic ) explain_titanic_rf <- explain(model_titanic_rf, data = titanic[,-9], y = titanic$survived == "yes", label = "Random Forest") passanger <- titanic[sample(nrow(titanic), 1) ,-9] passanger
Now we are ready for generating
bd_rf <- break_down(explain_titanic_rf, passanger, keep_distributions = TRUE) # distributions should be kept shap_rf <- shap(explain_titanic_rf, passanger) plot(bd_rf) plot(shap_rf)
The displayed explanations despite their visual clarity may not be interpretable for someone not familiar with iBreakDown or shap explanation. Therefore, we generate a simple natural language description for both explainers.
Natural language descriptions should be flexible enough to generate a description with a desired level of specificity and lenght. We describe the parameters used for describing both explanations. As both explanatios have the same parameters, we turn our attention to describe the iBreakDown explanation.
The nonsignificance treshold controls which predictions are close to the average prediction. By setting a higher value, more predictions will be described as close to the average model prediction and more variables will be described as nonsignificant.
describe(bd_rf, nonsignificance_treshold = 1)
Label of the prediction could be changed, to display more specific descriptions.
describe(bd_rf, label = "the passanger survived with probability")
Generating short descriptions can be useful, as they can make nice plot subtitles.
describe(bd_rf, short_description = TRUE)
Displaying variable values can easily make the description more informative.
describe(bd_rf, display_values = TRUE)
Displaying numbers changes the whole argumentation style making the description longer.
describe(bd_rf, display_numbers = TRUE)
Describing distribution details is useful if we want to have a big picture about other instance's behaviour.
describe(bd_rf, display_distribution_details = TRUE)
Explanations generated by
shap() functions have the same arguments expect from
display_shap what add an addition information, wheter the calculated variable's contributions have high or low variability.
describe(shap_rf, display_shap = TRUE)
Of course all the arguments can be set according to preferences allowing for flexible natural language descriptions.
describe(shap_rf, label = "the passanger survived with probability", display_values = TRUE, display_numbers = TRUE, display_shap = TRUE)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.