describe: Natural language description of feature importance explainer


View source: R/describe_ceteris_paribus.R

Description

The generic function describe() generates a natural language description of ceteris_paribus(), aggregate_profiles() and feature_importance() explanations, which enhances their interpretability.

Usage

## S3 method for class 'partial_dependence_explainer'
describe(
  x,
  nonsignificance_treshold = 0.15,
  ...,
  display_values = FALSE,
  display_numbers = FALSE,
  variables = NULL,
  label = "prediction"
)

describe(x, ...)

## S3 method for class 'ceteris_paribus_explainer'
describe(
  x,
  nonsignificance_treshold = 0.15,
  ...,
  display_values = FALSE,
  display_numbers = FALSE,
  variables = NULL,
  label = "prediction"
)

## S3 method for class 'feature_importance_explainer'
describe(x, nonsignificance_treshold = 0.15, ...)

Arguments

x

an explanation to be described, produced with ceteris_paribus(), aggregate_profiles() or feature_importance()

nonsignificance_treshold

a parameter specifying a threshold for variable importance

...

other arguments

display_values

allows for displaying variable values

display_numbers

allows for displaying numerical values

variables

a character string naming the single variable to be described

label

label for model's prediction

Details

Function describe.ceteris_paribus() generates a natural language description of a ceteris paribus profile. The description summarizes the variable values that would change the model's prediction the most. If a ceteris paribus profile for multiple variables is passed, variables must specify a single variable to be described. It works only for a ceteris paribus profile of a single observation. In the current version only categorical values are described. With display_numbers = TRUE the three most important variable values are displayed; with display_numbers = FALSE all important variables are listed, but without further details.

Function describe.ceteris_paribus() generates a natural language description of a ceteris paribus profile. The description summarizes the variable values that would change the model's prediction the most. If a ceteris paribus profile for multiple variables is passed, variables must specify a single variable to be described. It works only for a ceteris paribus profile of a single observation. With display_numbers = TRUE the three most important variable values are displayed; with display_numbers = FALSE all important variables are listed, but without further details.
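The idea of picking the values that change the prediction the most can be illustrated in base R. This is only a minimal sketch under stated assumptions: the profile data frame, the baseline value and the ranking by absolute change are all hypothetical and are not taken from the ingredients source code.

```r
# Hypothetical ceteris paribus profile for one observation and one
# categorical variable: each row is a candidate value together with the
# prediction obtained when only that value is changed.
profile <- data.frame(
  value = c("1st", "2nd", "3rd", "deck crew"),
  prediction = c(0.45, 0.30, 0.12, 0.62),
  stringsAsFactors = FALSE
)

# Assumed prediction for the observation's actual value ("3rd").
baseline <- 0.12

# Rank candidate values by the absolute change in prediction they induce.
profile$change <- profile$prediction - baseline
ranked <- profile[order(-abs(profile$change)), ]

# Keep at most the three most influential values, mirroring the
# "three most important variable values" mentioned above.
top_values <- head(ranked, 3)
print(top_values)
```

With these made-up numbers, "deck crew" moves the prediction the most, followed by "1st" and "2nd".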

Function describe.feature_importance_explainer() generates a natural language description of a feature importance explanation. It prints the number of important variables, that is, those whose dropout loss differs significantly from that of the full model, as determined by nonsignificance_treshold. The description also prints the three most important variables for the model's prediction. The current design of the DALEX explainer does not allow for displaying variable values.
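How a threshold could separate important from unimportant variables can be sketched in base R. The rule used here (relative increase in loss over the full model compared with nonsignificance_treshold) is an assumption made for illustration and may differ from what the package does; the loss values are invented.

```r
# Hypothetical loss of the full model and dropout losses after permuting
# each variable; the numbers are illustrative only.
full_model_loss <- 100
dropout_loss <- c(surface = 180, district = 150, floor = 112, no.rooms = 104)

# Assumed rule: a variable counts as important when its relative increase
# in loss over the full model exceeds the threshold.
nonsignificance_treshold <- 0.15
relative_increase <- (dropout_loss - full_model_loss) / full_model_loss
important <- names(relative_increase)[relative_increase > nonsignificance_treshold]

# Report the number of important variables and the top three overall.
top_three <- head(names(sort(relative_increase, decreasing = TRUE)), 3)
cat(length(important), "important variables; top three:",
    paste(top_three, collapse = ", "), "\n")
```

Under this assumed rule, raising the threshold shrinks the set of variables reported as important, while the ranking of the top three stays the same.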

References

Explanatory Model Analysis. Explore, Explain, and Examine Predictive Models. https://ema.drwhy.ai/

Examples

# Example 1: describe a partial dependence profile
library("DALEX")
library("ingredients")
library("ranger")

# Fit a probability forest on the imputed Titanic data
model_titanic_rf <- ranger(survived ~ ., data = titanic_imputed, probability = TRUE)

# Wrap the model in a DALEX explainer; column 8 is the target `survived`
explain_titanic_rf <- explain(model_titanic_rf,
                              data = titanic_imputed[, -8],
                              y = titanic_imputed[, 8],
                              label = "ranger forest",
                              verbose = FALSE)

# Aggregate ceteris paribus profiles of 10 sampled passengers
selected_passengers <- select_sample(titanic_imputed, n = 10)
cp_rf <- ceteris_paribus(explain_titanic_rf, selected_passengers)
pdp <- aggregate_profiles(cp_rf, type = "partial", variable_type = "categorical")
describe(pdp, variables = "gender")


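# Example 2: describe a ceteris paribus profile for a single passenger
library("DALEX")
library("ingredients")
library("ranger")

# Fit a probability forest on the imputed Titanic data
model_titanic_rf <- ranger(survived ~ ., data = titanic_imputed, probability = TRUE)

# Wrap the model in a DALEX explainer; column 8 is the target `survived`
explain_titanic_rf <- explain(model_titanic_rf,
                              data = titanic_imputed[, -8],
                              y = titanic_imputed[, 8],
                              label = "ranger forest",
                              verbose = FALSE)

# describe() works for one observation only, so sample a single passenger
selected_passenger <- select_sample(titanic_imputed, n = 1, seed = 123)
cp_rf <- ceteris_paribus(explain_titanic_rf, selected_passenger)

plot(cp_rf, variable_type = "categorical")
describe(cp_rf, variables = "class", label = "the predicted probability")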

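# Example 3: describe a feature importance explanation
library("DALEX")
library("ingredients")

# Linear model for apartment prices; column 1 is the target `m2.price`
lm_model <- lm(m2.price ~ ., data = apartments)
explainer_lm <- explain(lm_model, data = apartments[, -1], y = apartments[, 1])

# Permutation-based importance measured with root mean square loss
fi_lm <- feature_importance(explainer_lm, loss_function = DALEX::loss_root_mean_square)

plot(fi_lm)
describe(fi_lm)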

ingredients documentation built on April 10, 2021, 5:06 p.m.