| shap.plot.dependence | R Documentation |
By default, this function makes a simple dependence plot with feature values
on the x-axis and SHAP values on the y-axis, optionally colored by another
feature. The SHAP values of a different variable can be shown on the
y-axis, and the points can be colored by the feature value of a designated
variable. Points are not colored if color_feature is not supplied. If
data_int (the SHAP interaction values dataset) is supplied, the y-axis shows
the interaction effect between y and x. A dependence plot is easy
to make if you have the SHAP values dataset from predict.xgb.Booster
or predict.lgb.Booster.
It is not necessary to start with the long-format data, but since that format
is used for the summary plot, it is reused here.
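As a rough illustration of what the default plot shows, the same picture can be approximated directly from the long-format data with ggplot2. This is only a sketch, not the function's implementation. It assumes that shap_long_iris (the example dataset used further down) has the columns `variable` (feature name), `value` (SHAP value), and `rfvalue` (raw feature value).

```r
library(SHAPforxgboost)
library(ggplot2)

# Assumed columns in the long-format data:
#   variable = feature name, value = SHAP value, rfvalue = raw feature value
d <- as.data.frame(shap_long_iris)
d <- d[d$variable == "Petal.Length", ]

# Feature value on the x-axis, SHAP value of the same feature on the y-axis
p_manual <- ggplot(d, aes(x = rfvalue, y = value)) +
  geom_point() +
  labs(x = "Petal.Length", y = "SHAP value for Petal.Length")
```

shap.plot.dependence adds the smoothing line, coloring, and default point sizes on top of this basic layout.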
shap.plot.dependence(
data_long,
x,
y = NULL,
color_feature = NULL,
data_int = NULL,
dilute = FALSE,
smooth = TRUE,
size0 = NULL,
add_hist = FALSE,
add_stat_cor = FALSE,
alpha = NULL,
jitter_height = 0,
jitter_width = 0,
...
)
data_long |
the long format SHAP values from |
x |
which feature to show on the x-axis; its feature value is plotted |
y |
which SHAP values to show on the y-axis; the SHAP value of that feature is plotted. y defaults to x: if y is not provided, the SHAP values of x are plotted on the y-axis |
color_feature |
which feature to use for coloring; points are colored by its feature value. If "auto", selects the feature "c" that minimizes the variance of the SHAP value given x and c, which can be viewed as a heuristic for the strongest interaction. |
data_int |
the 3-dimensional SHAP interaction values array. if |
dilute |
a number or logical, defaults to FALSE, will plot
|
smooth |
whether to add a loess smooth line, defaults to TRUE. |
size0 |
point size, defaults to 1 if nobs<1000, 0.4 if nobs>1000 |
add_hist |
whether to add histogram using |
add_stat_cor |
add correlation and p-value from |
alpha |
point transparency, defaults to 1 if nobs<1000, else 0.6 |
jitter_height |
amount of vertical jitter (see height in |
jitter_width |
amount of horizontal jitter (see width in |
... |
additional parameters passed to |
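The "auto" heuristic described for color_feature can be illustrated with a toy sketch in base R. This is not the package's implementation: the helper `within_var`, the quantile binning, and the simulated data are all hypothetical and only show the idea of choosing the candidate "c" that leaves the least variance in the SHAP value once x and c are known.

```r
# Hypothetical helper: weighted average of the within-cell variance of `shap`
# after binning x and the candidate feature c into quantile bins.
within_var <- function(shap, x, c, n_bins = 4) {
  bin <- function(v) {
    breaks <- unique(quantile(v, seq(0, 1, length.out = n_bins + 1)))
    cut(v, breaks = breaks, include.lowest = TRUE)
  }
  cells <- interaction(bin(x), bin(c), drop = TRUE)
  v <- tapply(shap, cells, var)     # variance inside each (x, c) cell
  n <- tapply(shap, cells, length)  # cell sizes, used as weights
  ok <- !is.na(v)                   # skip cells with a single observation
  sum(v[ok] * n[ok]) / sum(n[ok])
}

# Simulated data (an assumption for illustration only):
# the SHAP value of x depends on c1 but not on c2.
set.seed(1)
n  <- 500
x  <- runif(n)
c1 <- runif(n)
c2 <- runif(n)
shap_x <- x * (c1 > 0.5) + rnorm(n, sd = 0.01)

scores <- c(c1 = within_var(shap_x, x, c1),
            c2 = within_var(shap_x, x, c2))
best <- names(which.min(scores))
```

In this toy setup the interacting feature c1 receives the lower score, so it would be the one picked for coloring.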
by default a ggplot2 object, to which you can add more geom
layers.
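Since the default return value is a ggplot2 object, extra layers can be added with `+`. The following is a minimal sketch that assumes add_hist is left at FALSE; the reference line and title are arbitrary illustrative choices.

```r
library(SHAPforxgboost)
library(ggplot2)

# shap_long_iris is the example long-format dataset used in the examples
p <- shap.plot.dependence(data_long = shap_long_iris, x = "Petal.Length")

# Add a horizontal reference line at SHAP = 0 and a title
p2 <- p +
  geom_hline(yintercept = 0, linetype = "dashed") +
  labs(title = "SHAP dependence: Petal.Length")
```

Printing p2 draws the dependence plot with the added layers.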
# **SHAP dependence plot**
# 1. simple dependence plot with the SHAP values of x on the y-axis
shap.plot.dependence(data_long = shap_long_iris, x = "Petal.Length",
                     add_hist = TRUE, add_stat_cor = TRUE)
# 2. show the SHAP values of a different feature on the y-axis
shap.plot.dependence(data_long = shap_long_iris, x = "Petal.Length",
                     y = "Petal.Width")
# 3. color by another feature's values
shap.plot.dependence(data_long = shap_long_iris, x = "Petal.Length",
                     color_feature = "Petal.Width")
# 4. set x, y, and color explicitly (here y and color use the same feature)
shap.plot.dependence(data_long = shap_long_iris, x = "Petal.Length",
                     y = "Petal.Width", color_feature = "Petal.Width")
# Optionally add a histogram, remove the smooth line, or plot fewer points (faster)
shap.plot.dependence(data_long = shap_long_iris, x = "Petal.Length",
                     y = "Petal.Width", color_feature = "Petal.Width",
                     add_hist = TRUE, smooth = FALSE, dilute = 3)
# to make a list of plots, one per feature
plot_list <- lapply(names(iris)[2:3], shap.plot.dependence, data_long = shap_long_iris)
# **SHAP interaction effect plot**
# To plot interaction effects, first obtain the SHAP interaction values (`shap_int`):
mod1 = xgboost::xgboost(
  data = as.matrix(iris[,-5]), label = iris$Species,
  gamma = 0, eta = 1, lambda = 0, nrounds = 1, verbose = FALSE, nthread = 1)
# Use either:
data_int <- shap.prep.interaction(xgb_mod = mod1,
                                  X_train = as.matrix(iris[,-5]))
# or:
shap_int <- predict(mod1, as.matrix(iris[,-5]),
                    predinteraction = TRUE)
# if data_int is supplied, the y-axis shows the interaction values of y (vs. x)
shap.plot.dependence(data_long = shap_long_iris,
                     data_int = shap_int_iris,
                     x = "Petal.Length",
                     y = "Petal.Width",
                     color_feature = "Petal.Width")