```r
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  warning = FALSE,
  message = FALSE,
  fig.width = 5,
  fig.height = 4
)
```
This vignette shows the basic workflow of using `SHAPforxgboost` to interpret models trained with XGBoost, a highly efficient gradient boosting implementation [@chen2016].
```r
library("ggplot2")
library("SHAPforxgboost")
library("xgboost")

set.seed(9375)
```
Let's train a small model to predict the first column in the iris data set, namely `Sepal.Length`.
```r
head(iris)

X <- data.matrix(iris[, -1])
dtrain <- xgb.DMatrix(X, label = iris[[1]])

fit <- xgb.train(
  params = list(
    objective = "reg:squarederror",
    learning_rate = 0.1
  ),
  data = dtrain,
  nrounds = 50
)
```
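Before interpreting the model, it is worth confirming that it actually fits the data. A quick sanity check (not part of the original workflow, base R only) is to compute the training root mean squared error:

```r
# Predictions on the training data
pred <- predict(fit, dtrain)

# Training RMSE; a small value means the boosted trees
# reproduce Sepal.Length closely on the training set.
sqrt(mean((iris[[1]] - pred)^2))
```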
Now we can prepare the SHAP values and analyze the results, all in just a few lines of code!
```r
# Crunch SHAP values
shap <- shap.prep(fit, X_train = X)

# SHAP importance plot
shap.plot.summary(shap)

# Alternatively, mean absolute SHAP values
shap.plot.summary(shap, kind = "bar")

# Dependence plots in decreasing order of importance
# (colored by strongest interacting variable)
for (x in shap.importance(shap, names_only = TRUE)) {
  p <- shap.plot.dependence(
    shap,
    x = x,
    color_feature = "auto",
    smooth = FALSE,
    jitter_width = 0.01,
    alpha = 0.4
  ) +
    ggtitle(x)
  print(p)
}
```
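Besides summary and dependence plots, the package can also stack per-observation SHAP contributions into a force plot. The following sketch assumes the `shap.values`, `shap.prep.stack.data`, and `shap.plot.force_plot` interface described in the package README:

```r
# SHAP value matrix (one column per feature) plus mean absolute values
shap_values <- shap.values(xgb_model = fit, X_train = X)

# Stack the per-row contributions, keeping the two most important
# features individually and clustering the observations into groups
plot_data <- shap.prep.stack.data(
  shap_contrib = shap_values$shap_score,
  top_n = 2,
  n_groups = 4
)
shap.plot.force_plot(plot_data)
```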
Note: `print` is required only because the `ggplot` objects are created inside a `for` loop in rmarkdown; at the top level they would print automatically.
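The SHAP values plotted above satisfy the additivity property: for each row, the feature contributions plus a bias term sum to the model prediction. A minimal check, assuming xgboost's `predcontrib` prediction mode:

```r
# Per-row feature contributions; the last column is the bias term
contrib <- predict(fit, dtrain, predcontrib = TRUE)
pred <- predict(fit, dtrain)

# Largest deviation between summed contributions and predictions;
# should be essentially zero, up to floating point error
max(abs(rowSums(contrib) - pred))
```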
This is just a teaser: `SHAPforxgboost` can do much more! Check out the README for further information.