```{r}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  warning = FALSE,
  message = FALSE,
  fig.width = 5,
  fig.height = 4
)
```
This vignette shows the basic workflow of using SHAPforxgboost to interpret models trained with XGBoost, a highly efficient gradient boosting implementation [@chen2016].
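If SHAPforxgboost is not yet installed, it can be obtained from CRAN with a standard installation call (shown here with `eval = FALSE` so the vignette itself does not run it):

```{r, eval = FALSE}
install.packages("SHAPforxgboost")
```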
library("ggplot2") library("SHAPforxgboost") library("xgboost") set.seed(9375)
Let's train a small model to predict the first column of the `iris` data set, namely `Sepal.Length`.
```{r}
head(iris)

X <- data.matrix(iris[, -1])
dtrain <- xgb.DMatrix(X, label = iris[[1]])

fit <- xgb.train(
  params = list(
    objective = "reg:squarederror",
    learning_rate = 0.1
  ),
  data = dtrain,
  nrounds = 50
)
```
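As a quick sanity check (not part of the SHAP workflow, just base xgboost functionality), we can look at the model's in-sample fit before interpreting it:

```{r}
# In-sample root mean squared error of the fitted model
pred <- predict(fit, X)
sqrt(mean((iris[[1]] - pred)^2))
```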
Now we can prepare the SHAP values and analyze the results, all in just a few lines of code!
```{r}
# Crunch SHAP values
shap <- shap.prep(fit, X_train = X)

# SHAP importance plot
shap.plot.summary(shap)

# Alternatively, mean absolute SHAP values
shap.plot.summary(shap, kind = "bar")

# Dependence plots in decreasing order of importance
# (colored by strongest interacting variable)
for (x in shap.importance(shap, names_only = TRUE)) {
  p <- shap.plot.dependence(
    shap,
    x = x,
    color_feature = "auto",
    smooth = FALSE,
    jitter_width = 0.01,
    alpha = 0.4
  ) +
    ggtitle(x)
  print(p)
}
```
Note: `print()` is needed only because the ggplot objects are created inside a `for` loop in an R Markdown document; at the console, they would be printed automatically.
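If you prefer to inspect the raw SHAP matrix rather than the plots, xgboost itself can return per-feature contributions through its `predict()` method; a minimal sketch (each row holds one contribution per feature plus a `BIAS` baseline term, and the row sum equals the model's prediction):

```{r}
# Raw SHAP values straight from xgboost; this is essentially the
# input that shap.prep() reshapes for plotting.
shap_raw <- predict(fit, X, predcontrib = TRUE)
head(shap_raw)
```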
This is just a teaser: SHAPforxgboost can do much more! Check out the README for further information.