| vi_shap | R Documentation | 
Compute SHAP-based VI scores for the predictors in a model. See details below.
vi_shap(object, ...)
## Default S3 method:
vi_shap(object, feature_names = NULL, train = NULL, ...)
| object | A fitted model object (e.g., a randomForest object). | 
| ... | Additional arguments to be passed on to  | 
| feature_names | Character string giving the names of the predictor
variables (i.e., features) of interest. If  | 
| train | A matrix-like R object (e.g., a data frame or matrix)
containing the training data. If  | 
This approach to computing VI scores is based on the mean absolute value of the SHAP values for each feature; see, for example, https://github.com/shap/shap and the references therein.
Strumbelj, E., and Kononenko, I. Explaining prediction models and individual predictions with feature contributions. Knowledge and information systems 41.3 (2014): 647-665.
A tidy data frame (i.e., a tibble object) with two columns:
Variable - the corresponding feature name;
Importance - the associated importance, computed as the mean absolute
Shapley value.
## Not run: 
library(ggplot2)  # for theme_light() function
library(xgboost)
# Simulate training data
trn <- gen_friedman(500, sigma = 1, seed = 101)  # ?vip::gen_friedman
# Feature matrix
X <- data.matrix(subset(trn, select = -y))  # matrix of feature values
# Fit an XGBoost model; hyperparameters were tuned using 5-fold CV
set.seed(859)  # for reproducibility
bst <- xgboost(X, label = trn$y, nrounds = 338, max_depth = 3, eta = 0.1,
               verbose = 0)
# Construct VIP using "exact" SHAP values from XGBoost's internal Tree SHAP
# functionality
vip(bst, method = "shap", train = X, exact = TRUE, include_type = TRUE,
    geom = "point", horizontal = FALSE,
    aesthetics = list(color = "forestgreen", shape = 17, size = 5)) +
  theme_light()
# Use Monte-Carlo approach, which works for any model; requires prediction
# wrapper
pfun_prob <- function(object, newdata) {  # prediction wrapper
  # For Shapley explanations, this should ALWAYS return a numeric vector
  predict(object, newdata = newdata, type = "prob")[, "yes"]
}
# Compute Shapley-based VI scores
set.seed(853)  # for reproducibility
vi_shap(rfo, train = subset(t1, select = -survived), pred_wrapper = pfun_prob,
        nsim = 30)
## # A tibble: 5 × 2
## Variable Importance
##   <chr>         <dbl>
## 1 pclass       0.104
## 2 age          0.0649
## 3 sex          0.272
## 4 sibsp        0.0260
## 5 parch        0.0291
## End(Not run)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.