shapviz: Initialize "shapviz" Object

shapviz  R Documentation

Description

This function creates an object of class "shapviz" from a matrix of SHAP values, or from a fitted model of type

  • XGBoost,

  • LightGBM, or

  • H2O (tree-based regression or binary classification model).

Furthermore, shapviz() can digest the results of

  • fastshap::explain(),

  • shapr::explain(),

  • treeshap::treeshap(),

  • DALEX::predict_parts(),

  • kernelshap::kernelshap(),

  • kernelshap::permshap(), and

  • kernelshap::additive_shap().

Check the vignettes for examples.

Usage

shapviz(object, ...)

## Default S3 method:
shapviz(object, ...)

## S3 method for class 'matrix'
shapviz(object, X, baseline = 0, collapse = NULL, S_inter = NULL, ...)

## S3 method for class 'xgb.Booster'
shapviz(
  object,
  X_pred,
  X = X_pred,
  which_class = NULL,
  collapse = NULL,
  interactions = FALSE,
  ...
)

## S3 method for class 'lgb.Booster'
shapviz(object, X_pred, X = X_pred, which_class = NULL, collapse = NULL, ...)

## S3 method for class 'explain'
shapviz(object, X = NULL, baseline = NULL, collapse = NULL, ...)

## S3 method for class 'treeshap'
shapviz(
  object,
  X = object[["observations"]],
  baseline = 0,
  collapse = NULL,
  ...
)

## S3 method for class 'predict_parts'
shapviz(object, ...)

## S3 method for class 'shapr'
shapviz(object, X = object[["x_test"]], collapse = NULL, ...)

## S3 method for class 'kernelshap'
shapviz(object, X = object[["X"]], which_class = NULL, collapse = NULL, ...)

## S3 method for class 'H2ORegressionModel'
shapviz(object, X_pred, X = as.data.frame(X_pred), collapse = NULL, ...)

## S3 method for class 'H2OBinomialModel'
shapviz(object, X_pred, X = as.data.frame(X_pred), collapse = NULL, ...)

## S3 method for class 'H2OModel'
shapviz(object, X_pred, X = as.data.frame(X_pred), collapse = NULL, ...)

Arguments

object

For XGBoost, LightGBM, and H2O, this is the fitted model used to calculate SHAP values from X_pred. In the other cases, it is the object containing the SHAP values.

...

Parameters passed to other methods (currently only used by the predict() functions of XGBoost, LightGBM, and H2O).

X

Matrix or data.frame of feature values used for visualization. Must contain at least the same column names as the SHAP matrix represented by object/X_pred (after optionally collapsing some of the SHAP columns).

baseline

Optional baseline value, representing the average response at the scale of the SHAP values. It will be used for plot methods that explain single predictions.

collapse

A named list of character vectors. Each vector specifies the feature names whose SHAP values need to be summed up. The names determine the resulting collapsed column/dimension names.

S_inter

Optional 3D array of SHAP interaction values. If object has shape n x p, then S_inter needs to be of shape n x p x p. Summation over the second (or third) dimension should yield the usual SHAP values. Furthermore, dimensions 2 and 3 are expected to be symmetric. Default is NULL.

X_pred

Data set as expected by the predict() function of XGBoost, LightGBM, or H2O. This is a matrix or xgb.DMatrix for XGBoost, a matrix for LightGBM, and a data.frame or an H2OFrame for H2O. Only used for XGBoost, LightGBM, or H2O objects.

which_class

In case of a multiclass or multioutput setting, which class/output (>= 1) to explain. Currently relevant for XGBoost, LightGBM, kernelshap, and permshap.

interactions

Should SHAP interactions be calculated (default is FALSE)? Only available for XGBoost.
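
To make the shape requirements on S_inter concrete, the following base R sketch builds a small interaction array for n = 2 observations and p = 2 features and checks the two properties stated in the S_inter entry. All numbers are made up for illustration.

```r
# Illustrative values only: 2 observations, 2 features ("x" and "y")
feature_names <- c("x", "y")
S_inter <- array(
  0,
  dim = c(2, 2, 2),
  dimnames = list(NULL, feature_names, feature_names)
)
S_inter[1, , ] <- matrix(c(0.5, 0.5, 0.5, -1.5), nrow = 2)
S_inter[2, , ] <- matrix(c(-1.25, 0.25, 0.25, 0.75), nrow = 2)

# Summing over the third dimension should give the usual n x p SHAP matrix
S <- apply(S_inter, c(1, 2), sum)

# Dimensions 2 and 3 should be symmetric: swapping them changes nothing
is_symmetric <- isTRUE(all.equal(S_inter, aperm(S_inter, c(1, 3, 2))))
```

An array of this form, together with the matching matrix S, is the kind of input the S_inter argument of the matrix method expects.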

Details

Together with the main input, a data set X of feature values is required, used only for visualization. It can therefore contain character or factor variables, even if the SHAP values were calculated from a purely numerical feature matrix. In addition, to improve visualization, it can sometimes be useful to truncate gross outliers, logarithmize certain columns, or replace missing values with an explicit value.
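
The display-only adjustments mentioned above could look as follows in base R. This is a sketch with made-up data: the column names (income, area, region), the cap of 100, and the label "missing" are hypothetical choices. The column names are left unchanged so that they still match the SHAP matrix.

```r
# Hypothetical feature data, used only for plotting
X <- data.frame(
  income = c(20, 35, 50, 42, 10000),           # contains one gross outlier
  area   = c(10, 100, 1000, 10000, 100000),    # spans several orders of magnitude
  region = c("north", NA, "south", "north", NA),
  stringsAsFactors = FALSE
)

X_vis <- X

# Truncate gross outliers at a chosen upper bound
X_vis$income <- pmin(X_vis$income, 100)

# Show a heavily skewed column on a log scale
X_vis$area <- log10(X_vis$area)

# Replace missing values with an explicit value
X_vis$region[is.na(X_vis$region)] <- "missing"
```

A data set prepared like X_vis would then be passed as X, while the SHAP values are still computed from the unmodified model input.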

SHAP values of dummy variables can be combined using the convenient collapse argument. For multi-output models from XGBoost, LightGBM, "kernelshap", or "permshap", shapviz() returns a "mshapviz" object, containing one "shapviz" object per output.
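
Conceptually, collapsing replaces the listed SHAP columns by their row-wise sum. The base R sketch below illustrates that arithmetic on a made-up SHAP matrix. It is not the package's implementation, and the helper name collapse_columns is hypothetical.

```r
# Made-up SHAP values for three dummy columns and one numeric feature
S <- matrix(
  c( 0.2, -0.1,  0.0,
     0.3,  0.0, -0.2,
    -0.5,  0.4,  0.1,
     1.0, -1.0,  0.5),
  nrow = 3,
  dimnames = list(
    NULL,
    c("Speciessetosa", "Speciesversicolor", "Speciesvirginica", "Sepal.Width")
  )
)

collapse <- list(
  Species = c("Speciessetosa", "Speciesversicolor", "Speciesvirginica")
)

# Hypothetical helper: sum the listed columns and store the result under the list name
collapse_columns <- function(S, collapse) {
  for (new_name in names(collapse)) {
    cols <- collapse[[new_name]]
    summed <- rowSums(S[, cols, drop = FALSE])
    S <- cbind(S[, !(colnames(S) %in% cols), drop = FALSE], summed)
    colnames(S)[ncol(S)] <- new_name
  }
  S
}

S_collapsed <- collapse_columns(S, collapse)
```

Because only sums are taken, the row totals of the SHAP matrix are unchanged by collapsing.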

Value

An object of class "shapviz" with the following elements:

  • S: Numeric matrix of SHAP values.

  • X: data.frame containing the feature values corresponding to S.

  • baseline: Baseline value, representing the average prediction at the scale of the SHAP values.

  • S_inter: Numeric array of SHAP interaction values (or NULL).
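
The relationship between S and baseline can be illustrated with the toy values from the Examples section. By the additivity property of SHAP values, each row sum of S plus the baseline is expected to reproduce the corresponding prediction on the scale of the SHAP values. A minimal base R sketch:

```r
# Toy SHAP matrix and baseline, as in the first example of this page
S <- matrix(c(1, -1, -1, 1), ncol = 2, dimnames = list(NULL, c("x", "y")))
baseline <- 4

# Reconstructed prediction per observation: baseline plus the sum of its SHAP values
reconstructed <- baseline + rowSums(S)
```

Here the contributions of x and y cancel in both rows, so both reconstructed predictions equal the baseline.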

Methods (by class)

  • shapviz(default): Default method to initialize a "shapviz" object.

  • shapviz(matrix): Creates a "shapviz" object from a matrix of SHAP values.

  • shapviz(xgb.Booster): Creates a "shapviz" object from an XGBoost model.

  • shapviz(lgb.Booster): Creates a "shapviz" object from a LightGBM model.

  • shapviz(explain): Creates a "shapviz" object from fastshap::explain().

  • shapviz(treeshap): Creates a "shapviz" object from treeshap::treeshap().

  • shapviz(predict_parts): Creates a "shapviz" object from DALEX::predict_parts().

  • shapviz(shapr): Creates a "shapviz" object from shapr::explain().

  • shapviz(kernelshap): Creates a "shapviz" object from an object of class 'kernelshap'. This includes results of kernelshap(), permshap(), and additive_shap().

  • shapviz(H2ORegressionModel): Creates a "shapviz" object from a (tree-based) H2O regression model.

  • shapviz(H2OBinomialModel): Creates a "shapviz" object from a (tree-based) H2O binary classification model.

  • shapviz(H2OModel): Creates a "shapviz" object from a (tree-based) H2O model (base class).

See Also

sv_importance(), sv_dependence(), sv_dependence2D(), sv_interaction(), sv_waterfall(), sv_force(), collapse_shap()

Examples

S <- matrix(c(1, -1, -1, 1), ncol = 2, dimnames = list(NULL, c("x", "y")))
X <- data.frame(x = c("a", "b"), y = c(100, 10))
shapviz(S, X, baseline = 4)

# XGBoost models
X_pred <- data.matrix(iris[, -1])
dtrain <- xgboost::xgb.DMatrix(X_pred, label = iris[, 1], nthread = 1)
fit <- xgboost::xgb.train(data = dtrain, nrounds = 10, nthread = 1)

# Will use numeric matrix "X_pred" as feature matrix
x <- shapviz(fit, X_pred = X_pred)
x
sv_dependence(x, "Species")

# Will use original values as feature matrix
x <- shapviz(fit, X_pred = X_pred, X = iris)
sv_dependence(x, "Species")

# "X_pred" can also be passed as xgb.DMatrix, but only if X is passed as well!
x <- shapviz(fit, X_pred = dtrain, X = iris)

# Multiclass setting
params <- list(objective = "multi:softprob", num_class = 3)
X_pred <- data.matrix(iris[, -5])
dtrain <- xgboost::xgb.DMatrix(
  X_pred, label = as.integer(iris[, 5]) - 1, nthread = 1
)
fit <- xgboost::xgb.train(params = params, data = dtrain, nrounds = 10, nthread = 1)

# Select specific class
x <- shapviz(fit, X_pred = X_pred, which_class = 3)
x

# Or combine all classes to "mshapviz" object
x <- shapviz(fit, X_pred = X_pred)
x

# What if we had one-hot-encoded values and wanted to explain the original column?
X_pred <- stats::model.matrix(~ . -1, iris[, -1])
dtrain <- xgboost::xgb.DMatrix(X_pred, label = as.integer(iris[, 1]), nthread = 1)
fit <- xgboost::xgb.train(data = dtrain, nrounds = 10, nthread = 1)
x <- shapviz(
  fit,
  X_pred = X_pred,
  X = iris,
  collapse = list(Species = c("Speciessetosa", "Speciesversicolor", "Speciesvirginica"))
)
summary(x)

# Similarly with LightGBM
if (requireNamespace("lightgbm", quietly = TRUE)) {
  fit <- lightgbm::lgb.train(
    params = list(objective = "regression", num_thread = 1),
    data = lightgbm::lgb.Dataset(X_pred, label = iris[, 1]),
    nrounds = 10,
    verbose = -2
  )

  x <- shapviz(fit, X_pred = X_pred)
  x

  # Multiclass
  params <- list(objective = "multiclass", num_class = 3, num_thread = 1)
  X_pred <- data.matrix(iris[, -5])
  dtrain <- lightgbm::lgb.Dataset(X_pred, label = as.integer(iris[, 5]) - 1)
  fit <- lightgbm::lgb.train(params = params, data = dtrain, nrounds = 10)

  # Select specific class
  x <- shapviz(fit, X_pred = X_pred, which_class = 3)
  x

  # Or combine all classes to a "mshapviz" object
  mx <- shapviz(fit, X_pred = X_pred)
  mx
  all.equal(mx[[3]], x)
}

shapviz documentation built on Sept. 14, 2024, 5:07 p.m.