buildExplainer: Step 1: Build an xgboostExplainer


View source: R/buildExplainer.R

Description

This function returns an xgboostExplainer: a data table that stores the feature impact breakdown for each leaf of each tree in an xgboost model. It is required as input to the explainPredictions and showWaterfall functions.

Usage

buildExplainer(xgb.model, trainingData, type = "binary", base_score = 0.5,
  trees_idx = NULL)

Arguments

xgb.model

A trained xgboost model

trainingData

A DMatrix of data used to train the model

type

The objective function of the model: either "binary" (for binary:logistic) or "regression" (for reg:linear)
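For a model trained with the reg:linear objective, the call might look like the sketch below. The model and DMatrix names are hypothetical; base_score should match whatever value was used when the model was trained (xgboost's default is 0.5):

```r
# Sketch: building an explainer for a regression model.
# `reg.model` and `xgb.reg.train.data` are assumed to exist and to have been
# created with objective = "reg:linear" and the default base_score of 0.5.
reg.explainer = buildExplainer(reg.model, xgb.reg.train.data,
                               type = "regression", base_score = 0.5)
```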

base_score

Default 0.5. The base_score parameter that was used when training the xgboost model.

trees_idx

Default NULL. An integer vector of tree indices that should be parsed. If set to NULL, all trees of the model are parsed.

Value

The XGBoost Explainer for the model. This is a data table where each row is a leaf of a tree in the xgboost model and each column is the impact of each feature on the prediction at the leaf.

The leaf and tree columns uniquely identify the node.

The sum of the other columns equals the prediction at the leaf (log-odds if binary response).

The 'intercept' column is identical for all rows and is analogous to the intercept term in a linear / logistic regression.
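As a quick sanity check (a sketch, assuming `explainer` has been built as in the Examples below), the row sums of the impact columns should reproduce each leaf's prediction on the log-odds scale for a binary model:

```r
# Sketch: the sum of all impact columns (including 'intercept') in each row
# of the explainer equals the prediction at that leaf.
# Excludes the 'leaf' and 'tree' identifier columns before summing.
library(data.table)

leaf_sums = rowSums(explainer[, !c("leaf", "tree"), with = FALSE])
head(leaf_sums)  # log-odds predictions at the first few leaves
```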

Examples

library(xgboost)
library(xgboostExplainer)

set.seed(123)

data(agaricus.train, package='xgboost')

X = as.matrix(agaricus.train$data)
y = agaricus.train$label

train_idx = 1:5000

train.data = X[train_idx,]
test.data = X[-train_idx,]

xgb.train.data <- xgb.DMatrix(train.data, label = y[train_idx])
xgb.test.data <- xgb.DMatrix(test.data)

param <- list(objective = "binary:logistic")
xgb.model <- xgboost(param = param, data = xgb.train.data, nrounds = 3)

col_names = colnames(X)

pred.train = predict(xgb.model, X)
nodes.train = predict(xgb.model, X, predleaf = TRUE)
trees = xgb.model.dt.tree(col_names, model = xgb.model)

#### The XGBoost Explainer
explainer = buildExplainer(xgb.model, xgb.train.data, type = "binary", base_score = 0.5, trees_idx = NULL)
pred.breakdown = explainPredictions(xgb.model, explainer, xgb.test.data)

showWaterfall(xgb.model, explainer, xgb.test.data, test.data, 2, type = "binary")
showWaterfall(xgb.model, explainer, xgb.test.data, test.data, 8, type = "binary")

AppliedDataSciencePartners/xgboostExplainer documentation built on June 19, 2018, 12:24 p.m.