| predict.xgb.Booster | R Documentation |
Predict values on data based on XGBoost model.
## S3 method for class 'xgb.Booster'
predict(
object,
newdata,
missing = NA,
outputmargin = FALSE,
predleaf = FALSE,
predcontrib = FALSE,
approxcontrib = FALSE,
predinteraction = FALSE,
training = FALSE,
iterationrange = NULL,
strict_shape = FALSE,
avoid_transpose = FALSE,
validate_features = FALSE,
base_margin = NULL,
...
)
object |
Object of class |
newdata |
Takes For single-row predictions on sparse data, it is recommended to use CSR format. If passing a sparse vector, it will take it as a row vector. Note that, for repeated predictions on the same data, one might want to create a DMatrix to pass here instead of passing R types like matrices or data frames, as predictions will be faster on DMatrix. If
|
missing |
Float value that represents missing values in data (e.g., 0 or some other extreme value). This parameter is not used when |
outputmargin |
Whether the prediction should be returned in the form of
original untransformed sum of predictions from boosting iterations' results.
E.g., setting |
predleaf |
Whether to predict per-tree leaf indices. |
predcontrib |
Whether to return feature contributions to individual predictions (see Details). |
approxcontrib |
Whether to use a fast approximation for feature contributions (see Details). |
predinteraction |
Whether to return contributions of feature interactions to individual predictions (see Details). |
training |
Whether the prediction result is used for training. For the dart booster, predicting in training mode performs dropout. |
iterationrange |
Sequence of rounds/iterations from the model to use for prediction, specified by passing
a two-dimensional vector with the start and end numbers in the sequence (same format as R's For example, passing If passing If passing "all", will use all of the rounds regardless of whether the model had early stopping or not. Not applicable to |
strict_shape |
Whether to always return an array with the same dimensions for the given prediction mode regardless of the model type - meaning that, for example, both a multi-class and a binary classification model would generate output arrays with the same number of dimensions, with the 'class' dimension having size equal to '1' for the binary model. If passing See documentation for the return type for the exact shape of the output arrays for each prediction mode. |
avoid_transpose |
Whether to output the resulting predictions in the same memory layout in which they are generated by the core XGBoost library, without transposing them to match the expected output shape. Internally, XGBoost uses row-major order for the predictions it generates, while R arrays use column-major order, hence the result needs to be transposed in order to have the expected shape when represented as an R array or matrix, which might be a slow operation. If passing |
validate_features |
When If the column names differ and If the booster has feature types and If passing Note that this check might add some sizable latency to the predictions, so it's recommended to disable it for performance-sensitive applications. |
base_margin |
Base margin used for boosting from existing model (raw score that gets added to all observations independently of the trees in the model). If supplied, should be either a vector with length equal to the number of rows in Note that, if |
... |
Not used. |
Note that iterationrange would currently do nothing for predictions from "gblinear",
since "gblinear" doesn't keep its boosting history.
One possible practical application of the predleaf option is to use the model
as a generator of new features which capture non-linearity and interactions,
e.g., as implemented in xgb.create.features().
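As a rough illustration of the idea, the sketch below turns a leaf-index matrix into one indicator column per (tree, leaf) pair using base R only. The matrix `pred_leaf` here is a hand-written toy stand-in for the output of `predict(..., predleaf = TRUE)`, and the column naming scheme is an assumption made for this example; it is not necessarily how xgb.create.features() encodes its output.

```r
# Toy stand-in for a predleaf = TRUE result: 4 samples x 2 trees
# (values are leaf indices; filled column by column)
pred_leaf <- matrix(c(3L, 4L, 3L, 5L,
                      1L, 1L, 2L, 2L), ncol = 2)

# Build one 0/1 indicator column for every leaf observed in each tree
leaf_features <- do.call(cbind, lapply(seq_len(ncol(pred_leaf)), function(j) {
  f <- factor(pred_leaf[, j])
  # outer() compares each sample's leaf code against every level of this tree
  ind <- outer(as.integer(f), seq_along(levels(f)), "==") * 1L
  colnames(ind) <- paste0("tree", j, "_leaf", levels(f))  # hypothetical naming
  ind
}))

dim(leaf_features)
colnames(leaf_features)
```

Each sample falls into exactly one leaf per tree, so every row of `leaf_features` sums to the number of trees.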
Setting predcontrib = TRUE calculates the contribution of each feature to
individual predictions. For "gblinear" booster, feature contributions are simply linear terms
(feature_beta * feature_value). For "gbtree" booster, feature contributions are SHAP
values (Lundberg 2017) that sum to the difference between the expected output
of the model and the current prediction (where the hessian weights are used to compute the expectations).
Setting approxcontrib = TRUE approximates these values following the idea explained
in http://blog.datadive.net/interpreting-random-forests/.
With predinteraction = TRUE, SHAP values of contributions of interaction of each pair of features
are computed. Note that this operation might be rather expensive in terms of compute and memory.
Since it quadratically depends on the number of features, it is recommended to perform selection
of the most important features first. See below about the format of the returned results.
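The relationship between the two contribution modes can be checked numerically. The following is a minimal sketch, assuming the `mtcars` dataset as a small regression problem (it is not part of this page's examples) and reusing only calls that appear in the examples of this page. It sums the interaction array over its last dimension and compares the result with the per-feature contributions.

```r
library(xgboost)

# Small regression problem with 10 numeric features (assumption: mtcars as toy data)
x <- as.matrix(mtcars[, -1])
y <- mtcars$mpg

bst <- xgb.train(
  data = xgb.DMatrix(x, label = y, nthread = 1),
  nrounds = 5,
  params = xgb.params(
    max_depth = 3,
    nthread = 1,
    objective = "reg:squarederror"
  )
)

contr <- predict(bst, x, predcontrib = TRUE)      # [nrows, nfeats + 1]
inter <- predict(bst, x, predinteraction = TRUE)  # [nrows, nfeats + 1, nfeats + 1]

# Collapse the last dimension of the interaction array
inter_sum <- apply(inter, c(1, 2), sum)

# Largest absolute discrepancy; expected to be small, up to float precision
max(abs(inter_sum - contr))
```

With only 10 features the interaction array stays small; with hundreds of features the same call can become costly, which is why selecting features first is recommended above.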
The predict() method uses as many threads as defined in the xgb.Booster object (all by default).
If you want to change their number, assign a new number to nthread using xgb.model.parameters<-().
Note that converting a matrix to xgb.DMatrix() uses multiple threads too.
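A minimal sketch of changing the thread count on an existing booster follows. It assumes that the replacement function accepts a named list of parameters, and it uses `mtcars` as toy data; both are assumptions of this example rather than statements from this page.

```r
library(xgboost)

x <- as.matrix(mtcars[, -1])

bst <- xgb.train(
  data = xgb.DMatrix(x, label = mtcars$mpg, nthread = 1),
  nrounds = 5,
  params = xgb.params(
    max_depth = 2,
    nthread = 2,
    objective = "reg:squarederror"
  )
)

pred_before <- predict(bst, x)

# Assumption: the value is a named list of parameters to update
xgb.model.parameters(bst) <- list(nthread = 1)

pred_after <- predict(bst, x)

# The thread count should only affect speed, not the predicted values
all.equal(pred_before, pred_after)
```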
A numeric vector or array, with corresponding dimensions depending on the prediction mode and on
parameter strict_shape as follows:
If passing strict_shape=FALSE:
For regression or binary classification: a vector of length nrows.
For multi-class and multi-target objectives: a matrix of dimensions [nrows, ngroups].
Note that the objective variant multi:softmax defaults to predicting the most likely class (a vector
of length nrows) instead of per-class probabilities.
For predleaf: a matrix with one column per tree.
For multi-class / multi-target, they will be arranged so that columns in the output will have
the leaves from one group followed by the leaves of the next group (e.g. order will be group1:feat1,
group1:feat2, ..., group2:feat1, group2:feat2, ...).
If there is more than one parallel tree (e.g. random forests), the parallel trees will be the last grouping in the resulting order, which will still be 2D.
For predcontrib: when not multi-class / multi-target, a matrix with dimensions
[nrows, nfeats+1]. The last "+ 1" column corresponds to the baseline value.
For multi-class and multi-target objectives, will be an array with dimensions [nrows, ngroups, nfeats+1].
The contribution values are on the scale of untransformed margin (e.g., for binary classification, the values are log-odds deviations from the baseline).
For predinteraction: when not multi-class / multi-target, the output is a 3D array of
dimensions [nrows, nfeats+1, nfeats+1]. The off-diagonal (in the last two dimensions)
elements represent different feature interaction contributions. The array is symmetric w.r.t. the last
two dimensions. The "+ 1" columns correspond to the baselines. Summing this array along the last
dimension should produce practically the same result as predcontrib = TRUE.
For multi-class and multi-target, will be a 4D array with dimensions [nrows, ngroups, nfeats+1, nfeats+1].
If passing strict_shape=TRUE, the result is always a matrix (if 2D) or array (if 3D or higher):
For normal predictions, the dimension is [nrows, ngroups].
For predcontrib=TRUE, the dimension is [nrows, ngroups, nfeats+1].
For predinteraction=TRUE, the dimension is [nrows, ngroups, nfeats+1, nfeats+1].
For predleaf=TRUE, the dimension is [nrows, niter, ngroups, num_parallel_tree].
If passing avoid_transpose=TRUE, then the dimensions in all cases will be in reverse order - for
example, for predinteraction, they will be [nfeats+1, nfeats+1, ngroups, nrows]
instead of [nrows, ngroups, nfeats+1, nfeats+1].
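The reversed dimensions follow from the difference between row-major and column-major storage. The base R sketch below illustrates this with a hand-written buffer; the values and the names `buf`, `raw` and `out` are made up for illustration and do not come from an actual model.

```r
nrows <- 3
ngroups <- 2

# Hypothetical row-major buffer: both values for row 1, then row 2, then row 3
buf <- c(11, 12, 21, 22, 31, 32)

# R fills arrays column by column, so reading the buffer as-is
# yields the reversed shape [ngroups, nrows]
raw <- matrix(buf, nrow = ngroups, ncol = nrows)
dim(raw)

# Transposing gives the usual [nrows, ngroups] layout
out <- t(raw)
dim(out)
out[2, ]  # the two values belonging to row 2

# For arrays with more than two dimensions, the analogous step
# is to reverse the order of all dimensions with aperm()
arr <- array(seq_len(24), dim = c(4, 3, 2))
dim(aperm(arr, rev(seq_along(dim(arr)))))
```

Skipping the transpose avoids a copy of the whole result, at the cost of having to index the output with the dimensions in reverse order.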
Scott M. Lundberg, Su-In Lee, "A Unified Approach to Interpreting Model Predictions", NIPS Proceedings 2017, https://arxiv.org/abs/1705.07874
Scott M. Lundberg, Su-In Lee, "Consistent feature attribution for tree ensembles", https://arxiv.org/abs/1706.06060
xgb.train()
## binary classification:
data(agaricus.train, package = "xgboost")
data(agaricus.test, package = "xgboost")
## Keep the number of threads to 2 for examples
nthread <- 2
data.table::setDTthreads(nthread)
train <- agaricus.train
test <- agaricus.test
bst <- xgb.train(
data = xgb.DMatrix(train$data, label = train$label, nthread = 1),
nrounds = 5,
params = xgb.params(
max_depth = 2,
nthread = nthread,
objective = "binary:logistic"
)
)
# use all trees by default
pred <- predict(bst, test$data)
# use only the 1st tree
pred1 <- predict(bst, test$data, iterationrange = c(1, 1))
# Predicting tree leafs:
# the result is an nsamples X ntrees matrix
pred_leaf <- predict(bst, test$data, predleaf = TRUE)
str(pred_leaf)
# Predicting feature contributions to predictions:
# the result is an nsamples X (nfeatures + 1) matrix
pred_contr <- predict(bst, test$data, predcontrib = TRUE)
str(pred_contr)
# verify that contributions' sums are equal to log-odds of predictions (up to float precision):
summary(rowSums(pred_contr) - qlogis(pred))
# for the 1st record, let's inspect its features that had non-zero contribution to prediction:
contr1 <- pred_contr[1,]
contr1 <- contr1[-length(contr1)] # drop intercept
contr1 <- contr1[contr1 != 0] # drop non-contributing features
contr1 <- contr1[order(abs(contr1))] # order by contribution magnitude
old_mar <- par("mar")
par(mar = old_mar + c(0,7,0,0))
barplot(contr1, horiz = TRUE, las = 2, xlab = "contribution to prediction in log-odds")
par(mar = old_mar)
## multiclass classification in iris dataset:
lb <- as.numeric(iris$Species) - 1
num_class <- 3
set.seed(11)
bst <- xgb.train(
data = xgb.DMatrix(as.matrix(iris[, -5]), label = lb, nthread = 1),
nrounds = 10,
params = xgb.params(
max_depth = 4,
nthread = 2,
subsample = 0.5,
objective = "multi:softprob",
num_class = num_class
)
)
# predict for softprob returns num_class probability numbers per case:
pred <- predict(bst, as.matrix(iris[, -5]))
str(pred)
# convert the probabilities to softmax labels
pred_labels <- max.col(pred) - 1
# the following should result in the same error as seen in the last iteration
sum(pred_labels != lb) / length(lb)
# compare with predictions from softmax:
set.seed(11)
bst <- xgb.train(
data = xgb.DMatrix(as.matrix(iris[, -5]), label = lb, nthread = 1),
nrounds = 10,
params = xgb.params(
max_depth = 4,
nthread = 2,
subsample = 0.5,
objective = "multi:softmax",
num_class = num_class
)
)
pred <- predict(bst, as.matrix(iris[, -5]))
str(pred)
all.equal(pred, pred_labels)
# prediction from using only 5 iterations should result
# in the same error as seen in iteration 5:
pred5 <- predict(bst, as.matrix(iris[, -5]), iterationrange = c(1, 5))
sum(pred5 != lb) / length(lb)