| xgb.Callback | R Documentation |
Constructor for defining the structure of callback functions that can be executed at different stages of model training (before / after training, before / after each boosting iteration).
xgb.Callback(
cb_name = "custom_callback",
env = new.env(),
f_before_training = function(env, model, data, evals, begin_iteration, end_iteration)
NULL,
f_before_iter = function(env, model, data, evals, iteration) NULL,
f_after_iter = function(env, model, data, evals, iteration, iter_feval) NULL,
f_after_training = function(env, model, data, evals, iteration, final_feval,
prev_cb_res) NULL
)
cb_name |
Name for the callback. If the callback produces some non-NULL result (from executing the function passed under f_after_training), that result will be stored under this name.
Names of callbacks must be unique - i.e. there cannot be two callbacks with the same name. |
env |
An environment object that will be passed to the different functions in the callback. Note that this environment will not be shared with other callbacks. |
f_before_training |
A function that will be executed before the training has started. If passing a function, it will be called with parameters supplied as non-named arguments matching the function signatures that are shown in the default value for each function argument. |
f_before_iter |
A function that will be executed before each boosting round. This function can signal whether the training should be finalized or not, by outputting
a value that evaluates to TRUE. Return values of NULL will be interpreted as FALSE. |
f_after_iter |
A function that will be executed after each boosting round. This function can signal whether the training should be finalized or not, by outputting
a value that evaluates to TRUE. Return values of NULL will be interpreted as FALSE. |
f_after_training |
A function that will be executed after training is finished. This function can optionally output something non-NULL, which will become part of the R
attributes of the booster (assuming one passes |
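As a rough illustration of the stop signal described for f_before_iter and f_after_iter, the sketch below ends training once the training RMSE drops below a threshold. The callback name "rmse_stop", the evals entry name "tr", and the threshold value are hypothetical choices made for this sketch. It also assumes a package version that provides xgb.params() and that keeps the f_after_training result as an R attribute of the booster, as the example at the end of this page does.

```r
library(xgboost)

data(mtcars)
dm <- xgb.DMatrix(as.matrix(mtcars[, -1]), label = mtcars$mpg, nthread = 1)

# Hypothetical threshold, chosen only for illustration
rmse_threshold <- 3

stop_callback <- xgb.Callback(
  cb_name = "rmse_stop",
  f_after_iter = function(env, model, data, evals, iteration, iter_feval) {
    # Entries of 'iter_feval' are named '<evals name>-<metric name>'
    env$last_iteration <- iteration
    env$last_rmse <- iter_feval[["tr-rmse"]]
    # Returning TRUE signals that training should be finalized
    env$last_rmse < rmse_threshold
  },
  f_after_training = function(env, model, data, evals, iteration,
                              final_feval, prev_cb_res) {
    # Non-NULL output, stored under the callback name
    c(stopped_at = env$last_iteration, rmse = env$last_rmse)
  }
)

model <- xgb.train(
  data = dm,
  params = xgb.params(objective = "reg:squarederror", nthread = 1),
  nrounds = 50,
  evals = list(tr = dm),
  verbose = 0,
  callbacks = list(stop_callback)
)

stop_info <- attributes(model)$rmse_stop
print(stop_info)
```

If the threshold is never reached, the callback simply returns FALSE at every round and training runs for the full nrounds.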
Arguments that will be passed to the supplied functions are as follows:
env The same environment that is passed under argument env.
It may be modified by the functions in order to e.g. keep track of what happens across iterations or similar.
This environment is only used by the functions supplied to the callback, and will
not be kept after the model fitting function terminates (see parameter f_after_training).
model The booster object when using xgb.train(), or the folds when using xgb.cv().
For xgb.cv(), folds are a list with a structure as follows:
dtrain: The training data for the fold (as an xgb.DMatrix object).
bst: The xgb.Booster object for the fold.
evals: A list containing two DMatrices, with names train and test
(test is the held-out data for the fold).
index: The indices of the hold-out data for that fold (base-1 indexing),
from which the test entry in evals was obtained.
This object should not be modified in place in ways that conflict with the training (e.g. resetting the parameters for a training update in a way that resets the number of rounds to zero in order to overwrite rounds).
Note that any R attributes that are assigned to the booster during the callback functions
will not be kept thereafter, as the booster object variable is not re-assigned during
training. It is however possible to set C-level attributes of the booster through
xgb.attr() or xgb.attributes(), which should remain available for the rest
of the iterations and after the training is done.
For keeping variables across iterations, it's recommended to use env instead.
data The data to which the model is being fit, as an xgb.DMatrix object.
Note that, for xgb.cv(), this will be the full data, while data for the specific
folds can be found in the model object.
evals The evaluation data, as passed under argument evals to xgb.train().
For xgb.cv(), this will always be NULL.
begin_iteration Index of the first boosting iteration that will be executed (base-1 indexing).
This will typically be '1', but when using training continuation, depending on the parameters for updates, boosting rounds will be continued from where the previous model ended, in which case this will be larger than 1.
end_iteration Index of the last boosting iteration that will be executed (base-1 indexing, inclusive of this end).
It should match with argument nrounds passed to xgb.train() or xgb.cv().
Note that boosting might be interrupted before reaching this last iteration, for
example by using the early stopping callback xgb.cb.early.stop().
iteration Index of the iteration number that is being executed (first iteration
will be the same as parameter begin_iteration, then next one will add +1, and so on).
iter_feval Evaluation metrics for evals that were supplied, either
determined by the objective, or by parameter custom_metric.
For xgb.train(), this will be a named vector with one entry per element in
evals, where the names are determined as 'evals name' + '-' + 'metric name' - for
example, if evals contains an entry named "tr" and the metric is "rmse",
this will be a one-element vector with name "tr-rmse".
For xgb.cv(), this will be a 2d matrix with dimensions [length(evals), nfolds],
where the row names will follow the same naming logic as the one-dimensional vector
that is passed in xgb.train().
Note that, internally, the built-in callbacks such as xgb.cb.print.evaluation summarize this table by calculating the row-wise means and standard deviations.
final_feval The evaluation results after the last boosting round is executed
(same format as iter_feval, and will be the exact same input as passed under
iter_feval to the last round that is executed during model fitting).
prev_cb_res Result from a previous run of a callback sharing the same name
(as given by parameter cb_name) when conducting training continuation, if there
was any in the booster R attributes.
Sometimes, one might want to append the new results to the previous ones. The built-in callbacks such as xgb.cb.evaluation.log do this automatically, appending the new rows to the previous table.
If no such previous callback result is available (which is always the case when fitting
a model from scratch instead of updating an existing model), this will be NULL.
For xgb.cv(), which doesn't support training continuation, this will always be NULL.
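To make the xgb.cv() form of iter_feval more concrete, the following sketch builds a small matrix by hand and summarizes it row-wise across folds, in the spirit of what is described for the built-in callbacks. The numbers are made up for illustration, and the use of sd() (the sample standard deviation) is an assumption of this sketch; the built-in callbacks may use a different convention.

```r
# Made-up values standing in for an 'iter_feval' matrix from xgb.cv():
# one row per metric entry, one column per fold.
cv_feval <- matrix(
  c(1.0, 2.0, 3.0,
    2.0, 4.0, 6.0),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("train-rmse", "test-rmse"), NULL)
)

# Row-wise summaries across the folds
fold_mean <- rowMeans(cv_feval)
fold_sd <- apply(cv_feval, 1, sd)

print(rbind(mean = fold_mean, sd = fold_sd))
```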
The following names (cb_name values) are reserved for internal callbacks:
print_evaluation
evaluation_log
reset_parameters
early_stop
save_model
cv_predict
gblinear_history
The following names are reserved for other non-callback attributes:
names
class
call
params
niter
nfeatures
folds
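When assembling several custom callbacks, it can help to check the chosen names against the two lists above before passing them on. The helper below is a hypothetical convenience written for this sketch and is not part of xgboost; it only flags names that are reserved or that are repeated.

```r
# Names listed above as reserved for internal callbacks and other attributes
reserved_names <- c(
  "print_evaluation", "evaluation_log", "reset_parameters", "early_stop",
  "save_model", "cv_predict", "gblinear_history",
  "names", "class", "call", "params", "niter", "nfeatures", "folds"
)

# Hypothetical helper (not an xgboost function): returns the proposed
# callback names that are reserved or that appear more than once.
find_invalid_cb_names <- function(cb_names) {
  clashes <- cb_names[cb_names %in% reserved_names]
  duplicates <- cb_names[duplicated(cb_names)]
  unique(c(clashes, duplicates))
}

invalid <- find_invalid_cb_names(c("ssq", "early_stop", "my_log", "my_log"))
print(invalid)
```

An empty result means none of the proposed names conflict with the lists above or with each other.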
When using the built-in early stopping callback (xgb.cb.early.stop), said callback will always be executed before the others, as it sets some booster C-level attributes that other callbacks might also use. Otherwise, the order of execution will match with the order in which the callbacks are passed to the model fitting function.
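The ordering rule for custom callbacks can be observed with a small experiment. In the sketch below, two callbacks append to a shared log each time their f_after_iter function runs. The factory make_logging_callback, the names "first" and "second", and the call_log environment are all hypothetical, and the sketch assumes a package version that provides xgb.params().

```r
library(xgboost)

# Shared log kept outside the callbacks' own 'env' objects, since those
# environments are not shared between callbacks.
call_log <- new.env()
call_log$entries <- character()

# Hypothetical factory: builds a callback that records when it is executed
make_logging_callback <- function(name) {
  xgb.Callback(
    cb_name = name,
    f_after_iter = function(env, model, data, evals, iteration, iter_feval) {
      call_log$entries <- c(call_log$entries, sprintf("%s@%d", name, iteration))
      FALSE  # do not stop the training
    }
  )
}

data(mtcars)
dm <- xgb.DMatrix(as.matrix(mtcars[, -1]), label = mtcars$mpg, nthread = 1)

model <- xgb.train(
  data = dm,
  params = xgb.params(objective = "reg:squarederror", nthread = 1),
  nrounds = 2,
  verbose = 0,
  callbacks = list(
    make_logging_callback("first"),
    make_logging_callback("second")
  )
)

print(call_log$entries)
```

Following the rule above, the entries for "first" would be expected to precede those for "second" within each iteration.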
An xgb.Callback object, which can be passed to xgb.train() or xgb.cv().
Built-in callbacks:
xgb.cb.print.evaluation
xgb.cb.evaluation.log
xgb.cb.reset.parameters
xgb.cb.early.stop
xgb.cb.save.model
xgb.cb.cv.predict
xgb.cb.gblinear.history
# Example constructing a custom callback that calculates
# squared error on the training data (no separate test set),
# and outputs the per-iteration results.
ssq_callback <- xgb.Callback(
  cb_name = "ssq",
  f_before_training = function(env, model, data, evals,
                               begin_iteration, end_iteration) {
    # A vector to keep track of a number at each iteration
    env$logs <- rep(NA_real_, end_iteration - begin_iteration + 1)
  },
  f_after_iter = function(env, model, data, evals, iteration, iter_feval) {
    # This calculates the sum of squared errors on the training data.
    # Note that this can be better done by passing an 'evals' entry,
    # but this demonstrates a way in which callbacks can be structured.
    pred <- predict(model, data)
    err <- pred - getinfo(data, "label")
    sq_err <- sum(err^2)
    env$logs[iteration] <- sq_err
    cat(
      sprintf(
        "Squared error at iteration %d: %.2f\n",
        iteration, sq_err
      )
    )
    # A return value of 'TRUE' here would signal to finalize the training
    return(FALSE)
  },
  f_after_training = function(env, model, data, evals, iteration,
                              final_feval, prev_cb_res) {
    return(env$logs)
  }
)
data(mtcars)
y <- mtcars$mpg
x <- as.matrix(mtcars[, -1])
dm <- xgb.DMatrix(x, label = y, nthread = 1)
model <- xgb.train(
  data = dm,
  params = xgb.params(objective = "reg:squarederror", nthread = 1),
  nrounds = 5,
  callbacks = list(ssq_callback)
)
# Result from 'f_after_training' will be available as an attribute
attributes(model)$ssq