irb.train    R Documentation
Fit a predictive model with the iteratively reweighted convex optimization (IRCO) algorithm, which minimizes robust loss functions in the CC-family (concave-convex). The convex optimization is carried out by the functional descent boosting algorithm in the R package xgboost. The iteratively reweighted boosting (IRBoost) algorithm reduces the weight of an observation that incurs a large loss; the weights it produces also help identify outliers. Applications include robust generalized linear models and extensions, where the mean is related to the predictors by boosting, and robust accelerated failure time models. irb.train is an advanced interface for training an irboost model; the irboost function is a simpler wrapper for irb.train. See xgboost::xgb.train.
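The reweighting idea described above can be sketched in a few lines of plain R. This is a minimal illustration, not the package implementation: weighted least squares stands in for the boosting step, and an exponential-type concave component g(z) = s*(1 - exp(-z/s)) is assumed purely for illustration (the package's actual cfun definitions may differ); its derivative g'(z) supplies the observation weights.

```r
set.seed(1)
n <- 50
x <- seq(0, 1, length.out = n)
y <- 2 * x + rnorm(n, sd = 0.1)
y[c(10, 40)] <- y[c(10, 40)] + 5        # plant two gross outliers

s <- 1
g_prime <- function(z) exp(-z / s)      # weight = derivative of assumed concave g

w <- rep(1, n)
for (it in 1:10) {
  fit  <- lm(y ~ x, weights = w)        # convex step (stand-in for boosting)
  loss <- (y - fitted(fit))^2 / 2       # convex component values
  w    <- g_prime(loss)                 # large loss -> small weight
}
# the planted outliers end up with near-zero weights
order(w)[1:2]
```

Inspecting the final weights, as irb.train's weight_update allows, is how outliers are flagged: the two planted outliers receive the smallest weights.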
irb.train(
params = list(),
data,
z_init = NULL,
cfun = "ccave",
s = 1,
delta = 0.1,
iter = 10,
nrounds = 100,
del = 1e-10,
trace = FALSE,
...
)
params: the list of model parameters.

data: training dataset.

z_init: vector of length nobs of initial convex component values; must be non-negative. Defaults to the weights of data if provided, otherwise to a vector of 1s.

cfun: concave component of the CC-family, such as "ccave" (the default) or "hcave".

s: tuning parameter of cfun.

delta: a small positive number provided by the user, required only for certain choices of cfun.

iter: number of iterations in the IRCO algorithm.

nrounds: number of boosting iterations within each IRCO iteration.

del: convergence criterion in the IRCO algorithm; unrelated to delta.

trace: if TRUE, progress of the IRCO algorithm is reported.

...: other arguments passed to xgboost::xgb.train.
An object with S3 class xgb.train with the additional elements:

weight_update_log: a matrix with nobs rows and iter columns containing the observation weights in each iteration of the IRCO algorithm.

weight_update: a vector of observation weights in the last IRCO iteration, which produces the final model fit.

loss_log: the sum of the loss values of the composite function in each IRCO iteration. Note that cfun requires the objective to be non-negative in some cases, so care must be taken. For instance, with objective="reg:gamma", the loss value is defined by gamma-nloglik - (1 + log(min(y))), where y = label. The second term is introduced so that the loss value is non-negative. Indeed, gamma-nloglik = y/ypre + log(ypre) in xgboost::xgb.train, where ypre is the mean prediction value, and this quantity can be negative. It can be shown that for fixed y, the minimum of gamma-nloglik is achieved at ypre = y, with value 1 + log(y). Thus, among all label values, the minimum of gamma-nloglik is 1 + log(min(y)).
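The shift by 1 + log(min(y)) can be checked numerically with plain R (no xgboost needed). For a fixed label y, gamma-nloglik(ypre) = y/ypre + log(ypre) is minimized at ypre = y, where it equals 1 + log(y), so subtracting 1 + log(min(y)) makes the loss non-negative over all labels:

```r
y <- 3
nll <- function(ypre) y / ypre + log(ypre)   # gamma-nloglik for fixed label y

ypre <- seq(0.5, 10, by = 0.001)             # grid search over predictions
vals <- nll(ypre)

ypre[which.min(vals)]        # minimizer is close to y = 3
min(vals) - (1 + log(y))     # minimum value is close to 1 + log(y)
```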
Zhu Wang
Maintainer: Zhu Wang zwang145@uthsc.edu
Wang, Zhu (2024), Unified Robust Boosting, Journal of Data Science, 1-19, DOI 10.6339/24-JDS1138
## Not run:
Sys.setenv(
OMP_NUM_THREADS = "1",
OMP_THREAD_LIMIT = "1",
OPENBLAS_NUM_THREADS = "1",
MKL_NUM_THREADS = "1",
VECLIB_MAXIMUM_THREADS = "1",
BLIS_NUM_THREADS = "1"
)
# logistic boosting
data(agaricus.train, package='xgboost')
data(agaricus.test, package='xgboost')
dtrain <- with(agaricus.train, xgboost::xgb.DMatrix(data, label = label, nthread = 1))
dtest <- with(agaricus.test, xgboost::xgb.DMatrix(data, label = label, nthread = 1))
watchlist <- list(train = dtrain, eval = dtest)
# A simple irb.train example:
param <- list(max_depth = 2, eta = 1, nthread = 1,
objective = "binary:logitraw", eval_metric = "auc")
bst <- xgboost::xgb.train(params=param, data=dtrain, nrounds = 2,
watchlist=watchlist, verbose=2)
bst <- irb.train(params=param, data=dtrain, nrounds = 2)
summary(bst$weight_update)
# a bug in xgboost::xgb.train
#bst <- irb.train(params=param, data=dtrain, nrounds = 2,
# watchlist=watchlist, trace=TRUE, verbose=2)
# time-to-event analysis
X <- matrix(1:5, ncol=1)
# Associate ranged labels with the data matrix.
# This example shows each kind of censored labels.
# uncensored right left interval
y_lower = c(10, 15, -Inf, 30, 100)
y_upper = c(Inf, Inf, 20, 50, Inf)
dtrain <- xgboost::xgb.DMatrix(data=X, label_lower_bound=y_lower,
label_upper_bound=y_upper)
param <- list(objective="survival:aft", nthread=1, aft_loss_distribution="normal",
aft_loss_distribution_scale=1, max_depth=3, min_child_weight=0)
watchlist <- list(train = dtrain)
bst <- xgboost::xgb.train(params=param, data=dtrain, nrounds=15,
watchlist=watchlist)
predict(bst, dtrain)
bst_cc <- irb.train(params=param, data=dtrain, nrounds=15, cfun="hcave",
s=1.5, trace=TRUE, verbose=0)
bst_cc$weight_update
## End(Not run)