Wrapper to xgb.cv and xgb.train
xgbm(
  formula,
  data,
  n.trees = 100,
  interaction.depth = 6,
  learning.rate = 0.1,
  weight = NULL,
  bag.fraction = 0.5,
  col.fraction = 1,
  cv.folds = 10,
  cv.class.stratify = FALSE,
  n.cores = NULL,
  n.minobsinnode = 3,
  leaf.penalty = 0,
  weight.penalty.L1 = 0,
  weight.penalty.L2 = 0,
  early.stopping.trees = 100,
  distribution = NULL,
  quant = 0.5,
  event = NULL,
  verbose = 100,
  fail.if.not.converged = TRUE
)
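As a rough illustration of the signature above, a call might look like the sketch below. The data frame and its column names are made up for the example, and the call is only attempted if a function named xgbm is already available in the session, since the package that provides it is not named here.

```r
# Hypothetical data; the column names are illustrative only.
set.seed(1)
dat <- data.frame(
  y  = rnorm(200),
  x1 = runif(200),
  x2 = factor(sample(c("a", "b", "c"), 200, replace = TRUE))
)

# Only attempt the fit when an xgbm function is available in this session.
if (exists("xgbm", mode = "function")) {
  fit <- xgbm(
    y ~ x1 + x2,
    data          = dat,
    n.trees       = 500,
    learning.rate = 0.05,
    cv.folds      = 5,
    # distribution has no usable default, so it is always given explicitly.
    distribution  = "gaussian"
  )
}
```

Arguments that are not listed keep the defaults shown in the usage.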
formula, data |
Formula and data.frame from which to create the model.matrix and response. |
n.trees |
Number of trees to build. |
interaction.depth, learning.rate |
Interaction depth (6) and learning rate (0.1). |
bag.fraction |
Proportion of data in each tree (0.5). If |
col.fraction |
Proportion of columns of features to use. Defaults to 1. |
cv.folds |
Number of cross-validation folds (10). |
cv.class.stratify |
Whether to stratify cross-validation by the response values (FALSE). |
n.cores |
Number of cores to use (1). |
n.minobsinnode |
Minimum number of observations allowed in a tree node (3). |
leaf.penalty |
Penalty factor for the total number of leaves in trees (0). |
weight.penalty.L1, weight.penalty.L2 |
L1 and L2 penalties for leaf weights (both 0, as in the usage above). |
early.stopping.trees |
Passed through as |
distribution |
The only values allowed are "gaussian", "huber" (which uses pseudo-Huber loss), "binomial", "multinomial", "poisson", "quantile" and "coxph". Others should be added as the (my) need arises. There is no default because experience suggests that having one leads too easily to mistakes. |
quant |
Quantile to be modelled when |
event |
Only used when |
verbose |
Control printing (100). Use |
fail.if.not.converged |
Defaults to TRUE |
The function takes on the job of turning the data and formula into the favoured stuff of xgboost and applies sensible metrics given the distribution: that is, it does maximum likelihood when a likelihood function is available (everything but quantile regression).
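The conversion from formula and data.frame to a model matrix and response can be sketched in base R as follows. This is not the package's own code: the example data, the decision to drop the intercept column and the guarded xgb.DMatrix step are assumptions made for illustration.

```r
# Hypothetical data; the column names are illustrative only.
set.seed(1)
dat <- data.frame(
  y  = rnorm(6),
  x1 = runif(6),
  x2 = factor(c("a", "b", "c", "a", "b", "c"))
)
form <- y ~ x1 + x2

# Build the model frame once, then pull the response and design matrix from it.
mf <- model.frame(form, data = dat)
y  <- as.numeric(model.response(mf))
X  <- model.matrix(attr(mf, "terms"), data = mf)

# Assumption for this sketch: trees do not need an intercept column, so drop it.
X <- X[, colnames(X) != "(Intercept)", drop = FALSE]

# If xgboost is installed, the pair can be wrapped in its own data structure.
if (requireNamespace("xgboost", quietly = TRUE)) {
  dtrain <- xgboost::xgb.DMatrix(data = X, label = y)
}
```

Factors are expanded into indicator columns by model.matrix, which is why the formula interface is convenient for data that xgboost would otherwise need as a numeric matrix.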
The response (and any other) variable should be transformed prior to using xgbm, if necessary. An apparent bug, somewhere or other, means that if the transformation is done via the formula, relative influence goes wrong.
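One way to follow this advice is to create the transformed columns in the data.frame first and keep the formula to plain column names. The sketch below does that in base R; has_inline_transform is a hypothetical helper written for this example, not part of the package.

```r
# Hypothetical data; the column names are illustrative only.
set.seed(1)
dat <- data.frame(y = rexp(50), x = runif(50))

# Transform in the data, not in the formula.
dat$log_y  <- log(dat$y)
dat$sqrt_x <- sqrt(dat$x)
form <- log_y ~ sqrt_x        # rather than log(y) ~ sqrt(x)

# Hypothetical helper: TRUE if the formula calls anything other than
# the usual formula operators, e.g. log() or sqrt().
has_inline_transform <- function(f) {
  operators <- c("~", "+", "-", "*", ":", ".", "(")
  length(setdiff(all.names(f), c(operators, all.vars(f)))) > 0
}

has_inline_transform(form)
has_inline_transform(log(y) ~ sqrt(x))
```

A check like this could be run before fitting so that a formula with inline transformations is caught early rather than after a long fit.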
For distribution = "coxph", you need to use the event argument.
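A minimal sketch of a survival set-up follows. The data are simulated, and passing the event indicator as a 0/1 vector is an assumption about how the event argument is supplied. For background, xgboost's own survival:cox objective takes a single label in which censored times are negative; the sketch shows that encoding, though whether xgbm builds it this way internally is not stated here.

```r
# Hypothetical survival data: follow-up time and a 0/1 event indicator.
set.seed(1)
n <- 200
surv <- data.frame(
  time   = rexp(n, rate = 0.1),
  status = rbinom(n, size = 1, prob = 0.7),  # 1 = event observed, 0 = censored
  age    = round(runif(n, 40, 80))
)

# Label encoding used by xgboost's survival:cox objective:
# positive time for events, negative time for right-censored rows.
label <- ifelse(surv$status == 1, surv$time, -surv$time)

# Assumption: event is passed as a 0/1 vector aligned with the rows of data.
if (exists("xgbm", mode = "function")) {
  fit <- xgbm(time ~ age, data = surv,
              distribution = "coxph", event = surv$status)
}
```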
For distribution = "poisson", if you need an offset, divide the response by the exposure and pass exposure in using the weight argument.
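The reasoning behind the rate-and-weight recipe can be checked with an ordinary GLM: a Poisson model for counts with offset log(exposure) has the same estimating equations as a model for count/exposure weighted by exposure. The sketch below uses simulated data and glm from base R to show the two give matching coefficients. The final xgbm call is a guess at usage, in particular the assumption that weight accepts a numeric vector.

```r
# Hypothetical count data with varying exposure.
set.seed(1)
n <- 500
dat <- data.frame(x = runif(n), exposure = runif(n, 0.5, 3))
dat$count <- rpois(n, lambda = dat$exposure * exp(0.3 + 0.8 * dat$x))
dat$rate  <- dat$count / dat$exposure

# Offset formulation: counts with log(exposure) as an offset.
fit_offset <- glm(count ~ x + offset(log(exposure)),
                  family = quasipoisson, data = dat)

# Weight formulation: rates weighted by exposure.
# quasipoisson avoids warnings about non-integer responses.
fit_weight <- glm(rate ~ x, family = quasipoisson,
                  weights = exposure, data = dat)

# The two sets of coefficients agree up to numerical tolerance.
same <- isTRUE(all.equal(coef(fit_offset), coef(fit_weight), tolerance = 1e-5))

# Assumption: weight takes a numeric vector aligned with the rows of data.
if (exists("xgbm", mode = "function")) {
  fit <- xgbm(rate ~ x, data = dat,
              distribution = "poisson", weight = dat$exposure)
}
```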
Some of the code is quite inefficient and probably annoying. One of the reasons is that it's best to fail quickly rather than wait for a lot of processing to be done and then fail.