autoxgboost: Fit and optimize an xgboost model.


View source: R/autoxgboost.R

Description

An xgboost model is optimized based on a measure (see [Measure]). The bounds of the parameter space in which the model is optimized are defined by autoxgbparset. The optimization itself uses Bayesian optimization with mlrMBO. Without any specification of the control object, the optimizer runs for 160 iterations or 1 hour, whichever happens first. Both the parameter set and the control object can be set by the user.

Usage

autoxgboost(task, measure = NULL, control = NULL, iterations = 160L,
  time.budget = 3600L, par.set = NULL, max.nrounds = 10^6,
  early.stopping.rounds = 10L, early.stopping.fraction = 4/5,
  build.final.model = TRUE, design.size = 15L,
  impact.encoding.boundary = 10L, mbo.learner = NULL, nthread = NULL,
  tune.threshold = TRUE)

Arguments

task

[Task]
The task.

measure

[Measure]
Performance measure. If NULL getDefaultMeasure is used.

control

[MBOControl]
Control object for the optimizer. If not specified, a default makeMBOControl object will be used, with iterations maximum iterations and a maximum runtime of time.budget seconds.

iterations

[integer(1)]
Number of MBO iterations to do. Will be ignored if custom control is used. Default is 160.

time.budget

[integer(1)]
Time that can be used for tuning (in seconds). Will be ignored if custom control is used. Default is 3600, i.e., one hour.

par.set

[ParamSet]
Parameter set to tune over. Default is autoxgbparset.

max.nrounds

[integer(1)]
Maximum number of allowed boosting iterations. Default is 10^6.

early.stopping.rounds

[integer(1)]
After how many boosting iterations without an improvement in the OOB error should training be stopped? Default is 10.

early.stopping.fraction

[numeric(1)]
What fraction of the data should be used for early stopping (i.e. as a validation set). Default is 4/5.

build.final.model

[logical(1)]
Should the model with the best found configuration be refitted on the complete dataset? Default is TRUE.

design.size

[integer(1)]
Size of the initial design. Default is 15L.

impact.encoding.boundary

[integer(1)]
Defines the threshold for how factor variables are handled. Factors with more levels than impact.encoding.boundary are impact encoded, while factors with that many or fewer levels are dummy encoded. For impact.encoding.boundary = 0L, all factor variables are impact encoded, while for impact.encoding.boundary = .Machine$integer.max, all of them are dummy encoded. Default is 10.
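The decision rule above can be sketched as a small standalone function. This helper is hypothetical and not part of the autoxgboost API; it only illustrates which encoding a factor with a given number of levels would receive:

```r
# Hypothetical illustration of the documented rule; not part of autoxgboost.
encoding_for <- function(n.levels, boundary = 10L) {
  if (n.levels > boundary) "impact" else "dummy"
}

encoding_for(25L)  # more levels than the boundary -> "impact"
encoding_for(10L)  # equal to the boundary -> "dummy"
```

With boundary = 0L every factor exceeds the threshold and is impact encoded, matching the edge case described above.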

mbo.learner

[Learner]
Regression learner from mlr, which is used as a surrogate to model our fitness function. If NULL (default), the default learner is determined as described here: mbo_default_learner.

nthread

[integer(1)]
Number of cores to use. If NULL (default), xgboost will determine internally how many cores to use.

tune.threshold

[logical(1)]
Should thresholds be tuned? This has only an effect for classification, see tuneThreshold. Default is TRUE.

Value

AutoxgbResult

Examples

iris.task = makeClassifTask(data = iris, target = "Species")
ctrl = makeMBOControl()
ctrl = setMBOControlTermination(ctrl, iters = 1L) # speed up tuning by doing only 1 iteration
res = autoxgboost(iris.task, control = ctrl, tune.threshold = FALSE)
res
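A more customized call might combine several of the arguments documented above. The following is a sketch only, assuming mlr and mlrMBO are attached; `acc` is mlr's accuracy measure, and runtime depends on hardware:

```r
library(mlr)
library(mlrMBO)

iris.task = makeClassifTask(data = iris, target = "Species")

# Terminate after 2 MBO iterations to keep the sketch fast.
ctrl = makeMBOControl()
ctrl = setMBOControlTermination(ctrl, iters = 2L)

# Optimize accuracy instead of the default measure and pin xgboost to 1 thread.
res = autoxgboost(iris.task, measure = acc, control = ctrl,
                  nthread = 1L, tune.threshold = FALSE)
res
```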

ja-thomas/autoxgboost documentation built on April 9, 2020, 11:10 p.m.