```r
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
  # fig.path = "Readme_files/"
)
library(compboost)
```
Compboost contains two mlr3 learners: `regr.compboost` for regression and `classif.compboost` for binary classification.
See https://mlr3.mlr-org.com/ for an introduction to mlr3.
Here, we show the two learners in small examples.
As task, we use the Boston housing task, accessible via `tsk("boston_housing")`:
```r
library(mlr3)
task = tsk("boston_housing")
task
```
The key `regr.compboost` gives the regression learner:
```r
lcb = lrn("regr.compboost")
lcb$param_set
lcb$train(task)
lcb$model
```
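Hyperparameters can also be changed after construction through mlr3's standard `param_set` interface; a minimal sketch (the `iterations` parameter is taken from the examples further below):

```r
library(mlr3)
library(compboost)

task = tsk("boston_housing")

# Construct first, then set a hyperparameter via the param_set values list
# (standard mlr3 mechanism, not specific to compboost).
lcb = lrn("regr.compboost")
lcb$param_set$values$iterations = 200L
lcb$train(task)
```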
The most important features of compboost can be controlled via parameters.
For example, early stopping requires setting `oob_fraction` to a value greater than 0.
Only in this case can the learner be trained with early stopping:
```r
lcb = lrn("regr.compboost", early_stop = TRUE)
lcb$train(task)

lcb = lrn("regr.compboost", oob_fraction = 0.3, early_stop = TRUE)
lcb$train(task)
head(lcb$model$logs)
```
Binary classification works in the same way. We use the `spam` data set for the demo:
```r
task = tsk("spam")
task
```
Then, the usual methods and fields are accessible:
```r
lcb = lrn("classif.compboost", iterations = 500L)
lcb$train(task)
lcb$predict_type = "prob"
pred = lcb$predict(task)
pred$confusion
pred$score(msr("classif.auc"))
```
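Because the learner plugs into mlr3, it can also be evaluated with the generic resampling machinery; a sketch using only standard mlr3 calls:

```r
library(mlr3)
library(compboost)

# 3-fold cross-validation of the compboost classifier on the spam task.
lcb = lrn("classif.compboost", iterations = 500L, predict_type = "prob")
rr = resample(tsk("spam"), lcb, rsmp("cv", folds = 3))
rr$aggregate(msr("classif.auc"))
```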
Parallel execution in compboost is controlled by the optimizers.
With mlr3, optimizers can be defined in the construction of the learner.
Thus, to run compboost in parallel, define an optimizer in advance and pass it during construction:
```r
lcb$timings["train"]

lcb_2c = lrn("classif.compboost", iterations = 500L,
  optimizer = OptimizerCoordinateDescent$new(2))
lcb_2c$train(task)
lcb_2c$timings["train"]
```
As with the parallel execution, the loss can be set via the `loss` parameter in the construction:
```r
task = tsk("boston_housing")
lcb_quantiles = lrn("regr.compboost", loss = LossQuantile$new(0.1))
lcb_quantiles$train(task)
lcb_quantiles$predict(task)
```
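Since `LossQuantile` fits a single quantile, two learners trained at complementary quantiles give a simple prediction interval; a hedged sketch (only `LossQuantile` from above is compboost-specific, the rest is standard mlr3):

```r
library(mlr3)
library(compboost)

task = tsk("boston_housing")

# Fit the 10% and 90% quantiles to sketch an 80% prediction interval.
lower = lrn("regr.compboost", loss = LossQuantile$new(0.1))
upper = lrn("regr.compboost", loss = LossQuantile$new(0.9))
lower$train(task)
upper$train(task)

interval = data.frame(
  lower = lower$predict(task)$response,
  upper = upper$predict(task)$response
)
head(interval)
```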
Interactions can be added in the constructor by specifying a `data.frame` with columns `feat1` and `feat2`.
For each row, one row-wise tensor product base learner is added to the model:
```r
task = tsk("german_credit")
ints = data.frame(
  feat1 = c("age", "amount"),
  feat2 = c("job", "duration")
)
ints

set.seed(31415)
l = lrn("classif.compboost", interactions = ints)
l$train(task)
l$importance()
plotTensor(l$model, "amount_duration_tensor")
```
Early stopping is also controlled by the constructor. Use `early_stop = TRUE` to enable early stopping with the default values `patience = 5` and `eps_for_break = 0` (see `?LoggerOobRisk`).
In compboost, early stopping requires a validation set and hence setting `oob_fraction > 0`:
```r
task = tsk("mtcars")
set.seed(314)
l = lrn("regr.compboost", early_stop = TRUE, oob_fraction = 0.3, iterations = 1000)
l$train(task)
plotRisk(l$model)
```
More aggressive early stopping is achieved by setting `patience = 1`:
```r
set.seed(314)
l = lrn("regr.compboost", early_stop = TRUE, oob_fraction = 0.3, iterations = 1000,
  patience = 1)
l$train(task)
plotRisk(l$model)
```
However, this is not recommended, as it can stop too early without reaching the best validation risk.
Note that `oob_fraction > 0` must hold to use early stopping:
```r
l = lrn("regr.compboost", early_stop = TRUE)
l$train(task)
```