| resample | R Documentation |
Runs a resampling (possibly in parallel):
Repeatedly apply Learner learner on a training set of Task task to train a model,
then use the trained model to predict observations of a test set.
Training and test sets are defined by the Resampling resampling.
resample(
task,
learner,
resampling,
store_models = FALSE,
store_backends = TRUE,
encapsulate = NA_character_,
allow_hotstart = FALSE,
clone = c("task", "learner", "resampling"),
unmarshal = TRUE,
callbacks = NULL
)
task |
(Task). |
learner |
(Learner). |
resampling |
(Resampling). |
store_models |
( |
store_backends |
( |
encapsulate |
( |
allow_hotstart |
( |
clone |
( |
unmarshal |
|
callbacks |
(List of mlr3misc::Callback) |
ResampleResult.
Note that uninstantiated Resamplings are instantiated on the task, making
the procedure stochastic even in case of a deterministic learner.
If you want to compare the performance of a learner on the training set with its performance
on the test set, you have to configure the Learner to predict on multiple sets by
setting the field predict_sets to c("train", "test") (default is "test").
Each set yields a separate Prediction object during resampling.
In the next step, you have to configure the measures to operate on the respective Prediction object:
m1 = msr("classif.ce", id = "ce.train", predict_sets = "train")
m2 = msr("classif.ce", id = "ce.test", predict_sets = "test")
The (list of) created measures can finally be passed to $aggregate() or $score().
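The steps above can be sketched end to end as follows. This is a minimal illustration, not the only way to set it up: the task, the learner, and the choice of 3 folds are assumptions made for the example.

```r
library(mlr3)

# Assumption: penguins task, rpart learner and 3-fold CV are illustrative choices.
task = tsk("penguins")
learner = lrn("classif.rpart", predict_sets = c("train", "test"))
resampling = rsmp("cv", folds = 3)

rr = resample(task, learner, resampling)

# One measure per predict set, told apart by their ids.
measures = list(
  msr("classif.ce", id = "ce.train", predict_sets = "train"),
  msr("classif.ce", id = "ce.test", predict_sets = "test")
)

# Named numeric vector with one aggregated value per measure id.
scores = rr$aggregate(measures)
print(scores)
```

A training error that is much lower than the test error is a common sign of overfitting.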
This function can be parallelized with the future or mirai package.
One job is one resampling iteration.
All jobs are sent to an apply function from future.apply or to mirai::mirai_map() in a single batch.
To select a parallel backend, use future::plan().
To use mirai, call mirai::daemons(.compute = "mlr3_parallelization") before calling this function.
The future package guarantees reproducible results independent of the parallel backend.
Results obtained with mirai will not match those obtained with future, but can be made reproducible by setting a seed when calling mirai::daemons().
More on parallelization can be found in the book:
https://mlr3book.mlr-org.com/chapters/chapter10/advanced_technical_aspects_of_mlr3.html
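A minimal sketch of running a resampling on a future backend, assuming the future package is installed. The "multisession" plan, the two workers, and the 3-fold cross-validation are illustrative choices, not recommendations.

```r
library(mlr3)

# Assumption: two local worker sessions; adjust to the available cores.
future::plan("multisession", workers = 2)

task = tsk("penguins")
learner = lrn("classif.rpart")
resampling = rsmp("cv", folds = 3)

set.seed(123)
# Each of the 3 resampling iterations is one job.
rr = resample(task, learner, resampling)

# Switch back to sequential execution once done.
future::plan("sequential")

rr$aggregate(msr("classif.ce"))
```

For a task this small, the overhead of starting workers likely outweighs any speed-up; the sketch only shows where the backend is selected.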
This function supports progress bars via the package progressr.
Simply wrap the function call in progressr::with_progress() to enable them.
Alternatively, call progressr::handlers() with global = TRUE to enable progress bars
globally.
We recommend the progress package as backend which can be enabled with
progressr::handlers("progress").
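Put together, a progress bar for a single call might be enabled as sketched below. This assumes that both the progressr and progress packages are installed; the task, learner, and resampling are placeholders.

```r
library(mlr3)

# Assumption: progressr and progress are installed.
progressr::handlers("progress")

# Progress is reported only for code evaluated inside with_progress().
progressr::with_progress({
  rr = resample(tsk("penguins"), lrn("classif.rpart"), rsmp("cv", folds = 3))
})

print(rr)
```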
mlr3 uses the lgr package for logging.
lgr supports multiple log levels which can be queried with
getOption("lgr.log_levels").
To suppress output and reduce verbosity, you can lower the log level from the
default "info" to "warn":
lgr::get_logger("mlr3")$set_threshold("warn")
To get additional log output for debugging, increase the log level to "debug"
or "trace":
lgr::get_logger("mlr3")$set_threshold("debug")
To log to a file or a database, see the documentation of lgr::lgr-package.
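If the reduced verbosity is only wanted for a single call, one option is to remember the current threshold and restore it afterwards. The following is a sketch under that assumption; the variable names are hypothetical.

```r
library(mlr3)

logger = lgr::get_logger("mlr3")

# Remember the current threshold so it can be restored later.
old_threshold = logger$threshold

# Only warnings and more severe messages are shown during this call.
logger$set_threshold("warn")
rr = resample(tsk("penguins"), lrn("classif.rpart"), rsmp("cv", folds = 3))

# Restore the previous verbosity.
logger$set_threshold(old_threshold)
```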
The fitted models are discarded after the predictions have been computed in order to reduce memory consumption.
If you need access to the models for later analysis, set store_models to TRUE.
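As a brief illustration, the sketch below keeps the fitted models and inspects the one from the first iteration. The rpart learner and the 3 folds are assumptions made for the example.

```r
library(mlr3)

task = tsk("penguins")
learner = lrn("classif.rpart")

# Keep the fitted models; this increases the memory footprint of the result.
rr = resample(task, learner, rsmp("cv", folds = 3), store_models = TRUE)

# One trained learner per resampling iteration.
first_model = rr$learners[[1]]$model
print(class(first_model))
```

With the default store_models = FALSE, the model slot of these learners is empty.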
as_benchmark_result() to convert to a BenchmarkResult.
Chapter in the mlr3book: https://mlr3book.mlr-org.com/chapters/chapter3/evaluation_and_benchmarking.html#sec-resampling
Package mlr3viz for some generic visualizations.
Other resample:
ResampleResult
task = tsk("penguins")
learner = lrn("classif.rpart")
resampling = rsmp("cv")
# Explicitly instantiate the resampling for this task for reproducibility
set.seed(123)
resampling$instantiate(task)
rr = resample(task, learner, resampling)
print(rr)
# Retrieve performance
rr$score(msr("classif.ce"))
rr$aggregate(msr("classif.ce"))
# Merged prediction object of all resampling iterations
pred = rr$prediction()
pred$confusion
# Repeat resampling with featureless learner
rr_featureless = resample(task, lrn("classif.featureless"), resampling)
# Convert results to BenchmarkResult, then combine them
bmr1 = as_benchmark_result(rr)
bmr2 = as_benchmark_result(rr_featureless)
print(bmr1$combine(bmr2))