Description

Compare the performance of several algorithms on the same data, with or without hyperparameter tuning.

Usage

compareAlgorithms(algorithms, task, tuning = FALSE, control = list())
Arguments

algorithms: Vector with the names of the algorithms to be compared (the same algorithm names as in …).

task: Either a single classification task, for a comparison using cross-validation, or a list of tasks, for a comparison across tasks (see Details).

tuning: Whether or not to tune the learners (default: FALSE).

control: Optional list of settings (see Details).
Details

The comparison of algorithms differs depending on whether a single classification task or multiple classification tasks are used. In the first approach, a repeated cross-validation scheme partitions the task into subsets multiple times, yielding a comparison for each combination of subsets. If the algorithms are tuned (which uses five-fold cross-validation), the resampling used for this tuning is nested within the training folds of the outer cross-validation scheme.
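For example, a minimal sketch of a single-task comparison could look as follows (not run; the algorithm names "rf" and "svm" are illustrative placeholders, and the accepted names depend on the package):

library(mlr)

# Single classification task built from the iris data
task <- makeClassifTask(id = "iris", data = iris, target = "Species")

# Repeated cross-validation comparison with nested tuning of each learner
res <- compareAlgorithms(algorithms = c("rf", "svm"), task = task, tuning = TRUE)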
In the second approach, each learner is trained on each task (without resampling) and used to make predictions on all other tasks. That is, if there are n tasks, this results in (n - 1) * n predictions, performed with n trained models.
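A sketch of a cross-task comparison (not run; the tasks and algorithm names are illustrative, with the tasks built here from subsets of iris so that they share the same features and classes). With three tasks, each algorithm is trained three times and evaluated on the two tasks it was not trained on, giving (3 - 1) * 3 = 6 predictions per algorithm:

library(mlr)

# Three tasks with identical feature space and class labels
set.seed(1)
idx <- sample(rep(1:3, length.out = nrow(iris)))
tasks <- lapply(1:3, function(i) {
  makeClassifTask(id = paste0("iris_", i), data = iris[idx == i, ], target = "Species")
})

res <- compareAlgorithms(algorithms = c("rf", "svm"), task = tasks)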
Parallelization is always applied over the outermost loop for a given learner. That is, when comparing algorithms within one classification task, parallelization is applied over the resampling iterations of the outer cross-validation scheme. When comparing across tasks, it is applied over the tasks used for training the models.
The following settings can be passed to the control argument:

folds: Number of cross-validation folds used in the outer resampling scheme when comparing algorithms within one task. It has no effect when comparing algorithms across multiple tasks. Default: 5.

reps: Number of repetitions of the cross-validation in the outer resampling scheme when comparing algorithms within one task. It has no effect when comparing algorithms across multiple tasks. Default: 3.

parallel: Whether or not to use parallelization. Default: FALSE.

nthreads: Number of threads/workers to be used for parallelization. Default is the number of cores reported by parallel::detectCores().

maxiter: Maximum number of iterations of the CMA-ES optimization of the hyperparameters. Default: 10.

lambda: Number of offspring in each iteration of the CMA-ES optimization of the hyperparameters. Default: 10.

seed: Random seed used for the resampling schemes.
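A sketch of a call that adjusts the control settings (not run; the values and algorithm names are illustrative):

res <- compareAlgorithms(
  algorithms = c("rf", "svm"),
  task       = task,          # a single classification task, as above
  tuning     = TRUE,
  control    = list(
    folds    = 10,            # outer cross-validation folds
    reps     = 5,             # repetitions of the outer cross-validation
    parallel = TRUE,          # parallelize over the outermost loop
    nthreads = 4,
    maxiter  = 20,            # CMA-ES iterations for hyperparameter tuning
    lambda   = 10,            # CMA-ES offspring per iteration
    seed     = 42
  )
)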
Value

The result of the comparison, as an object of class mlr::BenchmarkResult.
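Because the returned object is a regular mlr::BenchmarkResult, the usual mlr accessors and plots should apply, for example (not run):

library(mlr)

getBMRAggrPerformances(res, as.df = TRUE)  # aggregated performance per learner and task
plotBMRBoxplots(res)                       # performance distributions across iterations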