knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.path = "man/figures/README-", out.width = "100%" )
This package contains several surrogates that approximate the hyperparameter response surface for several interesting machine learing algorithms across several tasks.
For a continuation of this project see YAHPO GYM.
library(mlr3misc) library(mfsurrogates) library(data.table) dt = data.frame( instance = c("nb301", "lcbench", "rbv2_svm", "rbv2_rpart", "rbv2_aknn", "rbv2_glmnet", "rbv2_ranger", "rbv2_xgboost", "rbv2_super"), space = c("Cat+Dep", "Numeric", "Mix+Dep", "Mix", "Mix", "Mix", "Mix+Dep", "Mix+Dep", "Mix+Dep"), n_dims = c(34L, 7L, 6L, 5L, 6L, 3L, 8L, 14L, 38L), n_targets = c("2:perf(1)+rt", "6:perf(5)+rt", "6:perf(4)+rt+pt", "6:perf(4)+rt+pt", "6:perf(4)+rt+pt", "6:perf(4)+rt+pt", "6:perf(4)+rt+pt", "6:perf(4)+rt+pt", "6:perf(4)+rt+pt"), fidelity = c("epoch", "epoch", "trainsize+repl", "trainsize+repl", "trainsize+repl", "trainsize+repl", "trainsize+repl", "trainsize+repl", "trainsize+repl"), n_problems = c(1L, 35L, 96L, 101L, 99L, 98L, 114L, 109L, 89L), status = "ready" ) metrics = data.table(do.call("rbind", map(dt$instance, function(x) { x = fread(paste0(cfgs(x, workdir = workdir)$subdir, "surrogate_test_metrics.csv")) x = t(apply(keep(x, is.numeric), 2, range))[1:2, ] paste0(round(x[, 1], 2), "-", round(x[, 2], 2)) }) )) colnames(metrics) = c("Rsq", "Rho") dt = cbind(dt, metrics)[c(1, 2, 9, 3:8), ] knitr::kable(dt)
| |instance |space | n_dims|n_targets |fidelity | n_problems|status |Rsq |Rho | |:--|:------------|:-------|------:|:---------------|:--------------|----------:|:------|:---------|:---------| |1 |nb301 |Cat+Dep | 34|2:perf(1)+rt |epoch | 1|ready |0.83-0.98 |0.91-0.99 | |2 |lcbench |Numeric | 7|6:perf(5)+rt |epoch | 35|ready |0.98-1 |0.99-1 | |9 |rbv2_super |Mix+Dep | 38|6:perf(4)+rt+pt |trainsize+repl | 89|ready |0.86-0.99 |0.93-0.99 | |3 |rbv2_svm |Mix+Dep | 6|6:perf(4)+rt+pt |trainsize+repl | 96|ready |0.83-0.99 |0.91-1 | |4 |rbv2_rpart |Mix | 5|6:perf(4)+rt+pt |trainsize+repl | 101|ready |0.79-0.99 |0.89-1 | |5 |rbv2_aknn |Mix | 6|6:perf(4)+rt+pt |trainsize+repl | 99|ready |0.74-0.99 |0.86-1 | |6 |rbv2_glmnet |Mix | 3|6:perf(4)+rt+pt |trainsize+repl | 98|ready |0.91-1 |0.96-1 | |7 |rbv2_ranger |Mix+Dep | 8|6:perf(4)+rt+pt |trainsize+repl | 114|ready |0.85-1 |0.92-1 | |8 |rbv2_xgboost |Mix+Dep | 14|6:perf(4)+rt+pt |trainsize+repl | 109|ready |0.87-0.98 |0.93-0.99 |
where for n_targets (#number):
For more information either look directly into the code or use the object's printers via cfgs("...")
.
Toy test functions:
dt = data.frame( instance = c("Branin", "Shekel"), space = c("Num", "Num"), n_dims = c(2L, 4L), n_targets = c("1:y", "1:y"), fidelity = c("fidelity", "fidelity"), n_problems = c(1L, 1L), status = c("ready", "ready") ) knitr::kable(dt)
We first load the config:
library(checkmate) library(paradox) library(mfsurrogates) workdir = "../multifidelity_data/" cfg = cfgs("nb301", workdir = workdir) cfg$setup() # automatic download of files; necessary if you didn't download manually cfg
this config contains our objective
which we can use to optimize.
library(bbotk) library(data.table) ins = OptimInstanceMultiCrit$new( objective = cfg$get_objective(), terminator = trm("evals", n_evals = 10L) ) opt("random_search")$optimize(ins)
Since NASBench is only trained on CIFAR100
, we do not need to set a specific task_id
here.
We first load the config:
cfg = cfgs("lcbench", workdir = workdir) cfg$setup()
this config contains our objective
which we can use to optimize.
ins = OptimInstanceMultiCrit$new( objective = cfg$get_objective(), terminator = trm("evals", n_evals = 10L) ) opt("random_search")$optimize(ins)
In practice, we need to subset to a task_id
if we want to tune
on a specific task.
ins = OptimInstanceMultiCrit$new( objective = cfg$get_objective(task = "126025"), terminator = trm("evals", n_evals = 10L) ) opt("random_search")$optimize(ins)
A list of available task_id
s can be obtained from the param_set
:
cfg$param_set$params$OpenML_task_id$levels # or using the active binding across all configs: cfg$task_levels
If the configuration only has $1$ task, NULL
is returned.
We first load the config:
cfg = cfgs("branin") #cfg$plot() # or method = "rgl"
this config contains our objective
which we can use to optimize.
ins = OptimInstanceSingleCrit$new( objective = cfg$get_objective(), terminator = trm("evals", n_evals = 10L) ) opt("random_search")$optimize(ins)
We first load the config:
cfg = cfgs("shekel")
this config contains our objective
which we can use to optimize.
ins = OptimInstanceSingleCrit$new( objective = cfg$get_objective(), terminator = trm("evals", n_evals = 10L) ) opt("random_search")$optimize(ins)
We first load the config:
cfg = cfgs("rbv2_svm", workdir = workdir) cfg$setup()
this config contains our objective
which we can use to optimize.
ins = OptimInstanceMultiCrit$new( objective = cfg$get_objective(), terminator = trm("evals", n_evals = 10L) ) opt("random_search")$optimize(ins)
Example to select target variables and a task:
objective = cfg$get_objective(task = "3", target_variables = c("mmce", "timetrain")) objective$codomain objective$constants
We first load the config:
cfg = cfgs("rbv2_rpart", workdir = workdir) cfg$setup()
this config contains our objective
which we can use to optimize.
ins = OptimInstanceMultiCrit$new( objective = cfg$get_objective(), terminator = trm("evals", n_evals = 10L) ) opt("random_search")$optimize(ins)
We first load the config:
cfg = cfgs("rbv2_aknn", workdir = workdir) cfg$setup()
this config contains our objective
which we can use to optimize.
ins = OptimInstanceMultiCrit$new( objective = cfg$get_objective(), terminator = trm("evals", n_evals = 10L) ) opt("random_search")$optimize(ins)
We first load the config:
cfg = cfgs("rbv2_glmnet", workdir = workdir) cfg$setup()
this config contains our objective
which we can use to optimize.
ins = OptimInstanceMultiCrit$new( objective = cfg$get_objective(), terminator = trm("evals", n_evals = 10L) ) opt("random_search")$optimize(ins)
We first load the config:
cfg = cfgs("rbv2_ranger", workdir = workdir) cfg$setup()
this config contains our objective
which we can use to optimize.
ins = OptimInstanceMultiCrit$new( objective = cfg$get_objective(), terminator = trm("evals", n_evals = 10L) ) opt("random_search")$optimize(ins)
We first load the config:
cfg = cfgs("rbv2_xgboost", workdir = workdir) cfg$setup()
this config contains our objective
which we can use to optimize.
ins = OptimInstanceMultiCrit$new( objective = cfg$get_objective(), terminator = trm("evals", n_evals = 10L) ) opt("random_search")$optimize(ins)
The super learner is a learner that is parametrized as a choice over all available randombot base learners. It has a highly hierarchical, $38$ dimensional parameter space that includes the configurations of all baselearners.
We first load the config:
cfg = cfgs("rbv2_super", workdir = workdir) cfg$setup()
this config contains our objective
which we can use to optimize.
ins = OptimInstanceMultiCrit$new( objective = cfg$get_objective(), terminator = trm("evals", n_evals = 10L) ) opt("random_search")$optimize(ins)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.