BenchmarkResult: Container for Results of 'benchmark()'
In mllg/mlr3: Machine Learning in R - Next Generation


This is the result container object returned by benchmark(). A BenchmarkResult consists of the row-bound data of multiple ResampleResults, each of which can easily be reconstructed from it.

Note that all stored objects are accessed by reference. Do not modify any object without cloning it first.
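As a minimal sketch of the cloning advice above, the following takes a stored learner out of a small benchmark and clones it before changing anything. The task, learner, and resampling are illustrative choices, not part of the original text.

```r
library(mlr3)

# Small benchmark: one task, one learner, one holdout split (illustrative setup)
design = benchmark_grid(
  tasks = list(tsk("iris")),
  learners = list(lrn("classif.featureless")),
  resamplings = list(rsmp("holdout"))
)
bmr = benchmark(design)

# Clone the stored learner before modifying it
learner = bmr$learners$learner[[1]]$clone(deep = TRUE)
learner$predict_type = "prob"

# The learner held by the BenchmarkResult keeps its original setting
bmr$learners$learner[[1]]$predict_type
```

Working on the clone keeps the container consistent with the predictions it already stores.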

R6::R6Class object.

bmr = BenchmarkResult$new(data = data.table())

data :: data.table::data.table()
Table with data for one resampling iteration per row: Task, Learner, Resampling, iteration (integer(1)), Prediction, and the unique hash uhash (character(1)) of the corresponding ResampleResult. Additional columns are kept in the resulting object.

data :: data.table::data.table()
Internal data storage with one row per resampling iteration. Can be joined with $rr_data by joining on column "hash". We discourage users from working with this table directly.

Package developers, on the other hand, may opt to add additional columns here. These columns are preserved in all mutators.
rr_data :: data.table::data.table()
Internal data storage with one row per ResampleResult. Can be joined with $data by joining on column "hash". Not used in mlr3 directly, but can be exploited by add-on packages.

Package developers may opt to add additional columns here. These columns are preserved in all mutators.
task_type :: character(1)
Task type of objects in the BenchmarkResult. All stored objects (Task, Learner, Prediction) in a single BenchmarkResult are required to have the same task type, e.g., "classif" or "regr".
tasks :: data.table::data.table()
Table of used tasks with three columns: "task_hash" (character(1)), "task_id" (character(1)) and "task" (Task).
learners :: data.table::data.table()
Table of used learners with three columns: "learner_hash" (character(1)), "learner_id" (character(1)) and "learner" (Learner).
resamplings :: data.table::data.table()
Table of used resamplings with three columns: "resampling_hash" (character(1)), "resampling_id" (character(1)) and "resampling" (Resampling).
n_resample_results :: integer(1)
Returns the number of stored ResampleResults.
uhashes :: character()
Vector of unique hashes of all included ResampleResults.
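The read-only fields can be inspected as in the sketch below. The benchmark design (iris, two learners, 3-fold cross-validation) is an assumption made for illustration.

```r
library(mlr3)

# One task x two learners x one resampling gives two ResampleResults
design = benchmark_grid(
  tasks = list(tsk("iris")),
  learners = list(lrn("classif.featureless"), lrn("classif.rpart")),
  resamplings = list(rsmp("cv", folds = 3))
)
bmr = benchmark(design)

bmr$task_type            # task type shared by all stored objects
bmr$n_resample_results   # number of stored ResampleResults
bmr$uhashes              # one unique hash per ResampleResult
bmr$learners$learner_id  # ids taken from the learners table
```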

aggregate(measures = NULL, ids = TRUE, uhashes = FALSE, params = FALSE, conditions = FALSE)
(list of Measure, logical(1), logical(1), logical(1), logical(1)) -> data.table::data.table()
Returns a result table where resampling iterations are combined into ResampleResults. A column with the aggregated performance score is added for each Measure, named with the id of the respective measure.

For convenience, the following parameters can be set to extract more information from the returned ResampleResult:
- uhashes :: logical(1)
  Adds the uhash values of the ResampleResult as extra character column "uhash".
- ids :: logical(1)
  Adds object ids ("task_id", "learner_id", "resampling_id") as extra character columns.
- params :: logical(1)
  Adds the hyperparameter values as extra list column "params". You can unnest them with mlr3misc::unnest().
- conditions :: logical(1)
  Adds the number of resampling iterations with at least one warning as extra integer column "warnings", and the number of resampling iterations with errors as extra integer column "errors".
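A sketch of how the flags described above might be combined in one call. The benchmark design and the choice of the "classif.acc" measure are illustrative assumptions.

```r
library(mlr3)

design = benchmark_grid(
  tasks = list(tsk("iris")),
  learners = list(lrn("classif.featureless"), lrn("classif.rpart")),
  resamplings = list(rsmp("cv", folds = 3))
)
bmr = benchmark(design)

# Aggregated accuracy plus uhash, hyperparameters, and warning/error counts
aggr = bmr$aggregate(
  list(msr("classif.acc")),
  uhashes = TRUE, params = TRUE, conditions = TRUE
)

# One row per ResampleResult; select the columns of interest by name
aggr[, c("uhash", "learner_id", "classif.acc", "warnings", "errors"), with = FALSE]
```

The flags only add columns; the number of rows always equals the number of stored ResampleResults.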
score(measures = NULL, ids = TRUE)
(list of Measure, logical(1)) -> data.table::data.table()
Returns a table with one row for each resampling iteration, including all involved objects: Task, Learner, Resampling, iteration number (integer(1)), and Prediction. If ids is set to TRUE, character columns of extracted ids are added to the table for convenient filtering: "task_id", "learner_id", and "resampling_id". Additionally calculates the provided performance measures and binds the performance values as extra columns. These columns are named using the id of the respective Measure.
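For comparison with aggregate(), a sketch of scoring individual resampling iterations and filtering on the id columns. The design and the two measures are assumptions chosen for illustration.

```r
library(mlr3)

design = benchmark_grid(
  tasks = list(tsk("iris")),
  learners = list(lrn("classif.featureless"), lrn("classif.rpart")),
  resamplings = list(rsmp("cv", folds = 3))
)
bmr = benchmark(design)

# One row per resampling iteration: 2 learners x 3 folds
scores = bmr$score(list(msr("classif.acc"), msr("classif.ce")))

# Filter by learner id and keep only the id and performance columns
scores[learner_id == "classif.rpart",
  c("task_id", "iteration", "classif.acc", "classif.ce"), with = FALSE]
```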
resample_result(i = NULL, uhash = NULL)
(integer(1), character(1)) -> ResampleResult
Retrieve the i-th ResampleResult, by position or by unique hash uhash. i and uhash are mutually exclusive.
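The two lookup styles can be sketched as follows, assuming positions follow the order of $uhashes. The benchmark setup is illustrative.

```r
library(mlr3)

design = benchmark_grid(
  tasks = list(tsk("iris")),
  learners = list(lrn("classif.featureless"), lrn("classif.rpart")),
  resamplings = list(rsmp("holdout"))
)
bmr = benchmark(design)

# Lookup by position
rr_by_position = bmr$resample_result(1)

# Lookup by unique hash; pass either i or uhash, never both
rr_by_hash = bmr$resample_result(uhash = bmr$uhashes[1])

# Both calls should refer to the same ResampleResult
identical(rr_by_position$uhash, rr_by_hash$uhash)
```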
combine(bmr)
(BenchmarkResult | NULL) -> self
Fuses a second BenchmarkResult into itself, mutating the BenchmarkResult in-place. If bmr is NULL, simply returns self.
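A minimal sketch of the in-place behaviour, using two separately computed benchmarks on the same task type. The learners and resampling are illustrative assumptions.

```r
library(mlr3)

# Two independent benchmarks, each with a single ResampleResult
bmr1 = benchmark(benchmark_grid(
  tasks = list(tsk("iris")),
  learners = list(lrn("classif.featureless")),
  resamplings = list(rsmp("holdout"))
))
bmr2 = benchmark(benchmark_grid(
  tasks = list(tsk("iris")),
  learners = list(lrn("classif.rpart")),
  resamplings = list(rsmp("holdout"))
))

# combine() mutates bmr1; bmr2 is only read
bmr1$combine(bmr2)
bmr1$n_resample_results

# Passing NULL is a no-op that returns the object itself
bmr1$combine(NULL)
```

Because the method changes the receiver, clone bmr1 first if the original result is still needed.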

as.data.table(bmr)
BenchmarkResult -> data.table::data.table()
Returns a copy of the internal data.

library(mlr3)

set.seed(123)
learners = list(
  lrn("classif.featureless", predict_type = "prob"),
  lrn("classif.rpart", predict_type = "prob")
)

design = benchmark_grid(
  tasks = list(tsk("sonar"), tsk("spam")),
  learners = learners,
  resamplings = rsmp("cv", folds = 3)
)
print(design)

bmr = benchmark(design)
print(bmr)

bmr$tasks
bmr$learners

# first 5 individual resamplings
head(as.data.table(bmr, measures = c("classif.acc", "classif.auc")), 5)

# aggregate results
bmr$aggregate()

# aggregate results with hyperparameters as separate columns
mlr3misc::unnest(bmr$aggregate(params = TRUE), "params")

# extract resample result for classif.rpart
rr = bmr$aggregate()[learner_id == "classif.rpart", resample_result][[1]]
print(rr)

# access the confusion matrix of the first resampling iteration
rr$predictions()[[1]]$confusion