fsi: Syntactic Sugar for Instance Construction

View source: R/sugar.R

fsiR Documentation

Syntactic Sugar for Instance Construction

Description

Function to construct a FSelectInstanceSingleCrit or FSelectInstanceMultiCrit.

Usage

fsi(
  task,
  learner,
  resampling,
  measures = NULL,
  terminator,
  store_benchmark_result = TRUE,
  store_models = FALSE,
  check_values = FALSE,
  callbacks = list()
)

Arguments

task

(mlr3::Task)
Task to operate on.

learner

(mlr3::Learner)
Learner to optimize the feature subset for.

resampling

(mlr3::Resampling)
Resampling that is used to evaluated the performance of the feature subsets. Uninstantiated resamplings are instantiated during construction so that all feature subsets are evaluated on the same data splits. Already instantiated resamplings are kept unchanged.

measures

(mlr3::Measure or list of mlr3::Measure)
A single measure creates a FSelectInstanceSingleCrit and multiple measures a FSelectInstanceMultiCrit. If NULL, default measure is used.

terminator

(Terminator)
Stop criterion of the feature selection.

store_benchmark_result

(logical(1))
Store benchmark result in archive?

store_models

(logical(1)). Store models in benchmark result?

check_values

(logical(1))
Check the parameters before the evaluation and the results for validity?

callbacks

(list of CallbackFSelect)
List of callbacks.

Resources

There are several sections about feature selection in the mlr3book.

The gallery features a collection of case studies and demos about optimization.

Default Measures

If no measure is passed, the default measure is used. The default measure depends on the task type.

Task Default Measure Package
"classif" "classif.ce" mlr3
"regr" "regr.mse" mlr3
"surv" "surv.cindex" mlr3proba
"dens" "dens.logloss" mlr3proba
"classif_st" "classif.ce" mlr3spatial
"regr_st" "regr.mse" mlr3spatial
"clust" "clust.dunn" mlr3cluster

Examples

# Feature selection on Palmer Penguins data set


task = tsk("penguins")
learner = lrn("classif.rpart")

# Construct feature selection instance
instance = fsi(
  task = task,
  learner = learner,
  resampling = rsmp("cv", folds = 3),
  measures = msr("classif.ce"),
  terminator = trm("evals", n_evals = 4)
)

# Choose optimization algorithm
fselector = fs("random_search", batch_size = 2)

# Run feature selection
fselector$optimize(instance)

# Subset task to optimal feature set
task$select(instance$result_feature_set)

# Train the learner with optimal feature set on the full data set
learner$train(task)

# Inspect all evaluated sets
as.data.table(instance$archive)


mlr3fselect documentation built on March 7, 2023, 5:31 p.m.