fselect | R Documentation |
Function to optimize the features of a mlr3::Learner.
The function internally creates a FSelectInstanceBatchSingleCrit or FSelectInstanceBatchMultiCrit which describes the feature selection problem.
It executes the feature selection with the FSelector (method) and returns the result with the fselect instance ($result). The ArchiveBatchFSelect ($archive) stores all evaluated feature sets and performance scores.
fselect(
  fselector,
  task,
  learner,
  resampling,
  measures = NULL,
  term_evals = NULL,
  term_time = NULL,
  terminator = NULL,
  store_benchmark_result = TRUE,
  store_models = FALSE,
  check_values = FALSE,
  callbacks = NULL,
  ties_method = "least_features"
)
fselector | (FSelector) |
task | (mlr3::Task) |
learner | (mlr3::Learner) |
resampling | (mlr3::Resampling) |
measures | (mlr3::Measure or list of mlr3::Measure) |
term_evals | (integer(1)) |
term_time | (integer(1)) |
terminator | (bbotk::Terminator) |
store_benchmark_result | (logical(1)) |
store_models | (logical(1)) |
check_values | (logical(1)) |
callbacks | (list of CallbackBatchFSelect) |
ties_method | (character(1)) |
The mlr3::Task, mlr3::Learner, mlr3::Resampling, mlr3::Measure and bbotk::Terminator are used to construct a FSelectInstanceBatchSingleCrit.
If multiple performance Measures are supplied, a FSelectInstanceBatchMultiCrit is created.
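As a sketch of the multi-criteria case: passing a list of measures (here via mlr3's msrs() helper; the chosen task, learner, and measure keys are illustrative, not prescribed by this page) yields a FSelectInstanceBatchMultiCrit whose $result holds the Pareto-optimal feature sets.

```r
library(mlr3fselect)

# Two measures instead of one -> FSelectInstanceBatchMultiCrit
instance = fselect(
  fselector = fs("random_search"),
  task = tsk("pima"),
  learner = lrn("classif.rpart"),
  resampling = rsmp("holdout"),
  measures = msrs(c("classif.ce", "classif.fpr")),
  term_evals = 4
)

# Pareto front of evaluated feature sets
instance$result
```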
The parameters term_evals and term_time are shortcuts to create a bbotk::Terminator. If both parameters are passed, a bbotk::TerminatorCombo is constructed. For other Terminators, pass one with terminator. If no termination criterion is needed, set term_evals, term_time and terminator to NULL.
Value: FSelectInstanceBatchSingleCrit | FSelectInstanceBatchMultiCrit
There are several sections about feature selection in the mlr3book.
Getting started with wrapper feature selection.
Do a sequential forward selection on the Palmer Penguins data set.
The gallery features a collection of case studies and demos about optimization.
Utilize the built-in feature importance of models with Recursive Feature Elimination.
Run a feature selection with Shadow Variable Search.
Feature Selection on the Titanic data set.
For analyzing the feature selection results, it is recommended to pass the archive to as.data.table(). The returned data table is joined with the benchmark result, which adds the mlr3::ResampleResult for each feature set. The archive provides various getters (e.g. $learners()) to ease access. All getters extract by position (i) or unique hash (uhash). For a complete list of all getters see the methods section. The benchmark result ($benchmark_result) allows scoring the feature sets again on a different measure. Alternatively, measures can be supplied to as.data.table().
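The analysis workflow above can be sketched as follows, assuming `instance` is a finished fselect() run (the accuracy measure is an illustrative choice for rescoring):

```r
library(mlr3)

# Archive as a data.table, joined with the benchmark result;
# extra measures can be scored on the fly
adt = as.data.table(instance$archive, measures = msrs("classif.acc"))

# Getters extract by position (i) or unique hash (uhash)
first_learners = instance$archive$learners(i = 1)

# Rescore all evaluated feature sets on a different measure
instance$archive$benchmark_result$score(msr("classif.acc"))
```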
# Feature selection on the Pima Indian Diabetes data set
task = tsk("pima")
learner = lrn("classif.rpart")
# Run feature selection
instance = fselect(
fselector = fs("random_search"),
task = task,
learner = learner,
resampling = rsmp("holdout"),
measures = msr("classif.ce"),
term_evals = 4)
# Subset task to optimized feature set
task$select(instance$result_feature_set)
# Train the learner with optimal feature set on the full data set
learner$train(task)
# Inspect all evaluated configurations
as.data.table(instance$archive)