The FSelectInstanceSingleCrit specifies a feature selection problem for FSelectors. The function fsi() creates a FSelectInstanceSingleCrit, and the function fselect() creates an instance internally.
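A sketch of the two entry points. The argument names follow recent mlr3fselect releases; treat the exact fselect() signature as an assumption:

library(mlr3)
library(mlr3fselect)

# Create the instance explicitly with fsi() ...
instance = fsi(
  task = tsk("penguins"),
  learner = lrn("classif.rpart"),
  resampling = rsmp("holdout"),
  terminator = trm("evals", n_evals = 10)
)

# ... or let fselect() create it internally and run the selection in one call
instance = fselect(
  fselector = fs("random_search"),
  task = tsk("penguins"),
  learner = lrn("classif.rpart"),
  resampling = rsmp("holdout"),
  term_evals = 10
)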
The instance contains an ObjectiveFSelect object that encodes the black box objective function a FSelector has to optimize. The instance allows the basic operation of querying the objective at design points ($eval_batch()). This operation is usually done by the FSelector. Evaluations of feature subsets are performed in batches by calling mlr3::benchmark() internally. The evaluated feature subsets are stored in the Archive ($archive). Before a batch is evaluated, the bbotk::Terminator is queried for the remaining budget. If the available budget is exhausted, an exception is raised, and no further evaluations can be performed from this point on. The FSelector is also supposed to store its final result, consisting of a selected feature subset and associated estimated performance values, by calling the method $assign_result().
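For illustration, a minimal sketch of this interaction performed manually instead of through an FSelector; the feature names are those of mlr3's built-in penguins task:

library(mlr3)
library(mlr3fselect)
library(data.table)

instance = fsi(
  task = tsk("penguins"),
  learner = lrn("classif.rpart"),
  resampling = rsmp("holdout"),
  terminator = trm("evals", n_evals = 2)
)

# Query the objective at two design points; one logical column per feature
# indicates whether that feature is included in the subset.
instance$eval_batch(data.table(
  bill_depth = c(TRUE, FALSE),
  bill_length = c(TRUE, TRUE),
  body_mass = c(FALSE, TRUE),
  flipper_length = c(TRUE, FALSE),
  island = c(FALSE, TRUE),
  sex = c(TRUE, FALSE),
  year = c(FALSE, FALSE)
))

# Both evaluations are recorded in the archive; the budget of 2 evaluations
# is now exhausted, so a further eval_batch() call would raise an exception.
as.data.table(instance$archive)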
If no measure is passed, the default measure is used. The default measure depends on the task type.
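For example, the task-type defaults can be inspected with mlr3's default_measures() helper (a sketch; the listed defaults are mlr3's standard choices):

library(mlr3)

# Default measure per task type, as used when no measure is passed
default_measures("classif")  # classification error (classif.ce)
default_measures("regr")     # mean squared error (regr.mse)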
There are several sections about feature selection in the mlr3book.
Getting started with wrapper feature selection.
The gallery features a collection of case studies and demos about optimization.
Utilize the built-in feature importance of models with Recursive Feature Elimination.
Run a feature selection with Shadow Variable Search.
Feature Selection on the Titanic data set.
For analyzing the feature selection results, it is recommended to pass the archive to as.data.table(). The returned data table is joined with the benchmark result, which adds the mlr3::ResampleResult for each feature set.
The archive provides various getters (e.g. $learners()) to ease the access. All getters extract by position (i) or unique hash (uhash). For a complete list of all getters see the methods section.
The benchmark result ($benchmark_result) allows scoring the feature sets again on a different measure. Alternatively, measures can be supplied to as.data.table().
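A short sketch of these analysis options, assuming an already optimized instance:

# All evaluated feature subsets with their performance values
as.data.table(instance$archive)

# Supply additional measures while converting the archive
as.data.table(instance$archive, measures = msrs("classif.acc"))

# Getters extract single objects by position (i) or unique hash (uhash)
instance$archive$learners(i = 1)
instance$archive$resample_result(i = 1)

# Rescore the underlying benchmark result on a different measure
instance$archive$benchmark_result$score(msr("classif.acc"))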
result_feature_set
(character())
Feature set for task subsetting.
Creates a new instance of this R6 class.
FSelectInstanceSingleCrit$new(
  task,
  learner,
  resampling,
  measure,
  terminator,
  store_benchmark_result = TRUE,
  store_models = FALSE,
  check_values = FALSE,
  callbacks = list()
)
task
(mlr3::Task)
Task to operate on.

learner
(mlr3::Learner)
Learner to optimize the feature subset for.

resampling
(mlr3::Resampling)
Resampling that is used to evaluate the performance of the feature subsets. Uninstantiated resamplings are instantiated during construction so that all feature subsets are evaluated on the same data splits. Already instantiated resamplings are kept unchanged.

measure
(mlr3::Measure)
Measure to optimize. If NULL, the default measure is used.

terminator
(bbotk::Terminator)
Stop criterion of the feature selection.

store_benchmark_result
(logical(1))
Store benchmark result in archive?

store_models
(logical(1))
Store models in benchmark result?

check_values
(logical(1))
Check the parameters before the evaluation and the results for validity?

callbacks
(list of CallbackFSelect)
List of callbacks.
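For illustration, a construction via the R6 constructor, equivalent to the fsi() shorthand used elsewhere on this page:

instance = FSelectInstanceSingleCrit$new(
  task = tsk("penguins"),
  learner = lrn("classif.rpart"),
  resampling = rsmp("cv", folds = 3),
  measure = msr("classif.ce"),
  terminator = trm("evals", n_evals = 4)
)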
The FSelector writes the best found feature subset and associated estimated performance value here. For internal use.

FSelectInstanceSingleCrit$assign_result(xdt, y)

xdt
(data.table::data.table())
x values as data.table. Each row is one point. Contains the value in the search space of the FSelectInstanceSingleCrit object. Can contain additional columns for extra information.

y
(numeric(1))
Optimal outcome.
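Although assign_result() is called internally by the FSelector, the stored result is what is read afterwards; a sketch, assuming an optimized instance:

# After fselector$optimize(instance) has finished:
instance$result              # best feature subset with performance value
instance$result_feature_set  # character vector of selected feature names
instance$result_y            # estimated performance of the best subset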
The objects of this class are cloneable with this method.
FSelectInstanceSingleCrit$clone(deep = FALSE)
deep
Whether to make a deep clone.
# Feature selection on Palmer Penguins data set
task = tsk("penguins")
learner = lrn("classif.rpart")

# Construct feature selection instance
instance = fsi(
  task = task,
  learner = learner,
  resampling = rsmp("cv", folds = 3),
  measures = msr("classif.ce"),
  terminator = trm("evals", n_evals = 4)
)

# Choose optimization algorithm
fselector = fs("random_search", batch_size = 2)

# Run feature selection
fselector$optimize(instance)

# Subset task to optimal feature set
task$select(instance$result_feature_set)

# Train the learner with optimal feature set on the full data set
learner$train(task)

# Inspect all evaluated sets
as.data.table(instance$archive)