The AutoFSelector wraps a mlr3::Learner and augments it with automatic feature selection.
The auto_fselector() function creates an AutoFSelector object.
The AutoFSelector is a mlr3::Learner which wraps another mlr3::Learner and performs the following steps during training:
The wrapped (inner) learner is trained on the feature subsets via resampling. The feature selection can be specified by providing a FSelector, a bbotk::Terminator, a mlr3::Resampling and a mlr3::Measure.
A final model is fit on the complete training data with the best-found feature subset.
During $predict(), the AutoFSelector simply calls the predict method of the wrapped (inner) learner.
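The training steps can be approximated by hand. The following is a minimal sketch, not the internal implementation: it assumes the fselect() helper from mlr3fselect with the same argument names used by auto_fselector() in the examples, and the variable names (instance, task_final) are illustrative only.

```r
library(mlr3fselect)

task = tsk("penguins")
learner = lrn("classif.rpart")

# Steps 1 and 2: evaluate feature subsets with the inner resampling
instance = fselect(
  fselector = fs("random_search"),
  task = task,
  learner = learner,
  resampling = rsmp("holdout"),
  measures = msr("classif.ce"),
  term_evals = 4)

# Step 3: fit a final model on all training data, restricted to the
# best feature subset found (clone the task to leave the original intact)
task_final = task$clone()
task_final$select(instance$result_feature_set)
learner$train(task_final)
```

The AutoFSelector bundles these steps behind a single $train() call, which is what makes it usable inside resample() for nested resampling.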
There are several sections about feature selection in the mlr3book.
Estimate Model Performance with nested resampling (Tuning workflow is transferable to feature selection).
Automate the feature selection.
The gallery features a collection of case studies and demos about optimization.
Nested resampling can be performed by passing an AutoFSelector object to mlr3::resample() (see examples).
To access the inner resampling results, set store_fselect_instance = TRUE and execute mlr3::resample() with store_models = TRUE (see examples).
The mlr3::Resampling passed to the AutoFSelector is meant to be the inner resampling, operating on the training set of an arbitrary outer resampling.
For this reason it is not feasible to pass an instantiated mlr3::Resampling here.
All arguments from construction to create the FSelectInstanceSingleCrit.
Returns the FSelectInstanceSingleCrit archive.
Internally created feature selection instance with all intermediate results.
$result from FSelectInstanceSingleCrit.
Stores the currently active predict type, e.g.
Must be an element of
Hash (unique identifier) for this object.
Creates a new instance of this R6 class.
AutoFSelector$new(
  fselector,
  learner,
  resampling,
  measure = NULL,
  terminator,
  store_fselect_instance = TRUE,
  store_benchmark_result = TRUE,
  store_models = FALSE,
  check_values = FALSE,
  callbacks = list()
)
Learner to optimize the feature subset for.
Resampling that is used to evaluate the performance of the feature subsets. Uninstantiated resamplings are instantiated during construction so that all feature subsets are evaluated on the same data splits. Already instantiated resamplings are kept unchanged.
Measure to optimize. If NULL, the default measure is used.
Stop criterion of the feature selection.
If TRUE (default), stores the internally created FSelectInstanceSingleCrit with all intermediate results in slot $fselect_instance. Is set to TRUE if store_models = TRUE.
Store benchmark result in archive?
Store models in benchmark result?
Check the parameters before the evaluation and the results for validity?
(list of CallbackFSelect)
List of callbacks.
Extracts the base learner from nested learner objects like GraphLearner in mlr3pipelines.
If recursive = 0, the (tuned) learner is returned.
AutoFSelector$base_learner(recursive = Inf)
Depth of recursion for multiple nested objects.
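As a brief illustration of $base_learner(), the sketch below builds an AutoFSelector around a plain (non-nested) learner, reusing the setup from the examples. The variable name base is illustrative only; with no further nesting, the call is expected to return the wrapped learner itself.

```r
library(mlr3fselect)

afs = auto_fselector(
  fselector = fs("random_search"),
  learner = lrn("classif.rpart"),
  resampling = rsmp("holdout"),
  measure = msr("classif.ce"),
  term_evals = 4)

# unwrap down to the innermost learner (default: recursive = Inf)
base = afs$base_learner()
base$id
```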
The importance scores of the final model.
The selected features of the final model. These features are selected internally by the learner.
The out-of-bag error of the final model.
The log-likelihood of the final model.
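These accessors forward to the final model, so they are only usable when the wrapped learner supports the corresponding property. The sketch below assumes classif.rpart, which supports importance scores and selected features; the out-of-bag error and log-likelihood accessors are not shown because they require a learner that provides them.

```r
library(mlr3fselect)

task = tsk("penguins")
afs = auto_fselector(
  fselector = fs("random_search"),
  learner = lrn("classif.rpart"),
  resampling = rsmp("holdout"),
  measure = msr("classif.ce"),
  term_evals = 4)

# run the feature selection and fit the final model
afs$train(task)

# importance scores of the final model (named numeric vector)
imp = afs$importance()

# features actually used by the final model
sel = afs$selected_features()
```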
The objects of this class are cloneable with this method.
AutoFSelector$clone(deep = FALSE)
Whether to make a deep clone.
library(mlr3fselect)

# Automatic Feature Selection

# split to train and external set
task = tsk("penguins")
split = partition(task, ratio = 0.8)

# create auto fselector
afs = auto_fselector(
  fselector = fs("random_search"),
  learner = lrn("classif.rpart"),
  resampling = rsmp("holdout"),
  measure = msr("classif.ce"),
  term_evals = 4)

# optimize feature subset and fit final model
afs$train(task, row_ids = split$train)

# predict with final model
afs$predict(task, row_ids = split$test)

# show result
afs$fselect_result

# model slot contains trained learner and fselect instance
afs$model

# shortcut trained learner
afs$learner

# shortcut fselect instance
afs$fselect_instance

# Nested Resampling

afs = auto_fselector(
  fselector = fs("random_search"),
  learner = lrn("classif.rpart"),
  resampling = rsmp("holdout"),
  measure = msr("classif.ce"),
  term_evals = 4)

resampling_outer = rsmp("cv", folds = 3)
rr = resample(task, afs, resampling_outer, store_models = TRUE)

# retrieve inner feature selection results
extract_inner_fselect_results(rr)

# performance scores estimated on the outer resampling
rr$score()

# unbiased performance of the final model trained on the full data set
rr$aggregate()