In mlr-org/mlr3fselect: Feature Selection for 'mlr3'

lgr::get_logger("mlr3")$set_threshold("warn")
lgr::get_logger("bbotk")$set_threshold("warn")
set.seed(0)
options(
  datatable.print.nrows = 10,
  datatable.print.class = FALSE,
  datatable.print.keys = FALSE,
  datatable.print.trunc.cols = TRUE,
  width = 100)
# mute load messages
library("mlr3fselect")

mlr3fselect

Package website: release | dev

mlr3fselect is the feature selection package of the mlr3 ecosystem. It selects the optimal feature set for any mlr3 learner. The package works with several optimization algorithms, e.g., Random Search, Recursive Feature Elimination, and Genetic Search. Moreover, it can automatically optimize learners and estimate the performance of optimized feature sets with nested resampling. The package is built on the optimization framework bbotk.
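As a rough sketch of the nested resampling idea mentioned above, a learner can be wrapped with auto_fselector() so that feature selection runs inside each outer resampling fold. The choice of classif.rpart, the holdout inner resampling, and the budget of 10 evaluations are assumptions made here to keep the example small; they are not recommendations from the package authors.

```r
library("mlr3")
library("mlr3fselect")

# Wrap a learner so that feature selection happens during training.
# Learner, inner resampling and budget are illustrative assumptions.
afs = auto_fselector(
  fselector = fs("random_search"),
  learner = lrn("classif.rpart"),
  resampling = rsmp("holdout"),
  measure = msr("classif.ce"),
  term_evals = 10
)

# The outer resampling estimates the performance of the whole procedure,
# i.e. feature selection plus model fitting.
rr = resample(tsk("spam"), afs, rsmp("cv", folds = 3))
score = rr$aggregate(msr("classif.ce"))
score
```

The outer folds never see the data used to choose the features, so the aggregated score is an estimate for the complete selection procedure rather than for a single fixed feature set.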

Resources

There are several sections about feature selection in the mlr3book.

Getting started with wrapper feature selection.
Do a sequential forward selection on the Palmer Penguins data set.
Optimize multiple performance measures.
Estimate model performance with nested resampling.

The gallery features a collection of case studies and demos about optimization.

Utilize the built-in feature importance of models with Recursive Feature Elimination.
Run a feature selection with Shadow Variable Search.

The cheatsheet summarizes the most important functions of mlr3fselect.

Installation

Install the latest release from CRAN:

install.packages("mlr3fselect")

Install the development version from GitHub:

remotes::install_github("mlr-org/mlr3fselect")

Example

We run a feature selection for a support vector machine on the Spam data set.

library("mlr3verse")

tsk("spam")

We construct an instance with the fsi() function. The instance describes the optimization problem.

instance = fsi(
  task = tsk("spam"),
  learner = lrn("classif.svm", type = "C-classification"),
  resampling = rsmp("cv", folds = 3),
  measures = msr("classif.ce"),
  terminator = trm("evals", n_evals = 20)
)
instance

We select a simple random search as the optimization algorithm.

fselector = fs("random_search", batch_size = 5)
fselector

To start the feature selection, we simply pass the instance to the fselector.

fselector$optimize(instance)

The fselector writes the best feature set to the instance.

instance$result_feature_set

The instance also stores the corresponding measured performance.

instance$result_y

The archive contains all evaluated feature sets.

as.data.table(instance$archive)
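Because the archive converts to a data.table, it can be sorted and filtered like any other table. The following is a minimal sketch under a few assumptions: classif.rpart replaces the support vector machine to avoid an extra dependency, the budget is reduced to 10 evaluations, and the archive is assumed to expose the measure column classif.ce and the bookkeeping column batch_nr.

```r
library("mlr3")
library("mlr3fselect")
library("data.table")

# Small stand-in for the instance above (learner and budget are assumptions).
instance = fsi(
  task = tsk("spam"),
  learner = lrn("classif.rpart"),
  resampling = rsmp("cv", folds = 3),
  measures = msr("classif.ce"),
  terminator = trm("evals", n_evals = 10)
)
fs("random_search", batch_size = 5)$optimize(instance)

archive = as.data.table(instance$archive)

# Order the evaluated feature sets by classification error, lowest first.
ranked = archive[order(classif.ce)]
head(ranked[, .(classif.ce, batch_nr)], 3)
```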

We fit a final model with the optimized feature set to make predictions on new data.

task = tsk("spam")
learner = lrn("classif.svm", type = "C-classification")

task$select(instance$result_feature_set)
learner$train(task)
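To illustrate the prediction step, the sketch below holds out part of the task to play the role of new data. The feature subset is hypothetical and stands in for instance$result_feature_set, and classif.rpart is used instead of the support vector machine to keep the example free of extra dependencies.

```r
library("mlr3")

# Hypothetical feature subset; in practice use instance$result_feature_set.
selected = c("charExclamation", "charDollar", "remove", "free", "capitalAve")

task = tsk("spam")
task$select(selected)

# Hold out a third of the rows to act as unseen data.
split = partition(task, ratio = 2 / 3)

learner = lrn("classif.rpart")
learner$train(task, row_ids = split$train)

# Predict on the held-out rows and score the predictions.
prediction = learner$predict(task, row_ids = split$test)
prediction$score(msr("classif.ce"))
```

For data that arrives later as a plain data.frame with the same feature columns, learner$predict_newdata() can be used in place of learner$predict().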

mlr-org/mlr3fselect documentation built on July 5, 2025, 3:22 a.m.
