preprocessing_feature_selection: Conducts a feature selection process with one out of five...
In ModelOriented/forester: Quick and Simple Tools for Training and Testing of Tree-Based Models

View source: R/preprocessing_feature_selection.R

preprocessing_feature_selection

R Documentation

Conducts a feature selection process with one of the four proposed methods

Description

`VI` The variable importance method based on random forest: long run time, worst results,
`MCFS` The Monte Carlo Feature Selection method: short run time, reasonable results,
`MI` The varrank method based on mutual information scores: short run time (although a 'max_features' value that is too large can make it run for a very long time), poor results,
`BORUTA` The Boruta algorithm: long run time, best results.

Usage

preprocessing_feature_selection(
  data,
  y,
  feature_selection_method = "BORUTA",
  max_features = "default",
  nperm = 1,
  cutoffPermutations = 20,
  threadsNumber = NULL,
  method = "estevez",
  verbose = FALSE
)

Arguments

`data`	A data source, that is one of the major R formats: data.table, data.frame, matrix and so on.
`y`	A string that indicates a target column name.
`feature_selection_method`	A string value indicating the feature selection method. It must be one of 'VI', 'MCFS', 'MI', or 'BORUTA' (default).
`max_features`	A positive integer describing the desired number of selected features. The initial value is 'default', which resolves to min(10, ncol(data) - 1) for 'VI' and 'MI', and to NULL (the number of relevant features is chosen by the method) for 'MCFS'. Only 'MCFS' can use the NULL value. 'BORUTA' does not use this parameter.
`nperm`	An integer describing the number of permutations performed, relevant for the 'VI' method. By default set to 1.
`cutoffPermutations`	A non-negative integer that determines the number of permutation runs. At least 20 permutations are needed for a statistically significant result. The minimum value of this parameter is 3; however, if it is set to 0, the permutation method is turned off. Relevant for the 'MCFS' method.
`threadsNumber`	A positive integer describing the number of threads to use in computation. More threads require more CPU cores and slightly more memory. It is recommended to set this value to at most the number of available CPU cores. By default set to NULL, which uses the maximal number of cores minus 1. Relevant for the 'MCFS' method.
`method`	A string that indicates which algorithm is used for the 'MI' method. Available options are the default 'estevez', which works well for smaller datasets but can raise errors for bigger ones, and the simpler 'peng'. More details are available in the varrank documentation (?varrank).
`verbose`	A logical value. If TRUE, all information about the preprocessing process is printed; if FALSE, none is.
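
The rules for 'max_features' and 'cutoffPermutations' described above can be sketched in base R. This is only an illustration of the documented behaviour, not the package's actual code: `resolve_max_features` and `valid_cutoff_permutations` are hypothetical helper names, and returning NULL for 'BORUTA' is an assumption (the documentation only says that 'BORUTA' does not use the parameter).

```r
# Hypothetical helpers that mirror the documented rules; they are NOT part of forester.

# Resolve max_features = "default" for a given method.
resolve_max_features <- function(data, method, max_features = "default") {
  if (!identical(max_features, "default")) {
    return(max_features)            # an explicit value is passed through unchanged
  }
  switch(method,
    VI = ,
    MI = min(10, ncol(data) - 1),   # at most 10, capped by the number of non-target columns
    MCFS = NULL                     # MCFS chooses the number of relevant features itself
  )                                 # any other method (e.g. "BORUTA") yields NULL (assumption)
}

# Check cutoffPermutations: 0 turns permutations off, otherwise at least 3.
valid_cutoff_permutations <- function(n) {
  is.numeric(n) && length(n) == 1 && n == as.integer(n) && (n == 0 || n >= 3)
}

resolve_max_features(iris, "VI")    # iris has 5 columns, so min(10, 4)
resolve_max_features(mtcars, "MI")  # mtcars has 11 columns, so min(10, 10)
valid_cutoff_permutations(20)       # the documented default
valid_cutoff_permutations(2)        # below the documented minimum of 3
```

With a wide dataset the 'default' value caps the selection at 10 features for 'VI' and 'MI', so a larger selection has to be requested explicitly.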

Value

A list containing two objects:

`data` A dataset with selected columns,
`idx` The indexes of removed columns.
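
One possible way to reuse the returned list is to drop the same columns from another dataset, for example a test split. The sketch below uses a hand-made stand-in for the return value rather than a real call, and it assumes that `idx` holds integer positions of the removed columns in the original data; that interpretation is not confirmed by this page. `drop_removed` is a hypothetical helper name.

```r
# Stand-in for the documented return value; a real list would come from
# preprocessing_feature_selection(). The values below are made up for illustration.
result <- list(
  data = iris[, c("Petal.Length", "Petal.Width", "Species")],
  idx  = c(1L, 2L)  # assumed: positions of removed columns in the original data
)

# Drop the same columns from another dataset with the original column layout.
drop_removed <- function(new_data, idx) {
  keep <- setdiff(seq_len(ncol(new_data)), idx)  # safe even when idx is empty
  new_data[, keep, drop = FALSE]
}

new_rows <- head(iris, 3)
reduced <- drop_removed(new_rows, result$idx)
names(reduced)
```

Using `setdiff` instead of negative indexing avoids the case where an empty `idx` would otherwise select zero columns.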

ModelOriented/forester documentation built on June 6, 2024, 7:29 a.m.
