FeatureSelection: Feature selection for conditional random forests.

View source: R/FeatureSelection.R

FeatureSelection    R Documentation

Feature selection for conditional random forests.

Description

Performs feature selection for a conditional random forest model. Four approaches are available: non-recursive feature elimination (NRFE), recursive feature elimination (RFE), a permutation test with permuted response (Altmann et al., 2010), and a permutation test with permuted predictors (Hapfelmeier and Ulm, 2013).

Usage

FeatureSelection(Y, X, method = 'NRFE', ntree = 1000, measure = NULL,
                 nperm = 30, alpha = 0.05, distrib = 'approx',
                 parallel = FALSE, ...)

Arguments

Y

response vector. Must be of class factor or numeric

X

matrix or data frame containing the predictors

method

method for feature selection. Should be 'NRFE' (non-recursive feature elimination, default), 'RFE' (recursive feature elimination), 'ALT' (permutation of response) or 'HAPF' (permutation of predictors)

ntree

number of trees contained in a forest

measure

name of the measure, taken from the measures package, to use for error and variable importance calculations.

nperm

number of permutations. Only for 'ALT' and 'HAPF' methods.

alpha

alpha level for permutation tests. Only for 'ALT' and 'HAPF' methods.

distrib

the null distribution of the variable importance can be approximated by its asymptotic distribution ("asympt") or via Monte Carlo resampling ("approx", default). Only for 'ALT' and 'HAPF' methods.

parallel

Logical indicating whether or not to run fastvarImp in parallel using a backend provided by the foreach package. Default is FALSE.

...

Further arguments (like positive or negative class) that are needed by the measure.

Details

To be developed soon.
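As an illustration of the response-permutation idea behind the 'ALT' method, here is a minimal base R sketch. It is not the package's implementation: the importance measure (absolute correlation with the response) is a stand-in chosen so the example runs without fitting forests, the simulated data are hypothetical, and the p-value formula is one common Monte Carlo convention.

```r
# Sketch of a permutation test with permuted response.
# ASSUMPTION: absolute Pearson correlation stands in for the
# forest-based variable importance used by the package.
set.seed(1)
n <- 200
X <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
Y <- 2 * X$x1 + rnorm(n)   # only x1 carries signal

# Returns one importance value per predictor (named vector)
importance <- function(y, x) abs(cor(x, y))[, 1]

nperm <- 30
alpha <- 0.05
observed <- importance(Y, X)

# Null distribution: importance recomputed after shuffling the response.
# Result is a matrix with one row per predictor and one column per permutation.
null <- replicate(nperm, importance(sample(Y), X))

# Monte Carlo p-value: share of permuted importances at least as large
# as the observed one (with the usual +1 correction)
pval <- (rowSums(null >= observed) + 1) / (nperm + 1)

# Keep predictors whose p-value does not exceed alpha
selected <- names(pval)[pval <= alpha]
selected
```

With nperm = 30, the smallest attainable p-value under this convention is 1/31, which is why a small nperm combined with a strict alpha can leave nothing selected.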

Value

A list with the following elements:

selection.0se

selected variables with the 0 standard error rule

forest.0se

forest corresponding to the variables selected with the 0 standard error rule

oob.error.0se

out-of-bag (OOB) error of the forest selected with the 0 standard error rule

selection.1se

selected variables with the 1 standard error rule

forest.1se

forest corresponding to the variables selected with the 1 standard error rule

oob.error.1se

out-of-bag (OOB) error of the forest selected with the 1 standard error rule
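The difference between the two rules can be sketched with made-up numbers. The snippet below assumes the common reading of these rules (0 SE: the model with the lowest error; 1 SE: the smallest model whose error lies within one standard error of that minimum). The error values and standard errors are hypothetical, and the package's internal computation may differ.

```r
# Hypothetical OOB errors for nested models with 1 to 5 predictors,
# and a hypothetical standard error for each estimate.
n_vars  <- 1:5
oob_err <- c(0.20, 0.12, 0.10, 0.09, 0.13)
se      <- c(0.03, 0.02, 0.02, 0.02, 0.02)

best <- which.min(oob_err)

# 0 SE rule: the model with the lowest error
size_0se <- n_vars[best]
size_0se   # 4

# 1 SE rule: the smallest model whose error is within
# one standard error of the minimum
threshold <- oob_err[best] + se[best]
size_1se  <- min(n_vars[oob_err <= threshold])
size_1se   # 3
```

The 1 SE rule trades a slightly higher error for a smaller set of predictors, so selection.1se would typically be no larger than selection.0se.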

Note

The code is adapted from Hapfelmeier & Ulm (2013).

Only works for regression and binary classification.

Author(s)

Nicolas Robette

References

B. Gregorutti, B. Michel, and P. Saint Pierre. "Correlation and variable importance in random forests". arXiv:1310.5726, 2017.

A. Hapfelmeier and K. Ulm. "A new variable selection approach using random forests". Computational Statistics and Data Analysis, 60:50–69, 2013.

A. Altmann, L. Toloşi, O. Sander, and T. Lengauer. "Permutation importance: a corrected feature importance measure". Bioinformatics, 26(10):1340-1347, 2010.

Examples

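  library(moreparty)
  data(iris)
  # Recode Species as a binary factor: versicolor vs. the rest
  iris2 <- iris
  iris2$Species <- factor(iris$Species == "versicolor")
  # NRFE (the default method) with accuracy as the measure
  featsel <- FeatureSelection(iris2$Species, iris2[, 1:4], measure = 'ACC', ntree = 200)
  featsel$selection.0se
  featsel$selection.1se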
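The permutation-based methods take the same inputs plus nperm and alpha. The following is a hedged sketch that uses only the arguments listed under Usage. It is guarded so that it runs only when moreparty is installed, and it inspects the returned element names rather than assuming them for this method.

```r
# Runs only when moreparty is installed; otherwise the block is skipped.
if (requireNamespace("moreparty", quietly = TRUE)) {
  library(moreparty)
  data(iris)
  iris2 <- iris
  iris2$Species <- factor(iris$Species == "versicolor")

  # Permutation of the response ('ALT'), with the default
  # number of permutations and alpha level spelled out
  featsel_alt <- FeatureSelection(iris2$Species, iris2[, 1:4],
                                  method = 'ALT', measure = 'ACC',
                                  ntree = 100, nperm = 30, alpha = 0.05)

  # Inspect which elements the returned list contains
  names(featsel_alt)
}
```

Fitting nperm additional forests makes this noticeably slower than the default 'NRFE' call, so a small ntree helps for a first try.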

moreparty documentation built on Nov. 22, 2023, 5:08 p.m.