sfs: Sequential Forward Selection

View source: R/sfs.R

Description

Applies the Sequential Forward Selection algorithm for Feature Selection.

Usage

sfs(data, method = c("lda", "knn", "rpart"), kvec = 5,
 repet = 10)

Arguments

data

Dataset to be used for feature selection

method

Classifier to be used. Currently only the lda, knn and rpart classifiers are supported.

kvec

Number of neighbors to use for the knn classification

repet

Number of times to repeat the selection.

Details

The best subset of features, T, is initialized as the empty set. At each step, the feature that gives the highest correct classification rate together with the features already in T is added to T. The selection is repeated repet times, and the final "best subset" is constructed from the frequency with which each attribute is selected across those repetitions. Because of the time complexity of the algorithm, its use is not recommended for datasets with a large number of attributes (say, more than 1000).
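The procedure described above can be sketched in a few lines of R. This is an illustrative sketch under stated assumptions, not the dprep source: the helper names cv_rate, forward_once and sfs_sketch are hypothetical, the class label is assumed to be the last column of data, the classifier is fixed to lda from MASS, the rate is estimated by random 10-fold cross-validation, a run stops when no remaining feature improves the rate, and the final subset keeps the most frequently selected features up to the average subset size. None of these details is confirmed by this page.

```r
# Illustrative sketch only; not the dprep implementation.
library(MASS)  # provides lda()

# Hypothetical helper: k-fold cross-validated correct classification rate
# of lda on the feature columns 'feats'. Assumes the class is the last column.
cv_rate <- function(data, feats, folds = 10) {
  n <- nrow(data)
  cls <- factor(data[[ncol(data)]])
  fold_id <- sample(rep(seq_len(folds), length.out = n))  # random fold labels
  correct <- 0
  for (f in seq_len(folds)) {
    test <- fold_id == f
    fit <- lda(x = as.matrix(data[!test, feats, drop = FALSE]),
               grouping = cls[!test])
    pred <- predict(fit, as.matrix(data[test, feats, drop = FALSE]))$class
    correct <- correct + sum(pred == cls[test])
  }
  correct / n
}

# One forward-selection run: start from the empty set and greedily add the
# feature that gives the highest rate together with those already chosen.
# The stopping rule (no improvement) is an assumption.
forward_once <- function(data, folds = 10) {
  p <- ncol(data) - 1
  chosen <- integer(0)
  best <- 0
  repeat {
    cand <- setdiff(seq_len(p), chosen)
    if (length(cand) == 0) break
    rates <- vapply(cand,
                    function(j) cv_rate(data, c(chosen, j), folds),
                    numeric(1))
    if (max(rates) <= best) break
    best <- max(rates)
    chosen <- c(chosen, cand[which.max(rates)])
  }
  chosen
}

# Repeat the selection and build the final subset from selection frequencies.
# Keeping the top features up to the average subset size is an assumption.
sfs_sketch <- function(data, repet = 10) {
  p <- ncol(data) - 1
  runs <- lapply(seq_len(repet), function(i) forward_once(data))
  counts <- tabulate(unlist(runs), nbins = p)  # times each feature was chosen
  avg_size <- round(mean(lengths(runs)))
  order(counts, decreasing = TRUE)[seq_len(avg_size)]
}

set.seed(1)  # folds are random, so fix the seed for reproducibility
res <- sfs_sketch(iris, repet = 3)
res          # column indices of the selected features
```

Each candidate feature costs one cross-validated fit per step, so a single run needs on the order of p^2 fits for p features, which is why the repetitions become expensive on wide datasets.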

Value

bestsubset

Subset of features that have been determined to be relevant, reported as column indices of data (see the example output).

Author(s)

Edgar Acuna

References

Acuna, E. (2003). A comparison of filters and wrappers for feature selection in supervised classification. Proceedings of the Interface 2003 Computing Science and Statistics, Vol. 34.

Examples

#---- Sequential forward selection using the lda classifier----
data(iris)
sfs(iris,method="lda",repet=3)

Example output

The best subset of features is:
[1] 3

dprep documentation built on May 29, 2017, 11:01 a.m.