select.stable: Select a set of stable features based on frequency picked by...

View source: R/select.stable.R

Description

Performs feature selection with guided regularized random forest (GRRF) and repeats the selection multiple times. Features that are selected consistently, as measured by their selection frequency across the repetitions, form the stable set.

Usage

select.stable(X.train, Y.train, coefReg, total=10, cutoff=0.5) 

Arguments

X.train

a data frame or matrix containing the predictors for the training set.

Y.train

response for the training set. If it is a factor, classification is assumed; otherwise regression is assumed. If omitted, the forest runs in unsupervised mode.

coefReg

regularization coefficient(s) passed to RRF; each value ranges between 0 and 1.

total

the number of times to repeat the process.

cutoff

the minimum fraction of repetitions in which a feature must be selected by RRF to be kept; ranges between 0 and 1.
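To make the roles of total and cutoff concrete, here is a minimal base-R sketch of frequency-based thresholding. The feature sets in runs are made up for illustration, and the inclusive comparison (>=) is an assumption; the package may treat the boundary differently.

```r
# Hypothetical feature sets returned by four repeated selection runs
runs <- list(c("X6", "X9", "X10"),
             c("X9", "X10"),
             c("X9", "X10", "X42"),
             c("X8", "X10"))

# Fraction of runs in which each feature was selected
freq <- table(unlist(runs)) / length(runs)

# Keep features whose frequency reaches the cutoff (assumed inclusive here)
cutoff <- 0.5
stable <- names(freq)[as.vector(freq) >= cutoff]
print(stable)
```

With these made-up runs, X10 appears in all four sets and X9 in three of four, so both pass a cutoff of 0.5; the remaining features appear only once and are dropped.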

Value

a stable set of features, that is, the features selected by GRRF in at least the cutoff fraction of the repetitions.

Note

For customized hyperparameter settings, call the RRF function from the RRF package directly and repeatedly in a for loop.
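A rough sketch of such a loop follows. It is not the package's own implementation: the simulated data, the ntree and mtry values, the single shared coefReg, and the inclusive 0.5 threshold are all illustrative assumptions. It relies on the feaSet element of the fitted RRF object, which holds the indices of the features used by the forest.

```r
library(RRF)

set.seed(1)
# Simulated data: only X1 and X2 carry signal (illustrative)
X <- data.frame(matrix(rnorm(100 * 20), nrow = 100))
y <- as.factor(ifelse(X$X1 + X$X2 > 0, 1, 0))

total <- 10      # number of repetitions
coefReg <- 0.8   # one coefficient shared by all features (assumption)
counts <- setNames(numeric(ncol(X)), colnames(X))

for (i in seq_len(total)) {
  # flagReg = 1 enables regularization; ntree and mtry are the
  # customized hyperparameters in this sketch
  fit <- RRF(X, y, flagReg = 1, coefReg = coefReg, ntree = 200, mtry = 5)
  selected <- colnames(X)[fit$feaSet]
  counts[selected] <- counts[selected] + 1
}

# Selection frequency per feature, then threshold (assumed inclusive)
freq <- counts / total
stable <- names(freq)[freq >= 0.5]
print(stable)
```

A per-feature coefReg vector, such as the one derived from importance scores in the Examples section, can be passed in place of the scalar.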

Author(s)

Li Liu, Xin Guan

References

Guan, X., & Liu, L. (2018). Know-GRRF: Domain-Knowledge Informed Biomarker Discovery with Random Forests.

Examples

##---- Example: classification ----
library(randomForest)
library(KnowGRRF)  ## provides select.stable

set.seed(1)
X.train <- data.frame(matrix(rnorm(100*100), nrow=100))
b <- seq(0.1, 2.2, 0.2)
## y is a linear combination of variables X6 to X10
y.train <- b[7]*X.train$X6 + b[8]*X.train$X7 + b[9]*X.train$X8 + b[10]*X.train$X9 + b[11]*X.train$X10
y.train <- ifelse(y.train > 0, 1, 0) ## binarize for classification

## use random forest importance scores to compute the regularization coefficients
imp <- randomForest(X.train, as.factor(y.train))$importance
coefReg <- 0.5 + 0.5*imp/max(imp)

## select a stable set of features that are selected in more than half of the repetitions
select.stable(X.train, as.factor(y.train), coefReg)

KnowGRRF documentation built on May 2, 2019, 6:43 a.m.