rrf.once: Feature selection by regularized random forest and comparison of model performance


View source: R/rrf.once.R

Description

Select features using a regularized random forest (RRF) model. Build random forest models both with and without feature selection, and compare their performance on an independent test set.

Usage

rrf.once(X.train, Y.train, X.test, Y.test, coefReg)

Arguments

X.train

a data frame or matrix of predictors for the training set.

Y.train

response for the training set. If a factor, classification is assumed; otherwise, regression is assumed. If omitted, the model runs in unsupervised mode.

X.test

a data frame or matrix of predictors for the test set.

Y.test

response for the test set.

coefReg

regularization coefficient(s) used by RRF; values range between 0 and 1. A vector with one coefficient per predictor can be supplied, as in the Examples (see also the sketch below).
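
One common way to construct coefReg, also used in the Examples below, is to rescale ordinary random forest importance scores into the (0, 1] range. The following is a minimal sketch, assuming the randomForest package is installed and X.train and Y.train are defined as above.

library(randomForest)
imp <- randomForest(X.train, Y.train)$importance  ## variable importance from an ordinary RF
coefReg <- imp / max(imp)                         ## one coefficient per predictor, scaled to (0, 1]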

Value

Returns a list with the following components:

perf

the number of features selected by RRF, the performance (AUC for classification, MSE for regression) of the RF model using all features, and the performance of the RF model using only the selected features.

FullModel

RF model built with all features

ReducedModel

RF model built with only selected features

featureIndex

indices of the features selected by RRF.
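
A minimal sketch of accessing these components, assuming the result of the call shown in Usage is stored in a variable named res:

res <- rrf.once(X.train, Y.train, X.test, Y.test, coefReg)
res$perf          ## number of selected features and test-set performance (AUC or MSE)
res$FullModel     ## RF model fitted on all features
res$ReducedModel  ## RF model fitted on the RRF-selected features only
res$featureIndex  ## indices of the features selected by RRF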

Author(s)

Li Liu, Xin Guan

References

Guan, X., & Liu, L. (2018). Know-GRRF: Domain-Knowledge Informed Biomarker Discovery with Random Forests.

Examples

##---- Example: regression ----
library(KnowGRRF)
library(randomForest)

set.seed(1)
X <- data.frame(matrix(rnorm(100*100), nrow=100))
b <- seq(0.1, 2.2, 0.2)
## y has a linear relationship with variables X3 through X10
y <- b[4]*X$X3 + b[5]*X$X4 + b[6]*X$X5 + b[7]*X$X6 + b[8]*X$X7 + b[9]*X$X8 + b[10]*X$X9 + b[11]*X$X10


## split into training and test sets
X.train <- X[1:70, ]
X.test  <- X[71:100, ]
y.train <- y[1:70]
y.test  <- y[71:100]

## derive regularization coefficients from RF variable importance
imp <- randomForest(X.train, y.train)$importance
coefReg <- imp / max(imp)

rrf.once(X.train, y.train, X.test, y.test, coefReg)

##---- Example: classification ----
y <- as.factor(ifelse(y > 0, 1, 0))  ## binarize the response for classification
y.train <- y[1:70]
y.test  <- y[71:100]

## derive regularization coefficients from RF variable importance
imp <- randomForest(X.train, y.train)$importance
coefReg <- imp / max(imp)

rrf.once(X.train, y.train, X.test, y.test, coefReg)
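
## As a follow-up to the examples above (not part of the original code), the selected
## predictors can be mapped back to their column names; this sketch assumes featureIndex
## holds numeric column indices of X.train.
res <- rrf.once(X.train, y.train, X.test, y.test, coefReg)
res$perf                             ## performance with and without feature selection
colnames(X.train)[res$featureIndex]  ## names of the RRF-selected predictors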

