Selects features using a regularized random forest (RRF) model, then builds random forest (RF) models with and without the selected features and compares their performance on an independent test set.
rrf.once(X.train, Y.train, X.test, Y.test, coefReg)
X.train | a data frame or matrix (like x) containing predictors for the training set.
Y.train | response for the training set. If a factor, classification is assumed, otherwise regression is assumed. If omitted, will run in unsupervised mode.
X.test | a data frame or matrix (like x) containing predictors for the test set.
Y.test | response for the test set.
coefReg | regularization coefficient chosen for RRF, ranges between 0 and 1.
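As a minimal sketch of how a coefReg value in the required 0 to 1 range can be obtained, the snippet below scales a vector of importance scores by its maximum, which is the same normalization used in the examples on this page. The importance values and feature names are hypothetical and chosen only for illustration. In the RRF package, a smaller coefficient penalizes a feature more strongly, so the most important feature (coefficient 1) is not penalized here.

```r
# Hypothetical importance scores for five features (illustrative values only)
imp <- c(f1 = 12.0, f2 = 3.0, f3 = 0.6, f4 = 6.0, f5 = 0.0)

# Scale by the maximum so every coefficient lies in [0, 1]
coefReg <- imp / max(imp)

# Sanity check on the range expected by the coefReg argument
stopifnot(all(coefReg >= 0), all(coefReg <= 1))
print(coefReg)
```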
Returns a list containing:
perf | the number of features selected by RRF, the performance (AUC for classification, MSE for regression) of the RF model using all features, and the performance (AUC or MSE) of the RF model using only the selected features
FullModel | RF model built with all features
ReducedModel | RF model built with only the selected features
featureIndex | indices of the features selected by RRF
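To make the two performance measures concrete, here is a base R sketch of MSE and AUC. It is not the package's own implementation, which is not shown on this page. The helper names mse and auc are hypothetical, and the AUC helper assumes binary labels coded as 0 and 1 and uses the rank-sum (Mann-Whitney) identity.

```r
# Mean squared error between observed and predicted responses (regression)
mse <- function(actual, predicted) {
  mean((actual - predicted)^2)
}

# Area under the ROC curve via the rank-sum identity (binary classification).
# Assumes labels are 0/1 (numeric or factor); ties in scores get average ranks.
auc <- function(labels, scores) {
  labels <- as.integer(as.character(labels))
  r <- rank(scores)
  n_pos <- sum(labels == 1)
  n_neg <- sum(labels == 0)
  (sum(r[labels == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Small illustrative inputs
mse_value <- mse(c(1, 2, 3), c(1, 2, 5))
auc_value <- auc(factor(c(0, 0, 1, 1)), c(0.1, 0.4, 0.35, 0.8))
print(c(mse = mse_value, auc = auc_value))
```

Lower MSE is better, while higher AUC is better, so the direction of comparison between the full and reduced models depends on the task.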
Li Liu, Xin Guan
Guan, X., & Liu, L. (2018). Know-GRRF: Domain-Knowledge Informed Biomarker Discovery with Random Forests.
##---- Example: regression ----
library(randomForest)
library(KnowGRRF)  ## provides rrf.once
set.seed(1)
X <- data.frame(matrix(rnorm(100*100), nrow=100))
b <- seq(0.1, 2.2, 0.2)
## y has a linear relationship with variables X3 through X10
y <- b[4]*X$X3+b[5]*X$X4+b[6]*X$X5+b[7]*X$X6+b[8]*X$X7+b[9]*X$X8+b[10]*X$X9+b[11]*X$X10
## split into training (rows 1-70) and test (rows 71-100) sets
X.train <- X[1:70,]
X.test <- X[71:100,]
y.train <- y[1:70]
y.test <- y[71:100]
## use random forest importance scores to derive regularization coefficients
imp <- randomForest(X.train, y.train)$importance
coefReg <- imp/max(imp)  ## scale to the 0-1 range
rrf.once(X.train, y.train, X.test, y.test, coefReg)
##---- Example: classification ----
y <- as.factor(ifelse(y>0, 1, 0))  ## binarize the response for classification
y.train <- y[1:70]
y.test <- y[71:100]
## use random forest importance scores to derive regularization coefficients
imp <- randomForest(X.train, y.train)$importance
coefReg <- imp/max(imp)  ## scale to the 0-1 range
rrf.once(X.train, y.train, X.test, y.test, coefReg)