Performs feature selection with guided regularized random forests (GRRF). The selection is repeated multiple times, and the features whose selection frequency across the repetitions passes the cutoff are returned as a stable set.
select.stable(X.train, Y.train, coefReg, total=10, cutoff=0.5)
X.train
a data frame or matrix containing predictors for the training set.
Y.train
response for the training set. If a factor, classification is assumed, otherwise regression is assumed. If omitted, will run in unsupervised mode.
coefReg
regularization coefficient chosen for RRF; ranges between 0 and 1.
total
the number of times to repeat the selection process.
cutoff
the minimum fraction of the repetitions in which a feature must be selected by RRF to be kept; ranges between 0 and 1.
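The coefReg argument is usually derived from importance scores. The following is a minimal sketch, assuming the weighting shown in the RRF package's own examples, where each coefficient is (1 - gamma) plus gamma times the normalized importance. The helper name `make_coef_reg`, the `gamma` parameter, and the importance values are illustrative and not part of this package.

```r
## Hypothetical helper: map importance scores to per-feature coefReg values.
## gamma controls how strongly the importance scores guide the regularization.
make_coef_reg <- function(imp, gamma = 0.5) {
  stopifnot(gamma >= 0, gamma <= 1, max(imp) > 0)
  imp_norm <- imp / max(imp)        ## scale importance into [0, 1]
  (1 - gamma) + gamma * imp_norm    ## values fall in [1 - gamma, 1]
}

imp <- c(X1 = 2, X2 = 8, X3 = 4)    ## made-up importance scores
coef_reg <- make_coef_reg(imp, gamma = 0.5)
print(coef_reg)
```

With gamma = 0.5 this reduces to the expression 0.5+0.5*imp/max(imp) used in the example.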
Returns a stable set of features selected by GRRF.
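How the selection frequency and the cutoff interact can be sketched in base R. This is an illustration only, not the package's implementation: the helper `stable_features` and the hard-coded run results are hypothetical, and comparing with `>=` rather than `>` is an assumption.

```r
## Hypothetical helper: keep features whose selection frequency reaches the cutoff.
## `selected` is a list with one vector of selected feature indices per run.
stable_features <- function(selected, total = length(selected), cutoff = 0.5) {
  selected <- lapply(selected, unique)       ## count each feature once per run
  counts <- table(unlist(selected))          ## number of runs selecting each feature
  keep <- as.vector(counts) / total >= cutoff
  sort(as.integer(names(counts)[keep]))
}

## Made-up results from four runs
runs <- list(c(6, 7, 9), c(6, 9, 10), c(6, 7, 9, 42), c(6, 9))
stable <- stable_features(runs, cutoff = 0.5)
print(stable)
```

Here features 6 and 9 appear in all four runs and feature 7 in two of four, so all three reach a cutoff of 0.5, while features 10 and 42 appear only once and are dropped.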
To customize hyperparameters, call the RRF function from the RRF package directly and repeatedly in a for loop.
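A loop of that kind might look like the sketch below. It assumes the RRF package is installed and that the fitted object exposes the selected feature indices as `feaSet`; the simulated data, the `ntree` and `mtry` values, the single `coefReg` value, and the `>=` comparison are illustrative choices, not the settings used by select.stable.

```r
library(RRF)

set.seed(1)
## Simulated data: the class label depends on the first two variables only
X <- data.frame(matrix(rnorm(100 * 20), nrow = 100))
y <- as.factor(ifelse(X$X1 + X$X2 > 0, 1, 0))

total <- 10        ## number of repetitions
cutoff <- 0.5      ## minimum selection frequency
coefReg <- 0.8     ## illustrative regularization coefficient
counts <- integer(ncol(X))

for (i in seq_len(total)) {
  ## customized hyperparameters (ntree, mtry) are passed straight to RRF
  fit <- RRF(X, y, flagReg = 1, coefReg = coefReg, ntree = 200, mtry = 5)
  counts[fit$feaSet] <- counts[fit$feaSet] + 1L
}

## indices of features selected in at least `cutoff` of the runs
stable <- which(counts / total >= cutoff)
print(stable)
```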
Li Liu, Xin Guan
Guan, X., & Liu, L. (2018). Know-GRRF: Domain-Knowledge Informed Biomarker Discovery with Random Forests.
##---- Example: classification ----
library(randomForest)
library(KnowGRRF) ##select.stable is assumed to come from the KnowGRRF package
set.seed(1)
X.train<-data.frame(matrix(rnorm(100*100), nrow=100))
b=seq(0.1, 2.2, 0.2)
##y has a linear relationship with variables X6 to X10
y.train=b[7]*X.train$X6+b[8]*X.train$X7+b[9]*X.train$X8+b[10]*X.train$X9+b[11]*X.train$X10
y.train=ifelse(y.train>0, 1, 0) ##classification
##use random forest importance scores to compute the regularization coefficients
imp<-randomForest(X.train, as.factor(y.train))$importance
coefReg=0.5+0.5*imp/max(imp)
##select a stable set of features that are selected more than half of the time
select.stable(X.train, as.factor(y.train), coefReg)