View source: R/variable_selection_r2vim.R
var.sel.r2vim | R Documentation |
Generates several random forests using all variables and different random number seeds. For each run, the importance score of every variable is divided by the absolute value of the minimal importance score observed in that run, which yields relative importance scores. A variable is selected if its minimal relative importance score over all runs is >= factor.
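The selection rule described above can be sketched in base R on a small, made-up importance matrix. This is only an illustration of the arithmetic, not the package's code: the matrix `vim`, its values, and the name `threshold` (standing in for the `factor` argument) are assumptions.

```r
# Toy importance matrix: rows = variables, columns = runs (values are invented).
vim <- cbind(
  run.1 = c(v1 = 5.0, v2 = 0.4, v3 = -0.5),
  run.2 = c(v1 = 6.0, v2 = 1.5, v3 = -1.0)
)

# Divide each run's scores by the absolute value of that run's minimal score.
rel.vim <- sweep(vim, 2, abs(apply(vim, 2, min)), "/")

# Minimal relative importance of each variable over all runs.
rel.vim.min <- apply(rel.vim, 1, min)

# Keep variables whose minimal relative importance reaches the threshold.
threshold <- 1  # hypothetical name; plays the role of `factor`
selected <- names(rel.vim.min)[rel.vim.min >= threshold]
```

In this toy case only `v1` clears the threshold in every run; `v2` exceeds it in the second run but not in the first, so it is dropped.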
var.sel.r2vim(
  x,
  y,
  no.runs = 10,
  factor = 1,
  ntree = 500,
  mtry.prop = 0.2,
  nodesize.prop = 0.1,
  no.threads = 1,
  method = "ranger",
  type = "regression",
  importance = "impurity_corrected",
  case.weights = NULL
)
x |
matrix or data.frame of predictor variables with variables in columns and samples in rows (Note: missing values are not allowed). |
y |
vector with values of phenotype variable (Note: will be converted to factor if classification mode is used). |
no.runs |
number of random forests to be generated. |
factor |
minimal relative importance score for a variable to be selected. |
ntree |
number of trees. |
mtry.prop |
proportion of variables that should be used at each split. |
nodesize.prop |
minimal number of samples in terminal nodes, given as a proportion of the number of samples. |
no.threads |
number of threads used for parallel execution. |
method |
implementation to be used ("ranger"). |
type |
mode of prediction ("regression", "classification" or "probability"). |
importance |
variable importance mode ("none", "impurity", "impurity_corrected" or "permutation"). Default is "impurity_corrected". |
case.weights |
weights for sampling of training observations. Observations with larger weights are selected with higher probability in the bootstrap (or subsampled) samples for the trees. |
Note: This function is a reimplementation of the R package RFVarSelGWAS.
List with the following components:
info
data.frame with information for each variable:
vim.run.x = original variable importance (VIM) in run x
rel.vim.run.x = relative VIM in run x
rel.vim.min = minimal relative VIM over all runs
rel.vim.med = median relative VIM over all runs
selected = variable has been selected
var
vector of selected variables
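As a rough illustration of how a result with this layout can be inspected, the sketch below builds a mock list by hand and ranks the variables by `rel.vim.min`. The object is not output of var.sel.r2vim: all values, the variable names, and the 0/1 coding of `selected` are assumptions made for the example.

```r
# Mock result mimicking the documented layout for two runs; all numbers are invented.
res <- list(
  info = data.frame(
    vim.run.1     = c(5.0, 0.4, -0.5),
    vim.run.2     = c(6.0, 1.5, -1.0),
    rel.vim.run.1 = c(10, 0.8, -1),
    rel.vim.run.2 = c(6, 1.5, -1),
    rel.vim.min   = c(6, 0.8, -1),
    rel.vim.med   = c(8, 1.15, -1),
    selected      = c(1, 0, 0),  # assumed 0/1 coding, not confirmed by the documentation
    row.names     = c("v1", "v2", "v3")
  ),
  var = "v1"
)

# Rank variables by their minimal relative importance, best first.
ranked <- res$info[order(res$info$rel.vim.min, decreasing = TRUE), ]

# Name of the top-ranked variable.
top <- rownames(ranked)[1]
```

Sorting by `rel.vim.min` shows how far each variable lies from the threshold, which can help when deciding whether a different value of `factor` would change the selection.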
@examples
# simulate toy data set
data = simulation.data.cor(no.samples = 100, group.size = rep(10, 6), no.var.total = 200)

# select variables
res = var.sel.r2vim(x = data[, -1], y = data[, 1], no.runs = 5, factor = 1)
res$var