View source: R/variable_selection_boruta.R
| var.sel.boruta | R Documentation |
Variable selection using the Boruta function in the R package Boruta.
var.sel.boruta(
  x,
  y,
  pValue = 0.01,
  maxRuns = 100,
  ntree = 500,
  mtry.prop = 0.2,
  nodesize.prop = 0.1,
  no.threads = 1,
  method = "ranger",
  type = "regression",
  importance = "impurity_corrected",
  case.weights = NULL
)
x |
matrix or data.frame of predictor variables with variables in columns and samples in rows (Note: missing values are not allowed). |
y |
vector with values of phenotype variable (Note: will be converted to factor if classification mode is used). |
pValue |
p-value threshold (significance level) used by Boruta (default: 0.01, as in the Boruta package). |
maxRuns |
maximal number of importance source runs (default: 100 based on Boruta package) |
ntree |
number of trees. |
mtry.prop |
proportion of variables that should be used at each split. |
nodesize.prop |
minimal number of samples in terminal nodes, given as a proportion of the number of samples. |
no.threads |
number of threads used for parallel execution. |
method |
implementation to be used ("ranger"). |
type |
mode of prediction ("regression", "classification" or "probability"). |
importance |
Variable importance mode ('none', 'impurity', 'impurity_corrected' or 'permutation'). Default is 'impurity_corrected'. |
case.weights |
Weights for sampling of training observations. Observations with larger weights will be selected with higher probability in the bootstrap (or subsampled) samples for the trees. |
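To make the two proportion arguments concrete, the sketch below converts them into absolute values for a hypothetical data set. It assumes, without confirmation from this page, that mtry.prop is multiplied by the number of predictors and nodesize.prop by the number of samples, each rounded down. The variable names and data dimensions are illustrative only.

```r
# Hypothetical data dimensions (illustrative values, not from the package)
n.samples <- 100
n.vars <- 200

# Defaults taken from the usage section
mtry.prop <- 0.2
nodesize.prop <- 0.1

# Assumption: proportions are scaled by the data dimensions and rounded down
mtry <- floor(mtry.prop * n.vars)                  # variables tried at each split
min.node.size <- floor(nodesize.prop * n.samples)  # minimal terminal node size

print(c(mtry = mtry, min.node.size = min.node.size))
```

Under these assumptions the defaults would correspond to 40 candidate variables per split and terminal nodes of at least 10 samples.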
This function selects only the variables that Boruta confirms (decision "Confirmed").
For more details see Boruta.
Note that this function uses the ranger implementation for variable selection.
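As background for the term "shadow variables" used in the return value: Boruta compares each predictor against permuted copies of the predictors, which by construction carry no information about y. The base R sketch below only illustrates how such copies can be built. It is not the Boruta or ranger code, and the toy data and names are made up.

```r
set.seed(42)

# Toy predictors (made-up data)
x <- data.frame(v1 = rnorm(10), v2 = runif(10))

# Shadow copies: each column is permuted independently, which keeps its
# marginal distribution but breaks any association with the outcome
shadow <- as.data.frame(lapply(x, sample))
names(shadow) <- paste0("shadow.", names(x))

# Extended predictor set on which importances would be computed
x.ext <- cbind(x, shadow)
str(x.ext)
```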
List with the following components:
info: data.frame with information on each variable
  run.x = original variable importance (VIM) in run x (includes min, mean and max of VIM of shadow variables)
  decision = Boruta decision (Confirmed, Rejected or Tentative)
  selected = variable has been selected
var: vector of selected variables
info.shadow.var: data.frame with information about the minimal, mean and maximal shadow variables of each run
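Working with the returned list might look like the sketch below. The object res is a hand-built stand-in that uses the documented component names, with a single run column (run.1) and made-up values. It is not real output of var.sel.boruta, and the exact column layout of info is an assumption.

```r
# Hand-built stand-in for the documented return value (made-up values)
info <- data.frame(
  run.1 = c(12.3, 0.4, 5.1),
  decision = c("Confirmed", "Rejected", "Tentative"),
  selected = c(TRUE, FALSE, FALSE),
  row.names = c("var.1", "var.2", "var.3")
)
res <- list(info = info, var = rownames(info)[info$selected])

# Selected variables are those with decision "Confirmed"
res$var

# Variables left undecided by Boruta
tentative <- rownames(res$info)[res$info$decision == "Tentative"]
tentative
```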
Examples

# simulate toy data set
data = simulation.data.cor(no.samples = 100, group.size = rep(10, 6), no.var.total = 200)

# select variables
res = var.sel.boruta(x = data[, -1], y = data[, 1])
res$var