View source: R/variable_selection_vita.R
var.sel.vita | R Documentation |
This function calculates p-values based on the empirical null distribution built from non-positive variable importance measures (VIMs), as described in Janitza et al. (2015). Note that this function uses the importance_pvalues function in the R package ranger.
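The null-distribution idea can be sketched in base R. This is a minimal illustration of the approach described in Janitza et al., not the code that ranger or this function actually runs. The helper name `empirical_null_pvalues`, the toy importance values, and the tie handling (strict `<`) are assumptions made for the example.

```r
# Hypothetical helper: p-values from an empirical null distribution that is
# built by mirroring the non-positive importance scores around zero.
empirical_null_pvalues <- function(vim) {
  neg  <- vim[vim < 0]           # negative importances
  zero <- vim[vim == 0]          # importances that are exactly zero
  null <- c(neg, -neg, zero)     # mirrored null distribution
  if (length(null) == 0) stop("no non-positive importance values available")
  # p-value = share of null values that are not smaller than the observed score
  vapply(vim, function(v) 1 - mean(null < v), numeric(1))
}

# Toy importance scores (illustrative values only, not real results)
vim  <- c(v1 = -0.2, v2 = -0.1, v3 = 0, v4 = 0.05, v5 = 0.5, v6 = 1.0)
pval <- empirical_null_pvalues(vim)

# Apply a threshold in the same spirit as p.t = 0.05
selected <- names(pval)[pval < 0.05]
```

Because the null distribution is estimated from the non-positive scores only, the approach needs enough variables with zero or negative importance to be meaningful, which is why it targets high-dimensional data.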
var.sel.vita(
  x,
  y,
  p.t = 0.05,
  ntree = 500,
  mtry.prop = 0.2,
  nodesize.prop = 0.1,
  no.threads = 1,
  method = "ranger",
  type = "regression",
  importance = "impurity_corrected"
)
x |
matrix or data.frame of predictor variables with variables in columns and samples in rows (Note: missing values are not allowed). |
y |
vector with values of phenotype variable (Note: will be converted to factor if classification mode is used). |
p.t |
threshold for p-values (all variables with a p-value of 0 or below p.t will be selected). |
ntree |
number of trees. |
mtry.prop |
proportion of variables that should be used at each split. |
nodesize.prop |
minimal number of samples in terminal nodes, given as a proportion of the total number of samples. |
no.threads |
number of threads used for parallel execution. |
method |
implementation to be used ("ranger"). |
type |
mode of prediction ("regression", "classification" or "probability"). |
importance |
Variable importance mode ('none', 'impurity', 'impurity_corrected' or 'permutation'). Default is 'impurity_corrected'. |
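As a rough illustration of how the two proportion arguments might translate into absolute forest settings, the sketch below converts them into counts. The rounding rule (floor, with a minimum of 1) and the helper name `prop_to_counts` are assumptions for this example; the package source may handle this differently.

```r
# Hypothetical helper: turn proportions into absolute counts.
# Assumes floor() rounding and a lower bound of 1 for both settings.
prop_to_counts <- function(n.samples, n.vars,
                           mtry.prop = 0.2, nodesize.prop = 0.1) {
  stopifnot(mtry.prop > 0, mtry.prop <= 1,
            nodesize.prop > 0, nodesize.prop <= 1)
  list(
    mtry     = max(1, floor(mtry.prop * n.vars)),        # variables tried per split
    nodesize = max(1, floor(nodesize.prop * n.samples))  # minimal terminal node size
  )
}

# 100 samples and 500 predictors with the default proportions
counts <- prop_to_counts(n.samples = 100, n.vars = 500)
# counts$mtry is 100, counts$nodesize is 10
```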
List with the following components:
info: data.frame with information for each variable:
  vim = variable importance (VIM)
  CI_lower = lower confidence interval boundary
  CI_upper = upper confidence interval boundary
  pvalue = empirical p-value
  selected = variable has been selected
var: vector of selected variables
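To make the shape of the return value concrete, the following sketch builds a small stand-in list by hand and extracts the selected variables from it. The numbers are illustrative only, and two details are assumptions rather than documented behaviour: that variable names are stored as row names of `info`, and that `selected` can be coerced to logical.

```r
# Hand-built stand-in for the returned list (values are illustrative only)
info <- data.frame(
  vim      = c(0.90, 0.02, -0.01),
  CI_lower = c(0.70, -0.05, -0.04),
  CI_upper = c(1.10, 0.09, 0.02),
  pvalue   = c(0.00, 0.40, 0.80),
  selected = c(1, 0, 0),
  row.names = c("varA", "varB", "varC")
)
res <- list(info = info, var = rownames(info)[info$selected == 1])

# Rows of the selected variables, ordered by p-value
sel <- res$info[as.logical(res$info$selected), , drop = FALSE]
sel <- sel[order(sel$pvalue), , drop = FALSE]
```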
References: Janitza, S., Celik, E. & Boulesteix, A.-L. (2015). A computationally fast variable importance test for random forests for high-dimensional data. Technical Report 185, University of Munich. https://epub.ub.uni-muenchen.de/25587
Examples:
# simulate toy data set
data = simulation.data.cor(no.samples = 100, group.size = rep(10, 6), no.var.total = 500)

# select variables
res = var.sel.vita(x = data[, -1], y = data[, 1], p.t = 0.05)
res$var