var.sel.vita: Variable selection using Vita approach.
In silkeszy/Pomona: Identification of relevant variables in omics data sets using Random Forests

View source: R/variable_selection_vita.R

var.sel.vita

R Documentation

Variable selection using Vita approach.

Description

This function calculates a p-value for each variable from an empirical null distribution that is built from the non-positive variable importance measures (VIMs), as described in Janitza et al. (2015). The p-values are computed by the importance_pvalues function in the R package ranger.
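The idea behind the empirical null distribution can be illustrated with a few lines of base R. This is a minimal sketch, not the code used by Pomona or ranger: the helper name vita_pvalues and the importance scores are hypothetical, and it assumes the null distribution consists of the negative scores, their mirror images, and the zero scores, with the p-value taken as the share of null values that are at least as large as the observed score.

```r
# Hypothetical helper (not part of Pomona or ranger): empirical p-values
# from a null distribution built out of non-positive importance scores.
vita_pvalues <- function(vim) {
  neg <- vim[vim < 0]
  zero <- vim[vim == 0]
  # Assumed null distribution: negative scores, their mirror images, zeros
  null_dist <- c(neg, -neg, zero)
  if (length(null_dist) == 0) {
    stop("no non-positive importance scores; null distribution is empty")
  }
  # p-value = share of null values that are at least as large as the score
  vapply(vim, function(v) mean(null_dist >= v), numeric(1))
}

# Made-up importance scores for six variables
vim <- c(a = 2.5, b = -0.3, c = 0, d = 0.1, e = -0.1, f = 0.4)
pval <- vita_pvalues(vim)
print(pval)
```

Variables whose importance exceeds every value in the null distribution receive a p-value of 0 in this sketch, which is why the selection rule treats a p-value of 0 as selected.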

Usage

var.sel.vita(
  x,
  y,
  p.t = 0.05,
  ntree = 500,
  mtry.prop = 0.2,
  nodesize.prop = 0.1,
  no.threads = 1,
  method = "ranger",
  type = "regression",
  importance = "impurity_corrected"
)

Arguments

`x`	matrix or data.frame of predictor variables with variables in columns and samples in rows (Note: missing values are not allowed).
`y`	vector with values of phenotype variable (Note: will be converted to factor if classification mode is used).
`p.t`	threshold for p-values (a variable is selected if its p-value is 0 or smaller than p.t).
`ntree`	number of trees.
`mtry.prop`	proportion of variables that should be used at each split.
`nodesize.prop`	minimal number of samples in terminal nodes, given as a proportion of the total number of samples.
`no.threads`	number of threads used for parallel execution.
`method`	implementation to be used ("ranger").
`type`	mode of prediction ("regression", "classification" or "probability").
`importance`	variable importance mode ("none", "impurity", "impurity_corrected" or "permutation"). Default is "impurity_corrected".
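Because mtry.prop and nodesize.prop are proportions, it can help to see which absolute values they imply for a given data set. The sketch below assumes that each proportion is multiplied by the corresponding data dimension and rounded down, with a lower bound of 1; this rounding rule is an assumption made for illustration and may differ from what the package does internally. The dimensions are illustrative.

```r
# Illustrative data dimensions (assumed, not read from a real data set)
n_samples <- 100
n_vars <- 500

# Default proportions from the Usage section
mtry.prop <- 0.2
nodesize.prop <- 0.1

# Assumed conversion: proportion times dimension, rounded down, at least 1
mtry <- max(1, floor(mtry.prop * n_vars))
min_node_size <- max(1, floor(nodesize.prop * n_samples))

cat("variables tried per split:", mtry, "\n")
cat("minimal terminal node size:", min_node_size, "\n")
```

Under these assumptions the defaults correspond to 100 candidate variables per split and terminal nodes with at least 10 samples.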

Value

List with the following components:

`info`	data.frame with information for each variable:
- vim = variable importance (VIM)
- CI_lower = lower confidence interval boundary
- CI_upper = upper confidence interval boundary
- pvalue = empirical p-value
- selected = variable has been selected
`var`	vector of selected variables.
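The relation between the two components can be sketched in base R. The values below are made up, the confidence interval columns are left out for brevity, and storing selected as a logical column is an assumption; the selection rule follows the description of p.t (p-value of 0 or smaller than p.t).

```r
# Threshold for p-values, as in the p.t argument
p.t <- 0.05

# Made-up importance scores and p-values for four variables
info <- data.frame(
  vim = c(2.5, -0.3, 0.1, 0.4),
  pvalue = c(0, 1, 0.4, 0.02),
  row.names = c("var1", "var2", "var3", "var4")
)

# Selection rule: p-value equal to 0 or below the threshold
info$selected <- info$pvalue == 0 | info$pvalue < p.t

# Result in the shape described above: per-variable table plus selected names
res <- list(info = info, var = rownames(info)[info$selected])
print(res$var)
```

With a real result, res$info can be filtered or sorted by pvalue in the same way to inspect variables that narrowly missed the threshold.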

References

Janitza, S., Celik, E. & Boulesteix, A.-L. (2015). A computationally fast variable importance test for random forest for high dimensional data, Technical Report 185, University of Munich, https://epub.ub.uni-muenchen.de/25587.

Examples

# simulate toy data set
data = simulation.data.cor(no.samples = 100, group.size = rep(10, 6), no.var.total = 500)

# select variables
res = var.sel.vita(x = data[, -1], y = data[, 1], p.t = 0.05)
res$var

silkeszy/Pomona documentation built on March 31, 2022, 11:13 p.m.
