covsurf: Combining ClustOfVar and VSURF


View source: R/CoVVSURF.R

Description

This function selects groups of informative input variables to predict a response variable.

Usage

covsurf(X, y, kval = 2:ncol(X), tree = NULL, nse = 1, ncores = 1, ...)

Arguments

X

data frame of input variables

y

vector of responses

kval

vector of candidate numbers of clusters (groups of variables) to try

tree

optional tree given by hclustvar

nse

number of standard deviations to add to the minimum OOB error rate when selecting the number of groups

ncores

number of cores to use for parallel computing

...

further arguments passed to VSURF
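To make the roles of kval and tree more concrete, here is a minimal base R sketch of cutting a dendrogram of variables into k groups and summarising one partition by one variable per group. It is only an analogue: it assumes quantitative variables, uses stats::hclust on 1 minus the squared correlation instead of hclustvar, and takes the first principal component of each group as a stand-in for a synthetic variable. The simulated data and all object names are hypothetical.

```r
# Simulated data (hypothetical): two pairs of strongly correlated variables
set.seed(1)
n <- 100
z1 <- rnorm(n)
z2 <- rnorm(n)
X <- data.frame(a1 = z1 + rnorm(n, sd = 0.1), a2 = z1 + rnorm(n, sd = 0.1),
                b1 = z2 + rnorm(n, sd = 0.1), b2 = z2 + rnorm(n, sd = 0.1))

# Dissimilarity between variables: 1 - squared correlation
# (a base R stand-in, not the criterion used by hclustvar)
d <- as.dist(1 - cor(X)^2)
tree <- hclust(d, method = "average")

# One partition of the variables per candidate number of groups
kval <- 2:3
partitions <- lapply(kval, function(k) cutree(tree, k = k))
groups <- partitions[[1]]  # partition into 2 groups

# One summary variable per group: first principal component of the scaled group
synth <- sapply(sort(unique(groups)), function(g) {
  prcomp(X[, groups == g, drop = FALSE], scale. = TRUE)$x[, 1]
})
```

In this toy setting, groups places a1 with a2 and b1 with b2, and synth has one column per group.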

Value

kopt

the optimal number of groups of variables

ptree

the partition into kopt clusters obtained from the CoV dendrogram.

vsurf_ptree

VSURF applied to the kopt synthetic variables.

vsel

synthetic variables selected by VSURF.

csel

groups of variables selected by VSURF.

rfsel

RF applied to the selected synthetic variables.

rfclust

RF applied to all the synthetic variables.

oob

a matrix with mean OOB error (first column) and OOB standard deviation (second column).
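The link between oob, nse and kopt can be illustrated with a small base R sketch. It assumes a rule in the spirit of the "one standard error" rule: take the smallest candidate k whose mean OOB error is at most the minimum mean error plus nse times the standard deviation at that minimum. This is one reading of the nse description, not a transcription of the package code, and the numbers below are made up for illustration.

```r
# Toy OOB summary for five candidate values of k
# (made-up numbers, not output of covsurf)
kval <- 2:6
oob <- cbind(mean = c(0.30, 0.22, 0.20, 0.21, 0.25),
             sd   = c(0.02, 0.02, 0.03, 0.02, 0.02))
nse <- 1

# Assumed rule: threshold = minimum mean error + nse * its standard deviation
i_min <- which.min(oob[, "mean"])
threshold <- oob[i_min, "mean"] + nse * oob[i_min, "sd"]

# Smallest candidate k whose mean error does not exceed the threshold
kopt_sketch <- kval[min(which(oob[, "mean"] <= threshold))]
```

Under this assumed rule, a larger nse tolerates more error and so tends to favour fewer groups, while nse = 0 simply picks the k with the lowest mean OOB error.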

References

Combining clustering of variables and feature selection using random forests: the CoV/VSURF procedure, Marie Chavent, Robin Genuer, Jerome Saracco, hal-01345840

Examples

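library(CoVVSURF)
# Load the example data; the calls below use the predictors X and the response y
data(don60)
# Candidate numbers of groups: 2 to 15, then 20, 30, ... up to ncol(X)
kval <- c(2:15, seq(from = 20, to = ncol(X), by = 10))
don60covs <- covsurf(X, y, kval)
plot(don60covs)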

robingenuer/CoVVSURF documentation built on May 27, 2019, 11:38 a.m.