aucMCV: AUC multiple cross-validation

Description

This function implements the AUCRF algorithm to identify the variables (metabolites) most relevant to a binary classification task, and repeats the cross-validation of the AUCRF process several times.

Usage

aucMCV(data, seed = 1234, ref_level = levels(data[, 2])[1],
  auc_rank = "MDG", auc_ntree = 500, auc_nfolds = 5, auc_pdel = 0.2,
  auc_colour = "grey", auc_iterations = 5)

Arguments

data

an n x p data frame used to run the AUCRF algorithm and to perform a repeated cross-validation of the AUCRF process. The dependent variable must be a binary variable defined as a factor and coded as 0 for negatives (e.g. controls) and 1 for positives (e.g. cases)
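As a minimal sketch of the coding described above, the following base R snippet recodes a text outcome as a 0/1 factor. The data frame `toy` and its column names are hypothetical, and placing the class in the second column is an assumption inferred from the default `ref_level = levels(data[, 2])[1]` shown under Usage.

```r
# Hypothetical toy data: 6 samples and 2 made-up metabolite columns.
# Assumption: the class label sits in column 2, as the default of
# ref_level in the Usage section suggests.
toy <- data.frame(
  sample_id = paste0("s", 1:6),
  class = c("control", "case", "control", "case", "case", "control"),
  met_a = c(1.2, 3.4, 0.9, 2.8, 3.1, 1.0),
  met_b = c(10, 14, 11, 15, 13, 9),
  stringsAsFactors = FALSE
)

# Recode the outcome as a factor: 0 for negatives (controls), 1 for positives (cases)
toy$class <- factor(ifelse(toy$class == "case", 1, 0), levels = c(0, 1))

levels(toy$class)  # first level is the negative class
table(toy$class)   # class counts
```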

seed

a numeric value to set the seed of R's random number generator

ref_level

the class assumed as reference for the binary classification

auc_rank

the importance measure provided by randomForest for ranking the variables. There are two options: "MDG" (mean decrease in Gini, the default) and "MDA" (mean decrease in accuracy)

auc_ntree

the number of trees in each random forest model

auc_nfolds

the number of folds in cross validation. By default a 5-fold cross validation is performed

auc_pdel

the fraction of the remaining variables to be removed at each step. If auc_pdel = 0, only one variable is removed at each step
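To give a feel for how auc_pdel shapes the elimination, here is a rough sketch of the sequence of variable-set sizes. The helper `elimination_sizes` is hypothetical, and its rounding rule (drop `round(pdel * k)` of the `k` remaining variables, but at least one) is an assumption made for illustration; the exact rule used inside AUCRF may differ.

```r
# Hypothetical helper: sizes of the variable set along a backward elimination.
# Assumption: each step drops round(pdel * k) of the k remaining variables,
# and never fewer than one, so pdel = 0 removes exactly one variable per step.
elimination_sizes <- function(p, pdel = 0.2, min_vars = 1) {
  sizes <- p
  k <- p
  while (k > min_vars) {
    k <- max(min_vars, k - max(1, round(pdel * k)))
    sizes <- c(sizes, k)
  }
  sizes
}

elimination_sizes(20, pdel = 0.2)  # shrinks quickly at first, then one at a time
elimination_sizes(5, pdel = 0)     # one variable per step: 5 4 3 2 1
```

Under this assumption, a larger auc_pdel means fewer random forests are fitted, at the cost of a coarser search over model sizes.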

auc_colour

the color chosen

auc_iterations

a numeric value giving the number of times the cross-validation is repeated
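The interplay of seed, auc_nfolds and auc_iterations can be pictured with a small base R sketch. The helper `repeated_folds` is hypothetical and only illustrates the general idea of repeated k-fold assignment; it is not taken from the package, and it does not stratify by class.

```r
# Hypothetical helper: assign each of n samples to one of nfolds folds,
# and repeat the random assignment `iterations` times.
# Returns an n x iterations matrix of fold ids.
repeated_folds <- function(n, nfolds = 5, iterations = 5, seed = 1234) {
  set.seed(seed)  # fixing the seed makes the partitions reproducible
  sapply(seq_len(iterations), function(i) {
    sample(rep(seq_len(nfolds), length.out = n))
  })
}

folds <- repeated_folds(n = 23)
dim(folds)          # one column of fold ids per repetition
table(folds[, 1])   # fold sizes in the first repetition
```

Each column is a different random partition of the same samples, so averaging results over the columns reduces the dependence on any single split.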

Details

By exploiting the AUCRF algorithm, the function identifies the best-performing 'parsimonious' model in terms of OOB-AUC (the area under the ROC curve computed on out-of-bag predictions) and the most relevant variables (metabolites) involved in the prediction task.
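For readers unfamiliar with the AUC criterion mentioned above, the following base R sketch computes it from scores and 0/1 labels using the rank (Mann-Whitney) formulation. The helper `auc_from_scores` and the example scores are hypothetical; in a random forest setting the scores would typically be out-of-bag class probabilities, but this sketch does not reproduce how the package obtains them.

```r
# Hypothetical helper: AUC as the probability that a randomly chosen positive
# scores higher than a randomly chosen negative (ties count one half).
auc_from_scores <- function(scores, labels) {
  pos <- scores[labels == 1]
  neg <- scores[labels == 0]
  r <- rank(c(pos, neg))  # average ranks handle ties
  (sum(r[seq_along(pos)]) - length(pos) * (length(pos) + 1) / 2) /
    (length(pos) * length(neg))
}

labels <- c(0, 0, 0, 1, 1, 1)
scores <- c(0.1, 0.4, 0.35, 0.8, 0.3, 0.9)  # made-up scores for class 1
auc_from_scores(scores, labels)  # 7/9, about 0.778
```

A value of 1 means the scores separate the two classes perfectly, while 0.5 corresponds to random ranking.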

Author(s)

Piergiorgio Palla

References

Calle ML, Urrea V, Boulesteix A-L, Malats N (2011) 'AUC-RF: A new strategy for genomic profiling with Random Forest'. Human Heredity

Examples

## data(cachexiaData)
## aucMCV(cachexiaData, ref_level = 'control')

RFmarkerDetector documentation built on May 2, 2019, 3:42 p.m.