testclass_simul: Evaluating a classification method based on simulated data



Description

This function evaluates classifiers built using microarray data and/or clinical predictors, based on simulated data generated using the functions simuldata_list and simuldatacluster_list (see simulate).

Usage

testclass_simul(datalist, nlearn=100, classifier, ncomp=0:3, nbgene=NULL,
                varsel=NULL, fold=10, ...)

Arguments

datalist

A list of niter simulated data sets as generated by the functions simuldata_list and simuldatacluster_list (see simulate).

nlearn

The number of observations to be included in the learning data set. It must be smaller than the number of observations in each data set. Default is nlearn=100.

classifier

The function used to construct a classifier. The function must have the same structure as plsrf_xz_pv.

ncomp

The candidate numbers of PLS components (if PLS dimension reduction is used).

nbgene

The number of genes to use for classifier construction. Default is nbgene=NULL, corresponding to all genes.

varsel

A niter x p matrix giving the indices of the genes ordered by the chosen gene selection criterion. For example, the element in the first row and the first column is the index of the gene that is ranked best in the first simulation iteration.

fold

The number of folds for the pre-validation step, if any. See Boulesteix et al (2008) for more details. Default is fold=10.

...

Other arguments to be passed to the function cforest_control from the party package or to the function svm from the package e1071, depending on the specified classifier.
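As an illustration of the layout expected for varsel, the following minimal sketch builds a niter x p matrix of gene rankings in base R. Everything in it is an assumption made for the example: the toy data list (with elements x and y) is a stand-in and does not reproduce the structure returned by simuldata_list, the helper rank_genes is hypothetical, and the absolute Welch t-statistic computed on the first nlearn observations is just one possible selection criterion.

```r
# Hypothetical stand-in for simulated data: niter expression matrices (n x p),
# each with a binary class vector. Not the structure used by simuldata_list.
set.seed(1)
niter <- 3; n <- 100; p <- 20; nlearn <- 40
toy <- lapply(seq_len(niter), function(i) {
  y <- rep(0:1, each = n / 2)[sample(n)]
  x <- matrix(rnorm(n * p), nrow = n, ncol = p)
  x[y == 1, 1:3] <- x[y == 1, 1:3] + 2   # genes 1 to 3 carry signal
  list(x = x, y = y)
})

# Hypothetical helper: order genes by decreasing absolute Welch t-statistic
rank_genes <- function(x, y) {
  tstat <- apply(x, 2, function(g) t.test(g[y == 1], g[y == 0])$statistic)
  order(abs(tstat), decreasing = TRUE)
}

# One row per iteration, one column per rank position;
# ranking uses the first nlearn observations only (an assumption)
varsel <- t(sapply(toy, function(d) {
  learn <- seq_len(nlearn)
  rank_genes(d$x[learn, , drop = FALSE], d$y[learn])
}))

dim(varsel)    # niter x p
varsel[1, 1]   # index of the gene ranked best in iteration 1
```

With a matrix of this shape, setting nbgene=5 would presumably restrict classifier construction to the genes listed in the first five columns of the corresponding row.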

Details

See Boulesteix et al (2008).

Value

error

A numeric vector of length niter giving the misclassification rate for each iteration.

bestncomp

A numeric vector of length niter giving the best number of (pre-validated) PLS components, as obtained using the model selection method based on the out-of-bag error proposed by Boulesteix et al (2008). Returned only for the classifiers plsrf_xz_pv, plsrf_xz, plsrf_x_pv and plsrf_x.

OOB

A list of length niter, whose elements are numeric vectors of the same length as ncomp giving the out-of-bag error of the forest constructed with the corresponding number of (pre-validated) PLS components. Returned only for the classifiers plsrf_xz_pv, plsrf_xz, plsrf_x_pv, plsrf_x and rf_z. For rf_z, no model selection is performed: OOB is just the out-of-bag error of the constructed forest.
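The relation between OOB, bestncomp and error can be sketched with made-up numbers. The values below are hypothetical and are not output of testclass_simul, and the rule shown (keep the candidate with the smallest out-of-bag error) is an assumption about the selection method; see Boulesteix et al (2008) for the actual procedure.

```r
# Candidate numbers of PLS components, as in the default ncomp = 0:3
ncomp <- 0:3

# Hypothetical out-of-bag errors for niter = 2 iterations
OOB <- list(c(0.30, 0.22, 0.25, 0.27),
            c(0.28, 0.29, 0.21, 0.21))

# Assumed rule: smallest OOB error wins; which.min() resolves ties
# in favour of the smaller number of components
bestncomp <- sapply(OOB, function(oob) ncomp[which.min(oob)])
bestncomp

# Hypothetical misclassification rates, summarised over iterations
error <- c(0.25, 0.30)
mean(error)
```

Averaging error over the iterations gives a single summary of classifier performance for the simulation setting.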

Author(s)

Anne-Laure Boulesteix (http://www.ibe.med.uni-muenchen.de/organisation/mitarbeiter/020_professuren/boulesteix/eng.html)

References

Boulesteix AL, Porzelius C, Daumer M, 2008. Microarray-based classification and clinical predictors: On combined classifiers and additional predictive value. Bioinformatics 24:1698-1706.

See Also

testclass, simulate, plsrf_xz_pv, plsrf_x_pv, plsrf_xz, plsrf_x, rf_z, svm_x, logistic_z.

Examples

# load MAclinical library
# library(MAclinical)

# Generating 3 simulated data sets
my.data<-simuldata_list(niter=3,n=100,p=150,psig=10,q=5,muX=2,muZ=1)

# Predict the last 60 observations using the first 40 observations as learning set,
# based on PLS (without pre-validation) and random forests

testclass_simul(my.data,nlearn=40,classifier=plsrf_xz)

MAclinical documentation built on May 2, 2019, 9:30 a.m.