plsrf_xz_pv: Classification based on pre-validated PLS dimension reduction...


View source: R/plsrf_xz_pv.r

Description

This function builds a prediction rule based on the learning data (both clinical and microarray predictors) and applies it to the test data. The classifier consists of two steps: PLS dimension reduction involving a pre-validation step for summarizing microarray data, and random forests applied to both PLS components and clinical predictors. See Boulesteix et al (2008) for more details.

The function plsrf_xz_pv uses the functions cforest and varimp from the package party and the function pls.regression from the package plsgenomics.
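
The pre-validation idea behind the first step can be illustrated in base R. This is a minimal sketch, not the code used by plsrf_xz_pv: it assumes a single PLS component computed by hand (the helper pls_weight is hypothetical) instead of calling pls.regression, and it uses simulated toy data. Each observation's component score is computed with weights fitted on the other folds only, so the score never uses that observation's own class label.

```r
# Simplified sketch of pre-validation; NOT the MAclinical implementation.
set.seed(1)
n <- 30
p <- 100
x <- matrix(rnorm(n * p), n, p)      # toy microarray predictors
y <- sample(0:1, n, replace = TRUE)  # toy binary class labels

# Hypothetical helper: first PLS weight vector, i.e. the covariance of each
# centered column of x with the centered response, scaled to unit length.
pls_weight <- function(x, y) {
  w <- crossprod(scale(x, scale = FALSE), y - mean(y))
  drop(w / sqrt(sum(w^2)))
}

fold <- 10
fold_id <- sample(rep(seq_len(fold), length.out = n))  # random fold labels

prevalidated <- numeric(n)
for (k in seq_len(fold)) {
  held_out <- fold_id == k
  x_in <- x[!held_out, , drop = FALSE]
  center <- colMeans(x_in)
  w <- pls_weight(x_in, y[!held_out])
  # Score the held-out observations with weights fitted without them
  x_out <- sweep(x[held_out, , drop = FALSE], 2, center)
  prevalidated[held_out] <- drop(x_out %*% w)
}
```

In the second step, a vector such as prevalidated would be placed next to the clinical predictors as an additional column before fitting the forest.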

Usage

plsrf_xz_pv(Xlearn,Zlearn,Ylearn,Xtest,Ztest,ncomp=0:3,
ordered=NULL,nbgene=NULL,fold=10,...)

Arguments

Xlearn

An nlearn x p matrix giving the microarray predictors for the learning data set.

Zlearn

An nlearn x q matrix giving the clinical predictors for the learning data set.

Ylearn

A numeric vector of length nlearn giving the class membership of the learning observations, coded as 0,...,K-1 (where K is the number of classes).

Xtest

An ntest x p matrix giving the microarray predictors for the test data set.

Ztest

An ntest x q matrix giving the clinical predictors for the test data set.

ncomp

A numeric vector giving the candidate numbers of pre-validated PLS components. All numbers must be >=0. The number 0 corresponds to prediction based on clinical parameters only.

ordered

A vector of length p giving the order of the microarray predictors in terms of relevance for prediction. For instance, if the first three elements of ordered are 30, 2, 2400, the most relevant genes are those in the 30th, 2nd and 2400th columns of the gene expression data matrix Xlearn. Note: if ordered=NULL, the columns of Xlearn and Xtest are assumed to be already ordered.

nbgene

The number of genes to be selected for use in dimension reduction. Default is nbgene=NULL, in which case all genes are used.

fold

The number of folds for the pre-validation step. See Boulesteix et al (2008) for more details. The default is fold=10.

...

Other arguments to be passed to the function cforest_control from the party package.
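
The interplay of ordered and nbgene can be sketched in base R. The ranking criterion used here (absolute two-sample t-statistic computed on the learning data) is only an assumption chosen for illustration; any relevance ordering can be supplied. The sketch also assumes that the first nbgene entries of ordered are the ones retained, which is how the argument descriptions above read.

```r
# Illustrative sketch; the ranking criterion is an assumption, not part of plsrf_xz_pv.
set.seed(2)
xlearn <- matrix(rnorm(30 * 100), 30, 100)  # toy learning data
xtest <- matrix(rnorm(20 * 100), 20, 100)   # toy test data
ylearn <- rep(0:1, each = 15)               # toy class labels coded 0/1

# Rank genes by absolute t-statistic, using the learning data only
tstat <- apply(xlearn, 2, function(g) {
  abs(t.test(g[ylearn == 1], g[ylearn == 0])$statistic)
})
ordered <- order(tstat, decreasing = TRUE)  # most relevant gene first

# Keep the nbgene top-ranked columns, the same ones in both data sets
nbgene <- 20
keep <- ordered[seq_len(nbgene)]
xlearn_sel <- xlearn[, keep, drop = FALSE]
xtest_sel <- xtest[, keep, drop = FALSE]
```

Computing the ordering from the learning data alone keeps the test data out of the gene selection.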

Details

See Boulesteix et al (2008).

Value

A list with the elements:

prediction

A numeric vector of length nrow(Xtest) giving the predicted class for each observation from the test data set.

importance

The variable importance information output by the function varimp from the package party for the corresponding forest.

bestncomp

The best number of pre-validated PLS components, as obtained using the model selection method based on the out-of-bag error.

OOB

A numeric vector of length length(ncomp) giving, for each candidate value in ncomp, the out-of-bag error of the forest constructed with the corresponding number of pre-validated PLS components.
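
The relationship between OOB and bestncomp can be illustrated as follows. This sketch assumes that the "best" number of components is the candidate with the smallest out-of-bag error and that ties go to the first such candidate; the error values are hypothetical and were not produced by the function.

```r
# Hypothetical out-of-bag errors, one per candidate in ncomp
ncomp <- 0:3
oob <- c(0.40, 0.30, 0.35, 0.33)
names(oob) <- ncomp

# Assumed selection rule: candidate with the smallest OOB error
bestncomp <- ncomp[which.min(oob)]
bestncomp
```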

Author(s)

Anne-Laure Boulesteix (http://www.ibe.med.uni-muenchen.de/organisation/mitarbeiter/020_professuren/boulesteix/eng.html)

References

Boulesteix AL, Porzelius C, Daumer M, 2008. Microarray-based classification and clinical predictors: On combined classifiers and additional predictive value. Bioinformatics 24:1698-1706.

Tibshirani R, Efron B, 2002. Pre-validation and inference in microarrays. Stat. Appl. Genet. Mol. Biol. 1:1.

See Also

testclass, testclass_simul, simulate, plsrf_x, plsrf_x_pv, plsrf_xz, rf_z, logistic_z, svm_x.

Examples

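# Load the MAclinical package
library(MAclinical)

# Simulate learning data (30 observations) and test data (20 observations)
# with 100 microarray predictors and 4 clinical predictors
xlearn <- matrix(rnorm(3000), 30, 100)
zlearn <- matrix(rnorm(120), 30, 4)
ylearn <- sample(0:1, 30, replace = TRUE)
xtest <- matrix(rnorm(2000), 20, 100)
ztest <- matrix(rnorm(80), 20, 4)

# Default settings: all genes, ncomp = 0:3, fold = 10
my.prediction1 <- plsrf_xz_pv(Xlearn = xlearn, Zlearn = zlearn, Ylearn = ylearn,
                              Xtest = xtest, Ztest = ztest)

# Supply a gene ordering (random here) and select 20 genes
ordered <- sample(100)
my.prediction2 <- plsrf_xz_pv(Xlearn = xlearn, Zlearn = zlearn, Ylearn = ylearn,
                              Xtest = xtest, Ztest = ztest,
                              ordered = ordered, nbgene = 20)

# Set fold equal to the number of learning observations
my.prediction3 <- plsrf_xz_pv(Xlearn = xlearn, Zlearn = zlearn, Ylearn = ylearn,
                              Xtest = xtest, Ztest = ztest, fold = 30)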

MAclinical documentation built on May 2, 2019, 9:30 a.m.