mrpls.cv: Determination of the ridge regularization parameter and the number of PLS components to be used for classification with MRPLS


Description

The function mrpls.cv determines the best ridge regularization parameter and the best number of PLS components to be used for classification with the MRPLS algorithm of Fort et al. (2005).

Usage

mrpls.cv(Ytrain,Xtrain,LambdaRange,ncompMax,NbIterMax=50)

Arguments

Xtrain

a (ntrain x p) data matrix of predictors. Xtrain must be a matrix. Each row corresponds to an observation and each column to a predictor variable.

Ytrain

an ntrain vector of responses. Ytrain must be a vector taking values in {1,...,c+1}, where c+1 is the number of classes; it contains the class label of each observation.

LambdaRange

a vector of positive real values from which the best ridge regularization parameter is chosen by cross-validation.

ncompMax

a positive integer. The best number of components is chosen from 1,...,ncompMax. If ncompMax=0, then ridge regression is performed without dimension reduction.

NbIterMax

a positive integer. NbIterMax is the maximal number of iterations in the Newton-Raphson parts.

Details

A leave-one-out cross-validation procedure is used to determine the best ridge regularization parameter and number of PLS components to be used for classification with MRPLS for categorical data (for binary data see rpls and rpls.cv). At each cross-validation run, Xtrain is split into a pseudo training set (ntrain-1 samples) and a pseudo test set (1 sample), and the classification error rate is determined for each value of the ridge regularization parameter and each number of components. Finally, mrpls.cv returns the value of the ridge regularization parameter and the number of PLS components for which the mean classification error rate is minimal.
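
The grid search can be pictured with the following leave-one-out sketch. This is an illustration only, assuming the mrpls interface shown in the Examples below; the function loo.grid.error is hypothetical, and the actual implementation is internal to the package and more efficient.

# Hypothetical sketch of the leave-one-out grid search performed by mrpls.cv;
# not the package's actual implementation.
library(plsgenomics)
loo.grid.error <- function(Ytrain, Xtrain, LambdaRange, ncompMax) {
    err <- matrix(0, nrow=length(LambdaRange), ncol=ncompMax)
    for (i in seq_along(Ytrain)) {                  # leave observation i out
        for (l in seq_along(LambdaRange)) {
            for (k in 1:ncompMax) {
                fit <- mrpls(Ytrain=Ytrain[-i], Xtrain=Xtrain[-i, , drop=FALSE],
                             Lambda=LambdaRange[l], ncomp=k,
                             Xtest=Xtrain[i, , drop=FALSE])
                err[l, k] <- err[l, k] + (fit$Ytest != Ytrain[i])
            }
        }
    }
    err / length(Ytrain)                            # mean classification error rates
}
# mrpls.cv returns the (Lambda, ncomp) pair minimizing this error matrix.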

Value

A list with the following components:

Lambda

the optimal regularization parameter.

ncomp

the optimal number of PLS components.

Author(s)

Sophie Lambert-Lacroix (http://membres-timc.imag.fr/Sophie.Lambert/).

References

G. Fort, S. Lambert-Lacroix and J. Peyre (2005). Réduction de dimension dans les modèles linéaires généralisés : application à la classification supervisée de données issues des biopuces. Journal de la SFDS, tome 146, n° 1-2, 117-152.

See Also

mrpls, rpls, rpls.cv.

Examples

# load plsgenomics library
library(plsgenomics)

# load SRBCT data
data(SRBCT)
IndexLearn <- c(sample(which(SRBCT$Y==1),10),sample(which(SRBCT$Y==2),4),
			sample(which(SRBCT$Y==3),7),sample(which(SRBCT$Y==4),9))

# Determine optimum ncomp and Lambda
nl <- mrpls.cv(Ytrain=SRBCT$Y[IndexLearn],Xtrain=SRBCT$X[IndexLearn,],
			LambdaRange=c(0.1,1),ncompMax=3)

# perform prediction by MRPLS
res <- mrpls(Ytrain=SRBCT$Y[IndexLearn],Xtrain=SRBCT$X[IndexLearn,],Lambda=nl$Lambda,
			ncomp=nl$ncomp,Xtest=SRBCT$X[-IndexLearn,])
sum(res$Ytest!=SRBCT$Y[-IndexLearn])
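# The misclassification count above can also be reported as a rate
# (illustrative follow-up using the same objects created above)
mean(res$Ytest!=SRBCT$Y[-IndexLearn])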
