mgsim.cv: Determination of the ridge regularization parameter and the...

Description

The function mgsim.cv determines the best ridge regularization parameter and bandwidth to be used for classification with MGSIM as described in Lambert-Lacroix and Peyre (2005).

Usage

mgsim.cv(Ytrain,Xtrain,LambdaRange,hRange,NbIterMax=50)

Arguments

Xtrain

a (ntrain x p) data matrix of predictors. Xtrain must be a matrix. Each row corresponds to an observation and each column to a predictor variable.

Ytrain

a vector of responses of length ntrain. Ytrain must be a vector. Ytrain is a {1,...,c+1}-valued vector and contains the response variable for each observation. c+1 is the number of classes.

LambdaRange

a vector of positive real values from which the best ridge regularization parameter is chosen by cross-validation.

hRange

a vector of strictly positive real values from which the best bandwidth is chosen by cross-validation.

NbIterMax

a positive integer. NbIterMax is the maximal number of iterations in the Newton-Raphson parts.
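
Since Ytrain has to be coded as integers in {1,...,c+1}, class labels stored as strings or factors need to be recoded first. The lines below are a minimal sketch of one way to do this in base R; the labels "a", "b", "c" are made-up placeholders, not part of any plsgenomics data set.

```r
# Hypothetical class labels (illustration only)
labels <- c("b", "a", "c", "b")

# factor() sorts the distinct labels; as.integer() returns codes 1..number of classes
f <- factor(labels)
Ytrain <- as.integer(f)

# Keep the mapping so predictions can be translated back to the original labels
class_names <- levels(f)
class_names[Ytrain]
```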

Details

The cross-validation procedure described in Lambert-Lacroix and Peyre (2005) is used to determine the best ridge regularization parameter and bandwidth to be used for classification with GSIM for categorical data (for binary data see gsim and gsim.cv). At each cross-validation run, Xtrain is split into a pseudo training set (ntrain-1 samples) and a pseudo test set (1 sample), i.e. a leave-one-out scheme, and the classification error rate is determined for each value of the ridge regularization parameter and of the bandwidth. Finally, the function mgsim.cv returns the values of the ridge regularization parameter and bandwidth for which the mean classification error rate is minimal.
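
The procedure above can be sketched in a few lines of base R. This is an illustration of a leave-one-out grid search only, not the code used by mgsim.cv: the names loo_grid_cv and toy_classify are hypothetical, the toy classifier is a simple kernel-weighted vote standing in for MGSIM, and breaking ties by taking the first minimum is an assumption.

```r
# Hypothetical helper: leave-one-out grid search over (lambda, h).
# `classify` is any function(Xtr, Ytr, xte, lambda, h) returning one predicted label.
loo_grid_cv <- function(Ytrain, Xtrain, LambdaRange, hRange, classify) {
  ntrain <- nrow(Xtrain)
  # err[i, j] = mean leave-one-out error rate for LambdaRange[i] and hRange[j]
  err <- matrix(0, nrow = length(LambdaRange), ncol = length(hRange))
  for (i in seq_along(LambdaRange)) {
    for (j in seq_along(hRange)) {
      miss <- 0
      for (k in seq_len(ntrain)) {
        # pseudo training set: all samples but k; pseudo test set: sample k
        pred <- classify(Xtrain[-k, , drop = FALSE], Ytrain[-k],
                         Xtrain[k, ], LambdaRange[i], hRange[j])
        miss <- miss + (pred != Ytrain[k])
      }
      err[i, j] <- miss / ntrain
    }
  }
  # Assumption: ties are broken by taking the first minimum found
  best <- which(err == min(err), arr.ind = TRUE)[1, ]
  list(Lambda = LambdaRange[best[1]], h = hRange[best[2]], error = err)
}

# Toy stand-in for MGSIM (NOT the real method): Gaussian-kernel vote with
# bandwidth h, plus lambda times the class frequency as a crude shrinkage term.
toy_classify <- function(Xtr, Ytr, xte, lambda, h) {
  d2 <- rowSums(sweep(Xtr, 2, xte)^2)   # squared distances to the held-out sample
  w <- exp(-d2 / (2 * h^2))             # kernel weights
  classes <- sort(unique(Ytr))
  score <- sapply(classes, function(cl) sum(w[Ytr == cl]) + lambda * mean(Ytr == cl))
  classes[which.max(score)]
}

# Simulated two-class data (illustration only)
set.seed(1)
X <- rbind(matrix(rnorm(20, mean = 0), ncol = 2),
           matrix(rnorm(20, mean = 3), ncol = 2))
Y <- rep(1:2, each = 10)

cv <- loo_grid_cv(Y, X, LambdaRange = c(0.1, 1), hRange = c(0.5, 2),
                  classify = toy_classify)
cv$Lambda
cv$h
```

The returned list mirrors the Lambda and h components documented under Value; the extra error component is added here only to make the grid of error rates inspectable.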

Value

A list with the following components:

Lambda

the optimal regularization parameter.

h

the optimal bandwidth parameter.

Author(s)

Sophie Lambert-Lacroix (http://membres-timc.imag.fr/Sophie.Lambert/) and Julie Peyre (http://www-lmc.imag.fr/lmc-sms/Julie.Peyre/).

References

S. Lambert-Lacroix, J. Peyre (2006) Local likelihood regression in generalized linear single-index models with applications to microarray data. Computational Statistics and Data Analysis, vol. 51, no. 3, 2091-2113.

See Also

mgsim, gsim, gsim.cv.

Examples

# load plsgenomics library
library(plsgenomics)

# load SRBCT data
data(SRBCT)
IndexLearn <- c(sample(which(SRBCT$Y==1),10),sample(which(SRBCT$Y==2),4),
                sample(which(SRBCT$Y==3),7),sample(which(SRBCT$Y==4),9))

### Determine optimum h and lambda
# /!\ takes about 30 seconds to run
#hl <- mgsim.cv(Ytrain=SRBCT$Y[IndexLearn],Xtrain=SRBCT$X[IndexLearn,],
#                            LambdaRange=c(0.1),hRange=c(7,20))

### perform prediction by MGSIM
#res <- mgsim(Ytrain=SRBCT$Y[IndexLearn],Xtrain=SRBCT$X[IndexLearn,],Lambda=hl$Lambda,
#             h=hl$h,Xtest=SRBCT$X[-IndexLearn,])
#res$Cvg
#sum(res$Ytest!=SRBCT$Y[-IndexLearn])

