gsim.cv: Determination of the ridge regularization parameter and the...
In plsgenomics: PLS Analyses for Genomics

gsim.cv

R Documentation

Determination of the ridge regularization parameter and the bandwidth to be used for classification with GSIM for binary data

Description

The function gsim.cv determines the best ridge regularization parameter and bandwidth to be used for classification with GSIM as described in Lambert-Lacroix and Peyre (2005).

Usage

gsim.cv(Xtrain, Ytrain, LambdaRange, hARange, hB = NULL, NbIterMax = 50)

Arguments

`Xtrain`	a (ntrain x p) data matrix of predictors. `Xtrain` must be a matrix. Each row corresponds to an observation and each column to a predictor variable.
`Ytrain`	a vector of responses of length ntrain. `Ytrain` must be a {1,2}-valued vector containing the response variable for each observation.
`LambdaRange`	the vector of positive real values from which the best ridge regularization parameter is chosen by cross-validation.
`hARange`	the vector of strictly positive real values from which the best bandwidth for GSIM step A is chosen by cross-validation.
`hB`	a strictly positive real value. `hB` is the bandwidth for GSIM step B. If `hB` is NULL, its value is chosen using a plug-in method.
`NbIterMax`	a positive integer. `NbIterMax` is the maximal number of iterations in the Newton-Raphson parts.
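
Because `Ytrain` must take values in {1,2}, a response stored as 0/1 or as a two-level factor has to be recoded before calling gsim.cv. The following is a minimal base-R sketch of that preparation step; the variable names `y_raw` and `y_fac`, the class labels, and the grid values are illustrative assumptions, not recommendations from the package.

```r
# Hypothetical raw response coded 0/1: shift it to the {1,2} coding
y_raw  <- c(0, 1, 1, 0, 1)
Ytrain <- y_raw + 1

# Hypothetical two-level factor: as.integer() returns the level index,
# and levels are sorted alphabetically here (normal = 1, tumor = 2)
y_fac      <- factor(c("normal", "tumor", "tumor", "normal", "tumor"))
Ytrain_fac <- as.integer(y_fac)

# Candidate grids (illustrative values only); both must be positive
LambdaRange <- c(0.01, 0.1, 1, 10)
hARange     <- seq(5, 25, by = 5)

# Basic checks mirroring the argument descriptions
stopifnot(all(Ytrain %in% c(1, 2)), all(Ytrain_fac %in% c(1, 2)))
stopifnot(all(LambdaRange > 0), all(hARange > 0))
```

Larger grids give a finer search, but the number of model fits grows with the product of the two grid lengths.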

Details

The cross-validation procedure described in Lambert-Lacroix and Peyre (2005) is used to determine the best ridge regularization parameter and bandwidth to be used for classification with GSIM for binary data (for categorical data see mgsim and mgsim.cv). At each cross-validation run, Xtrain is split into a pseudo training set (ntrain - 1 samples) and a pseudo test set (1 sample), that is, a leave-one-out scheme, and the classification error rate is determined for each value of the ridge regularization parameter and of the bandwidth. Finally, the function gsim.cv returns the values of the ridge regularization parameter and bandwidth for which the mean classification error rate is minimal.
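
The leave-one-out grid search described above can be sketched in base R as follows. This is an illustration under stated assumptions, not the plsgenomics implementation: `loo_grid_search` and `toy_classify` are hypothetical names, and `toy_classify` is a crude stand-in (a ridge-penalized direction followed by Gaussian kernel smoothing on the resulting one-dimensional index), not GSIM itself.

```r
# Hypothetical stand-in classifier (NOT GSIM): ridge direction + kernel smoothing
toy_classify <- function(Xtr, Ytr, xte, lambda, h) {
  mu <- colMeans(Xtr)
  Xc <- sweep(Xtr, 2, mu)                       # center the predictors
  yc <- (Ytr - 1) - mean(Ytr - 1)               # center the 0/1-coded response
  # Ridge solution of (X'X + lambda I) beta = X'y
  beta <- solve(crossprod(Xc) + lambda * diag(ncol(Xc)), crossprod(Xc, yc))
  idx_tr <- drop(Xc %*% beta)                   # index of training points
  idx_te <- sum((xte - mu) * beta)              # index of the held-out point
  # Kernel-weighted estimate of P(Y = 2) at the held-out index, bandwidth h
  w  <- dnorm((idx_tr - idx_te) / h)
  p2 <- sum(w * (Ytr == 2)) / sum(w)
  if (is.nan(p2) || p2 <= 0.5) 1 else 2
}

# Leave-one-out search over every (lambda, h) pair
loo_grid_search <- function(X, Y, lambdas, hs, classify) {
  n   <- nrow(X)
  err <- matrix(0, length(lambdas), length(hs),
                dimnames = list(as.character(lambdas), as.character(hs)))
  for (i in seq_len(n)) {                       # sample i is the pseudo test set
    for (a in seq_along(lambdas)) {
      for (b in seq_along(hs)) {
        pred <- classify(X[-i, , drop = FALSE], Y[-i], X[i, ],
                         lambdas[a], hs[b])
        err[a, b] <- err[a, b] + (pred != Y[i])
      }
    }
  }
  err  <- err / n                               # mean classification error rate
  best <- which(err == min(err), arr.ind = TRUE)[1, ]  # first minimizer on ties
  list(Lambda = lambdas[best[1]], hA = hs[best[2]], error = err)
}

# Simulated data, for illustration only
set.seed(1)
X  <- matrix(rnorm(30 * 5), 30, 5)
Y  <- ifelse(X[, 1] + 0.5 * rnorm(30) > 0, 2, 1)
cv <- loo_grid_search(X, Y, lambdas = c(0.1, 1), hs = c(0.5, 2),
                      classify = toy_classify)
cv$Lambda
cv$hA
```

The sketch breaks ties by taking the first minimizer in the error matrix; how gsim.cv resolves ties is not stated here, so that choice is an assumption.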

Value

A list with the following components:

`Lambda`	the optimal regularization parameter.
`hA`	the optimal bandwidth parameter.

Author(s)

Sophie Lambert-Lacroix (http://membres-timc.imag.fr/Sophie.Lambert/) and Julie Peyre (https://membres-ljk.imag.fr/Julie.Peyre/).

References

S. Lambert-Lacroix, J. Peyre (2006). Local likelihood regression in generalized linear single-index models with applications to microarrays data. Computational Statistics and Data Analysis, vol. 51, no. 3, 2091-2113.

Examples

## Not run: 
## between 5~15 seconds
# load plsgenomics library
library(plsgenomics)

# load Colon data
data(Colon)

# draw a learning set: 12 samples from class 2 and 8 samples from class 1
IndexLearn <- c(sample(which(Colon$Y == 2), 12), sample(which(Colon$Y == 1), 8))

Xtrain <- Colon$X[IndexLearn, ]
Ytrain <- Colon$Y[IndexLearn]
Xtest <- Colon$X[-IndexLearn, ]

# preprocess data
resP <- preprocess(Xtrain = Xtrain, Xtest = Xtest, Threshold = c(100, 16000),
                   Filtering = c(5, 500), log10.scale = TRUE, row.stand = TRUE)

# determine the optimal bandwidth hA and ridge parameter Lambda
hl <- gsim.cv(Xtrain = resP$pXtrain, Ytrain = Ytrain, hARange = c(7, 20),
              LambdaRange = c(0.1, 1), hB = NULL)

# perform prediction by GSIM with the selected parameters
res <- gsim(Xtrain = resP$pXtrain, Ytrain = Ytrain, Xtest = resP$pXtest,
            Lambda = hl$Lambda, hA = hl$hA, hB = NULL)
res$Cvg

# number of misclassified test samples
sum(res$Ytest != Colon$Y[-IndexLearn])

## End(Not run)

plsgenomics documentation built on June 22, 2024, 7:30 p.m.
