pls.regression.cv: Determination of the number of latent components to be used...
In plsgenomics: PLS Analyses for Genomics

pls.regression.cv

R Documentation

Determination of the number of latent components to be used in PLS regression

Description

The function pls.regression.cv determines the best number of latent components to be used for PLS regression using the cross-validation approach described in Boulesteix and Strimmer (2005).

Usage

pls.regression.cv(Xtrain, Ytrain, ncomp, nruncv=20, alpha=2/3)

Arguments

`Xtrain`	a (ntrain x p) data matrix containing the predictors for the training data set. `Xtrain` may be a matrix or a data frame. Each row is an observation and each column is a predictor variable.
`Ytrain`	a (ntrain x m) data matrix of responses. `Ytrain` may be a vector (if m=1), a matrix or a data frame. If `Ytrain` is a matrix or a data frame, each row is an observation and each column is a response variable. If `Ytrain` is a vector, it contains the value of the single response variable for each observation.
`ncomp`	the vector of integers from which the best number of latent components has to be chosen by cross-validation. If `ncomp` is of length 1, the best number of components is chosen from 1,...,`ncomp`.
`nruncv`	the number of cross-validation iterations to be performed for the choice of the number of latent components.
`alpha`	the proportion of observations to be included in the pseudo training set at each cross-validation iteration.
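
The two accepted forms of `ncomp` can be illustrated with a small sketch. The helper `candidate_ncomp` below is hypothetical and is not part of plsgenomics; it only mirrors the rule stated above for turning `ncomp` into the set of candidate values.

```r
# Hypothetical helper (not a plsgenomics function): expand `ncomp` into the
# vector of candidate numbers of latent components.
candidate_ncomp <- function(ncomp) {
  # A single integer k means "choose among 1, ..., k";
  # a longer vector is taken as the candidate set itself.
  if (length(ncomp) == 1) seq_len(ncomp) else ncomp
}

candidate_ncomp(4)        # candidates 1, 2, 3, 4
candidate_ncomp(c(2, 3))  # candidates 2, 3
```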

Details

The cross-validation procedure described in Boulesteix and Strimmer (2005) is used to determine the best number of latent components to be used for regression. At each cross-validation run, the training data (Xtrain and Ytrain) are split into a pseudo training set, containing a proportion alpha of the observations, and a pseudo test set, and the squared prediction error on the pseudo test set is determined for each candidate number of latent components. Finally, the function pls.regression.cv returns the number of latent components for which the mean squared error over the nruncv partitions is minimal.
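
The procedure can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's implementation: the function names `pcr_predict` and `cv_choose_ncomp` are hypothetical, the pseudo training set size is assumed to be `floor(alpha * n)`, and principal component regression is swapped in for PLS regression so that the sketch runs without plsgenomics.

```r
# Stand-in model (assumption): principal component regression, NOT PLS.
# Fits on the pseudo training set and predicts the pseudo test set
# using the first k principal components.
pcr_predict <- function(Xtrain, Ytrain, Xtest, k) {
  mx <- colMeans(Xtrain)
  my <- colMeans(Ytrain)
  Xc <- sweep(Xtrain, 2, mx)                      # centre predictors
  Yc <- sweep(Ytrain, 2, my)                      # centre responses
  V  <- svd(Xc)$v[, seq_len(k), drop = FALSE]     # first k loading vectors
  scores <- Xc %*% V
  coefs  <- solve(crossprod(scores), crossprod(scores, Yc))
  sweep(sweep(Xtest, 2, mx) %*% V %*% coefs, 2, my, "+")
}

# Hypothetical sketch of the random-split cross-validation described above.
cv_choose_ncomp <- function(X, Y, ncomp, nruncv = 20, alpha = 2/3,
                            predict_fun = pcr_predict) {
  X <- as.matrix(X)
  Y <- as.matrix(Y)
  if (length(ncomp) == 1) ncomp <- seq_len(ncomp)
  n <- nrow(X)
  n_pseudo <- floor(alpha * n)                    # assumed rounding rule
  err <- matrix(NA_real_, nrow = nruncv, ncol = length(ncomp))
  for (run in seq_len(nruncv)) {
    idx <- sample(n, n_pseudo)                    # pseudo training rows
    for (j in seq_along(ncomp)) {
      pred <- predict_fun(X[idx, , drop = FALSE], Y[idx, , drop = FALSE],
                          X[-idx, , drop = FALSE], ncomp[j])
      # squared error on the pseudo test set, averaged over its entries
      err[run, j] <- mean((Y[-idx, , drop = FALSE] - pred)^2)
    }
  }
  # candidate with the smallest mean squared error over all runs
  ncomp[which.min(colMeans(err))]
}

# Simulated data, for illustration only
set.seed(1)
X <- matrix(rnorm(60 * 5), nrow = 60)
Y <- X[, 1] - 2 * X[, 2] + rnorm(60, sd = 0.1)
best <- cv_choose_ncomp(X, Y, ncomp = 4, nruncv = 10)
```

Because the splits are drawn at random, the selected value can change between calls unless the random seed is fixed beforehand.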

Value

The number of latent components to be used in PLS regression, as determined by cross-validation.

Author(s)

Anne-Laure Boulesteix (https://www.ibe.med.uni-muenchen.de/mitarbeiter/professoren/boulesteix/index.html) and Korbinian Strimmer (https://strimmerlab.github.io/korbinian.html).

References

A. L. Boulesteix and K. Strimmer (2005). Predicting Transcription Factor Activities from Combined Analysis of Microarray and ChIP Data: A Partial Least Squares Approach. Theoretical Biology and Medical Modelling 2:23.

A. L. Boulesteix and K. Strimmer (2007). Partial least squares: a versatile tool for the analysis of high-dimensional genomic data. Briefings in Bioinformatics 8:32-44.

S. de Jong (1993). SIMPLS: an alternative approach to partial least squares regression. Chemometrics and Intelligent Laboratory Systems 18:251-263.

Examples

## Not run: 
## takes roughly 5 to 15 seconds
# load the plsgenomics package
library(plsgenomics)

# load Ecoli data
data(Ecoli)

# determine the best number of components for PLS regression using the cross-validation approach
# choose the best number from 1,2,3,4
pls.regression.cv(Xtrain=Ecoli$CONNECdata,Ytrain=Ecoli$GEdata,ncomp=4,nruncv=20)
# choose the best number from 2,3
pls.regression.cv(Xtrain=Ecoli$CONNECdata,Ytrain=Ecoli$GEdata,ncomp=c(2,3),nruncv=20)


## End(Not run)

plsgenomics documentation built on June 22, 2024, 7:30 p.m.
