pls.regression: Multivariate Partial Least Squares Regression
In plsgenomics: PLS Analyses for Genomics

The function pls.regression performs multivariate PLS regression (with several response variables and several predictor variables) using de Jong's SIMPLS algorithm. This function is an adaptation of R. Wehrens' code from the package pls.pcr.

pls.regression(Xtrain, Ytrain, Xtest=NULL, ncomp=NULL, unit.weights=TRUE)

`Xtrain`	a (ntrain x p) data matrix of predictors. `Xtrain` may be a matrix or a data frame. Each row corresponds to an observation and each column to a predictor variable.
`Ytrain`	a (ntrain x m) data matrix of responses. `Ytrain` may be a vector (if m=1), a matrix or a data frame. If `Ytrain` is a matrix or a data frame, each row corresponds to an observation and each column to a response variable. If `Ytrain` is a vector, it contains the unique response variable for each observation.
`Xtest`	a (ntest x p) matrix containing the predictors for the test data set. `Xtest` may also be a vector of length p (corresponding to only one test observation).
`ncomp`	the number of latent components to be used for regression. If `ncomp` is a vector of integers, the regression model is built successively with each number of components. If `ncomp=NULL`, the maximal number of components min(ntrain,p) is chosen.
`unit.weights`	if `TRUE` then the latent components will be constructed from weight vectors that are standardized to length 1, otherwise the weight vectors do not have length 1 but the latent components have norm 1.

The columns of the data matrices Xtrain and Ytrain must not be centered to have mean zero beforehand: pls.regression performs the centering itself as a preliminary step before the SIMPLS algorithm is run.

In the original definition of SIMPLS by de Jong (1993), the weight vectors have length 1. If the weight vectors are standardized to have length 1, they satisfy a simple optimality criterion (de Jong, 1993). However, it is also usual (and computationally efficient) to standardize the latent components to have length 1.
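The two standardizations can be illustrated with a minimal base-R sketch for a single component. It is not the package's internal code: the simulated data and the names `r`, `r_unit`, `r_score`, `t_unit_weights` and `t_unit_score` are assumptions made for illustration, and the first weight direction is taken as the dominant left singular vector of the cross-product matrix of the centered data.

```r
# Illustrative sketch only (base R); all object names are hypothetical.
set.seed(1)
X <- scale(matrix(rnorm(20 * 5), nrow = 20), center = TRUE, scale = FALSE)
Y <- scale(matrix(rnorm(20 * 2), nrow = 20), center = TRUE, scale = FALSE)

# Direction of the first weight vector: dominant left singular vector of t(X) %*% Y
r <- svd(crossprod(X, Y))$u[, 1]

# Convention 1 (unit.weights=TRUE): the weight vector has length 1.
# svd() already returns unit vectors; the division is kept to show the step.
r_unit <- r / sqrt(sum(r^2))
t_unit_weights <- X %*% r_unit

# Convention 2 (unit.weights=FALSE): the latent component has length 1,
# so the weight vector is rescaled by the norm of X %*% r.
r_score <- r / sqrt(sum((X %*% r)^2))
t_unit_score <- X %*% r_score

sqrt(sum(r_unit^2))        # norm of the weight vector under convention 1
sqrt(sum(t_unit_score^2))  # norm of the latent component under convention 2
```

Both conventions give latent components that point in the same direction; only the scaling of the weights and scores differs.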

In contrast to the original version found in the package pls.pcr, the prediction for the observations from Xtest is performed after centering the columns of Xtest by subtracting the column means calculated from Xtrain.
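The centering step described here can be sketched in base R as follows. This is an illustration under assumptions, not the function's source: the simulated matrices and the names `Xtrain_c` and `Xtest_c` are hypothetical.

```r
# Illustrative sketch only (base R); simulated data, hypothetical names.
set.seed(2)
Xtrain <- matrix(rnorm(10 * 3, mean = 5), nrow = 10)
Xtest  <- matrix(rnorm(4 * 3, mean = 5), nrow = 4)

# Column means are computed from the training data only
meanX <- colMeans(Xtrain)

# The same training means are subtracted from both data sets
Xtrain_c <- sweep(Xtrain, 2, meanX, "-")
Xtest_c  <- sweep(Xtest, 2, meanX, "-")

colMeans(Xtrain_c)  # numerically zero
colMeans(Xtest_c)   # generally not zero: test data are shifted by the training means
```

Using the training means for the test data keeps both data sets in the same coordinate system as the fitted regression coefficients.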

A list with the following components:

`B`	the (p x m x length(`ncomp`)) matrix containing the regression coefficients. Each row corresponds to a predictor variable and each column to a response variable. The third dimension of the matrix `B` corresponds to the number of PLS components used to compute the regression coefficients. If `ncomp` has length 1, `B` is just a (p x m) matrix.
`Ypred`	the (ntest x m x length(`ncomp`)) matrix containing the predicted values of the response variables for the observations from `Xtest`. The third dimension of the matrix `Ypred` corresponds to the number of PLS components used to compute the regression coefficients.
`P`	the (p x max(`ncomp`)) matrix containing the X-loadings.
`Q`	the (m x max(`ncomp`)) matrix containing the Y-loadings.
`T`	the (ntrain x max(`ncomp`)) matrix containing the X-scores (latent components).
`R`	the (p x max(`ncomp`)) matrix containing the weights used to construct the latent components.
`meanX`	the p-vector containing the means of the columns of `Xtrain`.
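As a quick check of the components listed above, the following sketch fits the model on small simulated data and prints the dimensions of each returned object. The simulated data and the names `ntrain`, `ntest`, `p`, `m` and `fit` are assumptions for illustration; the call is skipped if plsgenomics is not installed.

```r
# Illustrative sketch only; simulated data, hypothetical object names.
set.seed(3)
ntrain <- 30; ntest <- 5; p <- 6; m <- 2
Xtrain <- matrix(rnorm(ntrain * p), nrow = ntrain)
Ytrain <- matrix(rnorm(ntrain * m), nrow = ntrain)
Xtest  <- matrix(rnorm(ntest * p), nrow = ntest)

fit <- NULL
if (requireNamespace("plsgenomics", quietly = TRUE)) {
  fit <- plsgenomics::pls.regression(Xtrain, Ytrain, Xtest = Xtest,
                                     ncomp = 1:3, unit.weights = TRUE)
  print(names(fit))
  print(dim(fit$B))      # regression coefficients
  print(dim(fit$Ypred))  # predictions for Xtest
  print(dim(fit$T))      # X-scores
  print(length(fit$meanX))
}
```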

Anne-Laure Boulesteix (http://www.ibe.med.uni-muenchen.de/organisation/mitarbeiter/020_professuren/boulesteix/index.html) and Korbinian Strimmer (http://strimmerlab.org/).

Adapted in part from pls.pcr code by R. Wehrens (in a former version of the 'pls' package http://cran.r-project.org/web/packages/pls/index.html).

S. de Jong (1993). SIMPLS: an alternative approach to partial least squares regression, Chemometrics Intell. Lab. Syst. 18, 251–263.

C. J. F. ter Braak and S. de Jong (1998). The objective function of partial least squares regression, Journal of Chemometrics 12, 41–54.

pls.lda, TFA.estimate, pls.regression.cv.

# load plsgenomics library
library(plsgenomics)

# load the Ecoli data
data(Ecoli)

# perform pls regression
# with unit latent components
pls.regression(Xtrain=Ecoli$CONNECdata, Ytrain=Ecoli$GEdata, Xtest=Ecoli$CONNECdata,
               ncomp=1:3, unit.weights=FALSE)

# with unit weight vectors
pls.regression(Xtrain=Ecoli$CONNECdata, Ytrain=Ecoli$GEdata, Xtest=Ecoli$CONNECdata,
               ncomp=1:3, unit.weights=TRUE)