# Partial Least Squares Regression

### Description

Functions to perform partial least squares regression with a formula interface. Bootstrapping can be used. Prediction, residuals, model extraction, plot, print and summary methods are also implemented.

### Usage

```
plsFit(formula, ncomp, data, subset, na.action, contr = "contr.niets",
       method = "bidiagpls", scale = TRUE, n_cores = 2,
       validation = c("none", "oob", "loo"), boots = 1000, model = TRUE,
       x = FALSE, y = FALSE, ...)

## S3 method for class 'mvdareg'
summary(object, ncomp = object$ncomp, digits = 3, ...)
```

### Arguments

| Argument | Description |
|---|---|
| `formula` | a model formula (see below). |
| `ncomp` | the number of components to include in the model (see below). |
| `data` | an optional data frame containing the variables in the model. |
| `subset` | an optional vector specifying a subset of observations to be used in the fitting process. |
| `na.action` | a function which indicates what should happen when the data contain `NA`s. |
| `contr` | an optional list. See the |
| `method` | the multivariate regression algorithm to be used. |
| `scale` | a logical. If TRUE, the data are scaled before fitting. |
| `n_cores` | number of cores to use for parallel processing. The default is 2 and the maximum is 4. |
| `validation` | character. What kind of (internal) validation to use. See below. |
| `boots` | number of bootstrap samples when `validation = "oob"`. |
| `model` | a logical. If TRUE, the model frame is returned. |
| `x` | a logical. If TRUE, the model matrix is returned. |
| `y` | a logical. If TRUE, the response is returned. |
| `object` | an object of class `mvdareg`. |
| `digits` | the number of decimal places to output with |
| `...` | additional arguments, passed to the underlying fit functions, and |

### Details

The function fits a partial least squares (PLS) model with 1, ..., `ncomp` latent variables. Multi-response models are not supported.

The type of model to fit is specified with the `method` argument. Currently one PLS algorithm is available: the bidiag2 algorithm (`"bidiagpls"`).

The `formula` argument should be a symbolic formula of the form `response ~ terms`, where `response` is the name of the response vector and `terms` is the name of one or more predictor matrices, usually separated by `+`, e.g., `y ~ X + Z`. See `lm` for a detailed description. The named variables should exist in the supplied `data` data frame or in the global environment. The chapter Statistical models in R of the manual An Introduction to R distributed with R is a good reference on formulas in R.
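As a rough illustration of how R resolves such a formula against a data frame, the sketch below uses only base R (`model.frame`, `model.response`, `model.matrix`). The data frame `toy` and its values are hypothetical, and this shows standard formula handling rather than the internals of `plsFit`, which may treat the intercept column differently.

```r
# Hypothetical toy data; the column names are placeholders for illustration.
toy <- data.frame(
  y  = c(1.2, 2.3, 3.1, 4.8, 5.0),
  x1 = c(0.5, 1.1, 1.9, 2.4, 3.2),
  x2 = c(10, 9, 7, 6, 4)
)

# `y ~ .` means: use every other column of `data` as a predictor.
f  <- y ~ .
mf <- model.frame(f, data = toy)   # resolves the variables named in the formula
tt <- attr(mf, "terms")            # terms object with the dot expanded

response   <- model.response(mf)   # the response vector
predictors <- attr(tt, "term.labels")
X          <- model.matrix(tt, mf) # predictor matrix (base R adds an intercept column)

print(predictors)
print(colnames(X))
```

The same mechanism is what lets the examples on this page write `log.RAI ~ .` instead of listing every predictor column by name.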

The number of components to fit is specified with the argument `ncomp`. If this is not supplied, the maximal number of components is used.

If `validation = "oob"`, bootstrap cross-validation is performed. Bootstrap confidence intervals are provided for `coefficients`, `weights`, `loadings`, and `y.loadings`. The number of bootstrap samples is specified with the argument `boots`. See `mvdaboot` for details. If `validation = "loo"`, leave-one-out cross-validation is performed. If `validation = "none"`, no cross-validation is performed.
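To make the two resampling schemes concrete, here is a minimal base R sketch that uses `lm` as a stand-in for the PLS fit. The simulated data, the variable names, and the way the out-of-bag error is aggregated are assumptions made for illustration; `mvdaboot` may compute its statistics differently.

```r
set.seed(1)  # reproducible simulated data and resamples
n <- 30
d <- data.frame(x = rnorm(n))
d$y <- 2 * d$x + rnorm(n, sd = 0.5)

# Leave-one-out: refit without observation i, then predict observation i.
loo_pred <- vapply(seq_len(n), function(i) {
  fit <- lm(y ~ x, data = d[-i, ])
  unname(predict(fit, newdata = d[i, , drop = FALSE]))
}, numeric(1))
loo_rmse <- sqrt(mean((d$y - loo_pred)^2))

# Out-of-bag bootstrap: fit on a resample drawn with replacement,
# then evaluate on the observations that were not drawn.
boots <- 200  # hypothetical count, kept small for speed
oob_mse <- vapply(seq_len(boots), function(b) {
  idx <- sample.int(n, replace = TRUE)
  oob <- setdiff(seq_len(n), idx)
  fit <- lm(y ~ x, data = d[idx, ])
  mean((d$y[oob] - predict(fit, newdata = d[oob, , drop = FALSE]))^2)
}, numeric(1))
oob_rmse <- sqrt(mean(oob_mse))

print(c(loo = loo_rmse, oob = oob_rmse))
```

Leave-one-out requires exactly `n` refits, while the bootstrap cost scales with `boots`, which is why the number of bootstrap samples is exposed as an argument.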

The argument `contr` defaults to `contr.niets`; `contr.helmert`, `contr.poly`, `contr.sum`, and `contr.treatment` are also supported.
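For orientation, the base R contrast functions named above can be inspected directly. The sketch below prints the coding matrices for a three-level factor; the choice of three levels is arbitrary, and `contr.niets` is not shown because it is specific to this package.

```r
k <- 3  # number of factor levels (arbitrary choice for illustration)

trt  <- contr.treatment(k)  # indicator columns for the non-baseline levels
sumc <- contr.sum(k)        # columns sum to zero across levels
helm <- contr.helmert(k)    # each level compared with the mean of earlier levels

print(trt)
print(sumc)
print(helm)
```

Each function returns a `k` by `k - 1` matrix, so the choice of contrast changes how a factor's levels are coded as predictor columns, not how many columns it contributes.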

### Value

An object of class `mvdareg` is returned. The object contains all components returned by the underlying fit function. In addition, it contains the following:

| Component | Description |
|---|---|
| `loadings` | X loadings |
| `weights` | weights |
| `D2.values` | bidiag2 matrix |
| `iD2` | inverse of bidiag2 matrix |
| `Ymean` | mean of response variable |
| `Xmeans` | mean of predictor variables |
| `coefficients` | PLS regression coefficients |
| `y.loadings` | y-loadings |
| `scores` | X scores |
| `R` | orthogonal weights |
| `Y.values` | scaled response values |
| `Yactual` | actual response values |
| `fitted` | fitted values |
| `residuals` | residuals |
| `Xdata` | X matrix |
| `iPreds` | predicted values |
| `y.loadings2` | scaled y-loadings |
| `ncomp` | number of latent variables |
| `method` | PLS algorithm used |
| `scale` | scaling used |
| `validation` | validation method |
| `call` | model call |
| `terms` | model terms |
| `model` | fitted model |

### Author(s)

Nelson Lee Afanador (nelson.afanador@mvdalab.com), Thanh Tran (thanh.tran@mvdalab.com)

### References

NOTE: This function is adapted from `mvr` in package pls with extensive modifications by Nelson Lee Afanador and Thanh Tran.

### See Also

`bidiagpls.fit`, `mvdaboot`, `boot.plots`, `R2s`, `PE`, `ap.plot`, `T2`, `Xresids`, `smc`, `scoresplot`, `ScoreContrib`, `sr`, `loadingsplot`, `weightsplot`, `coefsplot`, `loadingsplot2D`, `weightsplot2D`, `vip`, `bca.cis`, `coefficients.boots`, `loadings.boots`, `weight.boots`, `coefficients`, `loadings`, `weights`, `BiPlot`, `jk.after.boot`

### Examples

```
### PLS MODEL FIT WITH validation = 'oob', i.e. bootstrapping ###
data(Penta)
## Number of bootstraps set to 500 to demonstrate flexibility
## Use a minimum of 1000 (default) for results that support bootstrapping
mod1 <- plsFit(log.RAI ~ ., scale = TRUE, data = Penta[, -1],
               ncomp = 2, validation = "oob", boots = 500)
summary(mod1)  # Model summary

### PLS MODEL FIT WITH validation = 'loo', i.e. leave-one-out CV ###
mod2 <- plsFit(log.RAI ~ ., scale = TRUE, data = Penta[, -1],
               ncomp = 2, validation = "loo")
summary(mod2)  # Model summary

### PLS MODEL FIT WITH validation = 'none', i.e. no CV ###
mod3 <- plsFit(log.RAI ~ ., scale = TRUE, data = Penta[, -1],
               ncomp = 2, contr = "contr.niets", validation = "none")
summary(mod3)  # Model summary
```