plsFit: Partial Least Squares Regression

plsFit R Documentation

Description

Functions to perform partial least squares regression with a formula interface. Bootstrapping can be used. Prediction, residuals, model extraction, plot, print and summary methods are also implemented.

Usage

plsFit(formula, data, subset, ncomp = NULL, na.action, 
method = c("bidiagpls", "wrtpls"), scale = TRUE, n_cores = 2, 
alpha = 0.05, perms = 2000, validation = c("none", "oob", "loo"), 
boots = 1000, model = TRUE, parallel = FALSE,
x = FALSE, y = FALSE, ...) 
## S3 method for class 'mvdareg'
summary(object, ncomp = object$ncomp, digits = 3, ...)

Arguments

formula

a model formula (see below).

data

an optional data frame containing the variables in the model.

subset

an optional vector specifying a subset of observations to be used in the fitting process.

ncomp

the number of components to include in the model (see below).

na.action

a function which indicates what should happen when the data contain NAs. The default is set by the na.action setting of options, and is na.fail if that is unset. The 'factory-fresh' default is na.omit. Another possible value is NULL, meaning no action. The value na.exclude can also be useful.

method

the multivariate regression algorithm to be used.

scale

should scaling to unit variance be used.

n_cores

number of cores to use for parallel processing. The default is 2; the maximum is 4.

alpha

the significance level for wrtpls.

perms

the number of permutations to run for wrtpls.

validation

character. What kind of (internal) validation to use. See below.

boots

number of bootstrap samples when validation = "oob".

model

a logical. If TRUE, the model frame is returned.

parallel

should parallelization be used.

x

a logical. If TRUE, the model matrix is returned.

y

a logical. If TRUE, the response is returned.

object

an object of class "mvdareg", i.e., a fitted model.

digits

the number of decimal places to output with summary.mvdareg.

...

additional arguments, passed to the underlying fit functions, and mvdareg. Currently not in use.
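
As an illustration of the scale argument, scaling to unit variance can be sketched with base R's scale(). This only shows the general idea; the matrix X is made-up data, and the assumption that columns are also mean-centered is ours, not a statement about plsFit's internals.

```r
# Hypothetical predictor matrix with columns on very different scales
X <- cbind(a = c(1, 2, 3, 4), b = c(10, 20, 30, 40))

# Center each column and divide by its standard deviation
Xs <- scale(X, center = TRUE, scale = TRUE)

# After scaling, every column has standard deviation 1,
# so no predictor dominates purely because of its units
apply(Xs, 2, sd)
```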

Details

The function fits a partial least squares (PLS) model with 1, ..., ncomp latent variables. Multi-response models are not supported.

The type of model to fit is specified with the method argument. Currently two options are available, "bidiagpls" and "wrtpls", both of which use the bidiag2 algorithm.

The formula argument should be a symbolic formula of the form response ~ terms, where response is the name of the response vector and terms is the name of one or more predictor matrices, usually separated by +, e.g., y ~ X + Z. See lm for a detailed description. The named variables should exist in the supplied data data frame or in the global environment. The chapter Statistical models in R of the manual An Introduction to R distributed with R is a good reference on formulas in R.
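
How a formula such as y ~ . is resolved against a data frame can be seen with base R's model-frame tools. The sketch below uses a made-up data frame named toy (not part of mvdalab) and only demonstrates standard formula handling, not what plsFit does internally.

```r
# Toy data (hypothetical): one response and two predictors
toy <- data.frame(y  = c(1.2, 2.3, 3.1, 4.8, 5.0),
                  x1 = c(0.5, 1.1, 1.4, 2.2, 2.6),
                  x2 = c(3.0, 2.7, 2.9, 1.8, 1.5))

# "y ~ ." means: use every other column of `data` as a predictor
f  <- y ~ .
mf <- model.frame(f, data = toy)

resp <- model.response(mf)                 # the response vector
X    <- model.matrix(attr(mf, "terms"), mf) # design matrix, incl. intercept

# The dot has been expanded to the remaining column names
attr(attr(mf, "terms"), "term.labels")
colnames(X)
```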

The number of components to fit is specified with the argument ncomp. If this is not supplied, the maximal number of components is used.

Note that if the number of samples is <= 15, oob validation may fail. In that case, fitting the PLS model with validation = "loo" is recommended.

If method = "bidiagpls" and validation = "oob", bootstrap cross-validation is performed. Bootstrap confidence intervals are provided for coefficients, weights, loadings, and y.loadings. The number of bootstrap samples is specified with the argument boots. See mvdaboot for details.
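
The general idea of out-of-bag (oob) bootstrap validation can be sketched in base R. This is a minimal illustration under stated assumptions: lm stands in for the PLS fit, the data frame toy is simulated, and a simple percentile interval is used for the slope, which may differ from the interval types the package computes (see bca.cis and mvdaboot).

```r
set.seed(1)
n   <- 30
toy <- data.frame(x = rnorm(n))
toy$y <- 2 * toy$x + rnorm(n, sd = 0.5)

boots   <- 200
oob_mse <- rep(NA_real_, boots)  # out-of-bag prediction error per resample
slopes  <- rep(NA_real_, boots)  # bootstrap replicates of the coefficient

for (b in seq_len(boots)) {
  in_bag <- sample.int(n, n, replace = TRUE)  # rows drawn with replacement
  oob    <- setdiff(seq_len(n), in_bag)       # rows never drawn
  if (length(oob) == 0) next                  # nothing left to validate on

  fit  <- lm(y ~ x, data = toy[in_bag, ])
  pred <- predict(fit, newdata = toy[oob, , drop = FALSE])

  oob_mse[b] <- mean((toy$y[oob] - pred)^2)
  slopes[b]  <- coef(fit)[["x"]]
}

mean(oob_mse, na.rm = TRUE)                            # average oob error
ci <- quantile(slopes, c(0.025, 0.975), na.rm = TRUE)  # percentile interval
ci
```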

If method = "bidiagpls" and validation = "loo", leave-one-out cross-validation is performed.
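
Leave-one-out cross-validation refits the model n times, each time predicting the single held-out observation. The sketch below shows the technique with lm as a stand-in for the PLS fit and simulated data; the names toy, loo_pred, press and rmsecv are illustrative and are not components of an mvdareg object.

```r
set.seed(2)
n   <- 12
toy <- data.frame(x = rnorm(n))
toy$y <- 1 + 3 * toy$x + rnorm(n, sd = 0.3)

# For each observation i: fit without row i, then predict row i
loo_pred <- vapply(seq_len(n), function(i) {
  fit <- lm(y ~ x, data = toy[-i, ])
  unname(predict(fit, newdata = toy[i, , drop = FALSE]))
}, numeric(1))

press  <- sum((toy$y - loo_pred)^2)  # predicted residual sum of squares
rmsecv <- sqrt(press / n)            # root mean squared error of CV
rmsecv
```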

If method = "bidiagpls" and validation = "none", no cross-validation is performed. Note that the number of components, ncomp, is set to min(nobj - 1, npred).

If method = "wrtpls" and validation = "none", the weight randomization test for selecting the number of components is performed. Note that the number of components, ncomp, is set to min(nobj - 1, npred).
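
The bound min(nobj - 1, npred) mentioned above is simple to compute by hand. The sketch below assumes nobj is the number of observations (rows) and npred the number of predictors (columns), and uses a simulated matrix with more predictors than observations.

```r
# Hypothetical wide data: 10 observations, 25 predictors
X <- matrix(rnorm(10 * 25), nrow = 10, ncol = 25)

nobj  <- nrow(X)  # number of observations
npred <- ncol(X)  # number of predictors

# With fewer observations than predictors, nobj - 1 is the limit
max_ncomp <- min(nobj - 1, npred)
max_ncomp
```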

Value

An object of class mvdareg is returned. The object contains all components returned by the underlying fit function. In addition, it contains the following:

loadings

X loadings

weights

weights

D2.values

bidiag2 matrix

iD2

inverse of bidiag2 matrix

Ymean

mean of response variable

Xmeans

means of predictor variables

coefficients

PLS regression coefficients

y.loadings

y-loadings

scores

X scores

R

orthogonal weights

Y.values

scaled response values

Yactual

actual response values

fitted

fitted values

residuals

residuals

Xdata

X matrix

iPreds

predicted values

y.loadings2

scaled y-loadings

ncomp

number of latent variables

method

PLS algorithm used

scale

scaling used

validation

validation method

call

model call

terms

model terms

model

fitted model

Author(s)

Nelson Lee Afanador (nelson.afanador@mvdalab.com), Thanh Tran (thanh.tran@mvdalab.com)

References

NOTE: This function is adapted from mvr in package pls with extensive modifications by Nelson Lee Afanador and Thanh Tran.

See Also

bidiagpls.fit, mvdaboot, boot.plots, R2s, PE, ap.plot, T2, Xresids, smc, scoresplot, ScoreContrib, sr, loadingsplot, weightsplot, coefsplot, coefficientsplot2D, loadingsplot2D, weightsplot2D, bca.cis, coefficients.boots, loadings.boots, weight.boots, coefficients, loadings, weights, BiPlot, jk.after.boot

Examples

###  PLS MODEL FIT WITH method = 'bidiagpls' and validation = 'oob', i.e. bootstrapping ###
data(Penta)
## Number of bootstraps set to 300 to demonstrate flexibility
## Use a minimum of 1000 (default) for results that support bootstrapping
mod1 <- plsFit(log.RAI ~., scale = TRUE, data = Penta[, -1], method = "bidiagpls",
               ncomp = 2, validation = "oob", boots = 300)
summary(mod1) #Model summary

###  PLS MODEL FIT WITH method = 'bidiagpls' and validation = 'loo', i.e. leave-one-out CV ###
## Not run: 
mod2 <- plsFit(log.RAI ~., scale = TRUE, data = Penta[, -1], method = "bidiagpls",
               ncomp = 2, validation = "loo")
summary(mod2) #Model summary

## End(Not run)

###  PLS MODEL FIT WITH method = 'bidiagpls' and validation = 'none', i.e. no CV is performed ###
## Not run: 
mod3 <- plsFit(log.RAI ~., scale = TRUE, data = Penta[, -1], method = "bidiagpls",
               ncomp = 2, validation = "none")
summary(mod3) #Model summary

## End(Not run)

###  PLS MODEL FIT WITH method = 'wrtpls' and validation = 'none', i.e. WRT-PLS is performed ###
## Not run: 
mod4 <- plsFit(log.RAI ~., scale = TRUE, data = Penta[, -1],
               method = "wrtpls", validation = "none")
summary(mod4) #Model summary
plot.wrtpls(mod4)

## End(Not run)

mvdalab documentation built on Oct. 6, 2022, 1:05 a.m.