pls: Partial Least Squares

Description
Partial least squares (PLS), also called projection to latent structures, performs multivariate regression between a data matrix and a response matrix by decomposing both matrices in a way that explains the maximum amount of covariation between them. It is especially useful when the number of predictors is greater than the number of observations, or when the predictors are highly correlated. Orthogonal partial least squares (OPLS) is also provided.
Usage

# NIPALS algorithm
pls_nipals(x, y, k = 3L, center = TRUE, scale. = FALSE,
    transpose = FALSE, niter = 100L, tol = 1e-5,
    verbose = NA, BPPARAM = bpparam(), ...)

# SIMPLS algorithm
pls_simpls(x, y, k = 3L, center = TRUE, scale. = FALSE,
    transpose = FALSE, method = 1L, retscores = TRUE,
    verbose = NA, BPPARAM = bpparam(), ...)

# Kernel algorithm
pls_kernel(x, y, k = 3L, center = TRUE, scale. = FALSE,
    transpose = FALSE, method = 1L, retscores = TRUE,
    verbose = NA, BPPARAM = bpparam(), ...)

## S3 method for class 'pls'
fitted(object, type = c("response", "class"), ...)

## S3 method for class 'pls'
predict(object, newdata, k,
    type = c("response", "class"), simplify = TRUE, ...)

# O-PLS algorithm
opls_nipals(x, y, k = 3L, center = TRUE, scale. = FALSE,
    transpose = FALSE, niter = 100L, tol = 1e-9, regression = TRUE,
    verbose = NA, BPPARAM = bpparam(), ...)

## S3 method for class 'opls'
coef(object, ...)

## S3 method for class 'opls'
residuals(object, ...)

## S3 method for class 'opls'
fitted(object, type = c("response", "class", "x"), ...)

## S3 method for class 'opls'
predict(object, newdata, k,
    type = c("response", "class", "x"), simplify = TRUE, ...)

# Variable importance in projection
vip(object, type = c("projection", "weights"))
Arguments

x: The data matrix of predictors.

y: The response matrix. (Can also be a factor.)

k: The number of PLS components to use. (Can be a vector for the predict method.)

center: A logical value indicating whether the variables should be shifted to be zero-centered, or a centering vector of length equal to the number of columns of x.

scale.: A logical value indicating whether the variables should be scaled to have unit variance, or a scaling vector of length equal to the number of columns of x.

transpose: A logical value indicating whether x should be considered transposed.

niter: The maximum number of iterations (per component).

tol: The tolerance for convergence (per component).

verbose: Should progress be printed for each iteration?

method: The kernel algorithm to use, where 1 and 2 correspond to the two kernel algorithms described by Dayal and MacGregor (1997).

retscores: Should the scores be computed and returned? This also computes the amount of explained covariance for each component. This is done automatically for NIPALS, but requires additional computation for the kernel algorithms.

regression: For O-PLS, should a 1-component PLS regression be fit to the processed data (for each orthogonal component removed)?

...: Not currently used.

BPPARAM: An optional instance of BiocParallelParam. See documentation for bplapply().

object: An object inheriting from pls or opls.

newdata: An optional data matrix to use for the prediction.

type: The type of prediction, where "response" gives the fitted response matrix and "class" gives the predicted class labels (for discriminant analyses only). For O-PLS methods, "x" gives the processed data matrix with the orthogonal variation removed.

simplify: Should the predictions be simplified from a list to an array (for type = "response") or a data frame (for type = "class") when k is a vector?
Details

These functions implement partial least squares (PLS) using the original NIPALS algorithm by Wold et al. (1983), the SIMPLS algorithm by de Jong (1993), or the kernel algorithms by Dayal and MacGregor (1997). A function for orthogonal partial least squares (OPLS) processing using the NIPALS algorithm by Trygg and Wold (2002) is also provided.
Both regression and classification can be performed. If y is passed as a factor, then partial least squares discriminant analysis (PLS-DA) will be performed, as described by Barker and Rayens (2003).
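For example, a minimal PLS-DA sketch (the toy data and class labels here are invented for illustration):

x <- cbind(
    c(-2.18, 1.84, -0.48, 0.83),
    c(-2.18, -0.16, 1.52, 0.83))
y <- factor(c("a", "a", "b", "b"))  # factor response triggers PLS-DA
fit <- pls_nipals(x, y, k=2)
fitted(fit, type="class")  # predicted class labels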
The SIMPLS algorithm (pls_simpls()) is relatively fast as it does not require the deflation of the data matrix. However, the results will differ slightly from the NIPALS and kernel algorithms for multivariate responses. In these cases, only the first component will be identical. The differences are not meaningful in most cases, but are worth noting.
The kernel algorithms (pls_kernel()) tend to be faster than NIPALS for larger data matrices. The original NIPALS algorithm (pls_nipals()) is the reference implementation. The results from these algorithms are proven to be equivalent for both univariate and multivariate responses.
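As a quick check of this equivalence, the following sketch fits the same univariate response with all three algorithms (assuming the returned pls objects expose their components as list elements, per the Value section below):

x <- cbind(
    c(-2.18, 1.84, -0.48, 0.83),
    c(-2.18, -0.16, 1.52, 0.83))
y <- as.matrix(c(2, 2, 0, -4))
fit1 <- pls_nipals(x, y, k=2)
fit2 <- pls_kernel(x, y, k=2)
fit3 <- pls_simpls(x, y, k=2)
# for a univariate response, all three should agree
all.equal(fit1$coefficients, fit2$coefficients)
all.equal(fit1$coefficients, fit3$coefficients)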
Note that the NIPALS algorithms cannot handle out-of-memory matter_mat and sparse_mat matrices due to the need to deflate the data matrix for each component. In this case, x will be coerced to an in-memory matrix.
Variable importance in projection (VIP) scores, proposed by Wold et al. (1993), measure the influence each variable has on the PLS model. They can be calculated with vip(). Note that non-NIPALS models must be fit with retscores = TRUE for VIP to be calculated. In practice, a VIP score greater than ~1 is a useful criterion for variable selection, although there is no statistical basis for this rule.
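Continuing with x, y, and the fitted model from the sketch above, the ~1 rule of thumb might be applied as follows:

fit <- pls_nipals(x, y, k=2)
v <- vip(fit)     # one VIP score per predictor column
which(v > 1)      # candidate variables under the ~1 heuristic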
Value

An object of class pls, with the following components:
coefficients: The regression coefficients.

projection: The projection weights of the regression, used to calculate the coefficients from the y-loadings or to project the data to the scores.

residuals: The residuals from the regression.

fitted.values: The fitted y matrix.

weights: (Optional) The x-weights of the regression.

loadings: The x-loadings of the latent variables.

scores: (Optional) The x-scores of the latent variables.

y.loadings: The y-loadings of the latent variables.

y.scores: (Optional) The y-scores of the latent variables.

cvar: (Optional) The covariance explained by each component.

Or, an object of class opls, with the following components:

weights: The orthogonal x-weights.

loadings: The orthogonal x-loadings.

scores: The orthogonal x-scores.

ratio: The ratio of the orthogonal weights to the PLS loadings for each component. This provides a measure of how much orthogonal variation is being removed by each component and can be interpreted like a scree plot, similar to PCA.

x: The processed data matrix with the orthogonal variation removed.

regressions: (Optional) The 1-component PLS regressions on the processed data.
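A brief sketch of fitting O-PLS on the same toy data and inspecting these results via the accessor methods (with one orthogonal component removed):

x <- cbind(
    c(-2.18, 1.84, -0.48, 0.83),
    c(-2.18, -0.16, 1.52, 0.83))
y <- as.matrix(c(2, 2, 0, -4))
ofit <- opls_nipals(x, y, k=1)
fitted(ofit, type="x")  # processed data, orthogonal variation removed
coef(ofit)              # coefficients of the 1-component PLS regression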
Author(s)

Kylie A. Bemis
References

S. Wold, H. Martens, and H. Wold. “The multivariate calibration method in chemistry solved by the PLS method.” Proceedings of the Conference on Matrix Pencils, Lecture Notes in Mathematics, Springer-Verlag: Heidelberg, pp. 286-293, 1983.

S. de Jong. “SIMPLS: An alternative approach to partial least squares regression.” Chemometrics and Intelligent Laboratory Systems, vol. 18, issue 3, pp. 251-263, 1993.

B. S. Dayal and J. F. MacGregor. “Improved PLS algorithms.” Journal of Chemometrics, vol. 11, pp. 73-85, 1997.

M. Barker and W. Rayens. “Partial least squares for discrimination.” Journal of Chemometrics, vol. 17, pp. 166-173, 2003.

J. Trygg and S. Wold. “Orthogonal projections to latent structures.” Journal of Chemometrics, vol. 16, issue 3, pp. 119-128, 2002.

S. Wold, A. Johansson, and M. Cocchi. “PLS: Partial least squares projections to latent structures.” 3D QSAR in Drug Design: Theory, Methods and Applications, ESCOM Science Publishers: Leiden, pp. 523-550, 1993.
See Also

prcomp
Examples

register(SerialParam())  # register a serial backend for BiocParallel

x <- cbind(
    c(-2.18, 1.84, -0.48, 0.83),
    c(-2.18, -0.16, 1.52, 0.83))
y <- as.matrix(c(2, 2, 0, -4))

pls_nipals(x, y, k=2)
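Prediction on new observations might look like the following sketch; for illustration it reuses the training matrix as newdata, and passes a vector for k to obtain predictions from both the 1- and 2-component models:

fit <- pls_nipals(x, y, k=2)
# predictions for k = 1 and k = 2, simplified to an array
predict(fit, newdata=x, k=1:2, simplify=TRUE)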