pls | R Documentation |
pls
is used to calibrate, validate and apply a partial least squares (PLS)
regression model.
pls(
x,
y,
ncomp = min(nrow(x) - 1, ncol(x), 20),
center = TRUE,
scale = FALSE,
cv = NULL,
exclcols = NULL,
exclrows = NULL,
x.test = NULL,
y.test = NULL,
method = "simpls",
info = "",
ncomp.selcrit = "min",
lim.type = "ddmoments",
alpha = 0.05,
gamma = 0.01,
cv.scope = "local"
)
x |
matrix with predictors. |
y |
matrix with responses. |
ncomp |
maximum number of components to calculate. |
center |
logical, whether to center predictors and response values. |
scale |
logical, whether to scale (standardize) predictors and response values. |
cv |
cross-validation settings (see details). |
exclcols |
columns of x to be excluded from calculations (numbers, names or vector with logical values) |
exclrows |
rows to be excluded from calculations (numbers, names or vector with logical values) |
x.test |
matrix with predictors for test set. |
y.test |
matrix with responses for test set. |
method |
algorithm for computing PLS model (only 'simpls' is supported so far) |
info |
short text with information about the model. |
ncomp.selcrit |
criterion for selecting optimal number of components ( |
lim.type |
which method to use for calculation of critical limits for residual distances (see details) |
alpha |
significance level for extreme limits for T2 and Q distances. |
gamma |
significance level for outlier limits for T2 and Q distances. |
cv.scope |
scope for center/scale operations inside the CV loop: 'global' uses the globally computed mean and std, 'local' recomputes them for each local calibration set. |
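To make the two cv.scope options more concrete, the base R sketch below standardizes one hold-out segment either with statistics from all calibration rows ('global') or with statistics recomputed on the local calibration subset ('local'). The data, the segment indices and all variable names are made up for illustration; this is not the package's internal code.

```r
set.seed(1)
x <- matrix(rnorm(40, mean = 5, sd = 2), nrow = 10, ncol = 4)

# hypothetical hold-out segment for one cross-validation iteration
val.ind <- c(2, 5, 8)
x.cal <- x[-val.ind, , drop = FALSE]
x.val <- x[val.ind, , drop = FALSE]

# 'global': mean and std computed once from all calibration rows
x.val.global <- scale(x.val, center = colMeans(x), scale = apply(x, 2, sd))

# 'local': mean and std recomputed from the local calibration subset only
x.val.local <- scale(x.val, center = colMeans(x.cal), scale = apply(x.cal, 2, sd))

# largest difference between the two versions of the hold-out rows
max(abs(x.val.global - x.val.local))
```

With 'local' the hold-out rows never contribute to the centering and scaling statistics, at the cost of recomputing them in every iteration.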
So far only the SIMPLS method [1] is available. The implementation works with both one and multiple response variables.
Like in pca, pls uses the number of components (ncomp) as the minimum of the number of objects - 1, the number of x variables and the default or provided value. Regression coefficients, predictions and other results are calculated for each set of components from 1 to ncomp: 1, 1:2, 1:3, etc. The optimal number of components (ncomp.selected) is found using the first local minimum, but it can also be forced to a user-defined value using the function selectCompNum.pls. The selected optimal number of components is used for all default operations: predictions, plots, etc.
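The two rules in this paragraph can be sketched in base R. The dimensions and the error values are invented, the helper first.local.min is hypothetical, and the sketch assumes the criterion is a cross-validation error per component; the package's own handling of ties and edge cases may differ.

```r
# hypothetical data dimensions and user-supplied maximum
nobj <- 15
nvar <- 200
ncomp.user <- 20

# effective number of components: limited here by the number of objects - 1
ncomp <- min(nobj - 1, nvar, ncomp.user)

# made-up cross-validation errors for models with 1, 1:2, ..., 1:6 components
rmsecv <- c(0.90, 0.55, 0.41, 0.44, 0.39, 0.40)

# first local minimum: the first position whose successor has a larger error;
# if the error never increases, fall back to the last position
first.local.min <- function(err) {
  worse <- which(diff(err) > 0)
  if (length(worse) == 0) length(err) else worse[1]
}

ncomp.selected <- first.local.min(rmsecv)
```

Here the third component is selected even though the fifth has a slightly lower error, because the search stops at the first increase.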
Cross-validation settings, cv, can be a number or a list. If cv is a number, it will be used as the number of segments for random cross-validation (if cv = 1, full cross-validation will be performed). If it is a list, the following syntax can be used: cv = list("rand", nseg, nrep) for random repeated cross-validation with nseg segments and nrep repetitions, or cv = list("ven", nseg) for systematic splits into nseg segments ('venetian blinds').
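As an illustration of what these settings mean, the base R sketch below assigns objects to segments for each variant. It is only a sketch of the general idea: the variable names are made up, and how the package itself assigns objects to segments (for example, whether it orders them first) may differ.

```r
nobj <- 10
nseg <- 4

# cv = list("ven", nseg): systematic split, objects 1, 5, 9 go to segment 1, etc.
seg.ven <- (seq_len(nobj) - 1) %% nseg + 1

# cv = list("rand", nseg, nrep): a fresh random split for every repetition,
# one column of segment labels per repetition
set.seed(7)
nrep <- 3
seg.rand <- replicate(nrep, sample(rep(seq_len(nseg), length.out = nobj)))

# cv = 1 (full cross-validation): every object forms its own segment
seg.full <- seq_len(nobj)

# objects grouped by venetian blinds segment
split(seq_len(nobj), seg.ven)
```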
Calculation of confidence intervals and p-values for regression coefficients can be done based on Jack-Knifing resampling. This is done automatically if cross-validation is used. However, it is recommended to use at least 10 segments for a stable JK result. See help for regcoeffs objects for more details.
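The idea behind Jack-Knifing can be sketched with a generic delete-one-segment jackknife in base R. The example uses ordinary least squares on simulated data instead of PLS, and a textbook variance formula; it is an assumption-laden illustration and not necessarily the exact computation used for regcoeffs objects.

```r
set.seed(42)
n <- 60
x <- cbind(x1 = rnorm(n), x2 = rnorm(n))
y <- 2 * x[, 1] - 1 * x[, 2] + rnorm(n, sd = 0.3)

# 10 random segments, as recommended for a stable result
nseg <- 10
seg <- sample(rep(seq_len(nseg), length.out = n))

# slope coefficients from each fit with one segment left out (nseg x 2 matrix)
b.jk <- t(sapply(seq_len(nseg), function(k) {
  keep <- seg != k
  lm.fit(cbind(1, x[keep, ]), y[keep])$coefficients[-1]
}))

# jackknife standard error from the spread of the leave-one-segment-out estimates
b.mean <- colMeans(b.jk)
se <- sqrt((nseg - 1) / nseg * colSums(sweep(b.jk, 2, b.mean)^2))

# confidence intervals for significance level alpha = 0.05
t.crit <- qt(1 - 0.05 / 2, df = nseg - 1)
ci <- cbind(lower = b.mean - t.crit * se, upper = b.mean + t.crit * se)
ci
```

With only a few segments the spread is estimated from very few coefficient vectors, which is why a larger number of segments gives more stable intervals.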
Returns an object of pls class with the following fields:
ncomp |
number of components included in the model. |
ncomp.selected |
selected (optimal) number of components. |
xcenter |
vector with values used to center the predictors (x). |
ycenter |
vector with values used to center the responses (y). |
xscale |
vector with values used to scale the predictors (x). |
yscale |
vector with values used to scale the responses (y). |
xloadings |
matrix with loading values for x decomposition. |
yloadings |
matrix with loading values for y decomposition. |
xeigenvals |
vector with eigenvalues of components (variance of x-scores). |
yeigenvals |
vector with eigenvalues of components (variance of y-scores). |
weights |
matrix with PLS weights. |
coeffs |
object of class |
info |
information about the model, provided by the user when building the model. |
cv |
information about the cross-validation method used (if any). |
res |
a list with result objects (e.g. calibration, cv, etc.) |
Sergey Kucheryavskiy (svkucheryavski@gmail.com)
1. S. de Jong, Chemometrics and Intelligent Laboratory Systems, 18 (1993), 251-263.
2. Tarja Rajalahti et al., Chemometrics and Intelligent Laboratory Systems, 95 (2009), 35-48.
3. Il-Gyo Chong, Chi-Hyuck Jun, Chemometrics and Intelligent Laboratory Systems, 78 (2005), 103-112.
Main methods for pls
objects:
print | prints information about a pls object. |
summary.pls | shows performance statistics for the model. |
plot.pls | shows plot overview of the model. |
pls.simpls | implementation of SIMPLS algorithm. |
predict.pls | applies PLS model to new data. |
selectCompNum.pls | sets the optimal number of components in the model. |
setDistanceLimits.pls | allows changing parameters for critical limits. |
categorize.pls | categorizes data rows similar to categorize.pca. |
selratio | computes matrix with selectivity ratio values. |
vipscores | computes matrix with VIP scores values. |
Plotting methods for pls
objects:
plotXScores.pls | shows scores plot for x decomposition. |
plotXYScores.pls | shows scores plot for x and y decomposition. |
plotXLoadings.pls | shows loadings plot for x decomposition. |
plotXYLoadings.pls | shows loadings plot for x and y decomposition. |
plotXVariance.pls | shows explained variance plot for x decomposition. |
plotYVariance.pls | shows explained variance plot for y decomposition. |
plotXCumVariance.pls | shows cumulative explained variance plot for x decomposition. |
plotYCumVariance.pls | shows cumulative explained variance plot for y decomposition. |
plotXResiduals.pls | shows distance/residuals plot for x decomposition. |
plotXYResiduals.pls | shows joint distance plot for x and y decomposition. |
plotWeights.pls | shows plot with weights. |
plotSelectivityRatio.pls | shows plot with selectivity ratio values. |
plotVIPScores.pls | shows plot with VIP scores values. |
Methods inherited from regmodel
object (parent class for pls
):
plotPredictions.regmodel | shows predicted vs. measured plot. |
plotRMSE.regmodel | shows RMSE plot. |
plotRMSERatio.regmodel | shows plot for ratio RMSECV/RMSEC values. |
plotYResiduals.regmodel | shows residuals plot for y values. |
getRegcoeffs.regmodel | returns matrix with regression coefficients. |
Most of the methods for plotting data (except loadings and regression coefficients) are also available for PLS results (plsres) objects. There is also a randomization test for PLS-regression (randtest) and an implementation of the interval PLS algorithm for variable selection (ipls).
### Examples of using PLS model class
library(mdatools)
## 1. Make a PLS model for concentration of first component
## using full-cross validation and automatic detection of
## optimal number of components and show an overview
data(simdata)
x = simdata$spectra.c
y = simdata$conc.c[, 1]
model = pls(x, y, ncomp = 8, cv = 1)
summary(model)
plot(model)
## 2. Make a PLS model for concentration of first component
## using test set and 10 segment cross-validation and show overview
data(simdata)
x = simdata$spectra.c
y = simdata$conc.c[, 1]
x.t = simdata$spectra.t
y.t = simdata$conc.t[, 1]
model = pls(x, y, ncomp = 8, cv = 10, x.test = x.t, y.test = y.t)
model = selectCompNum(model, 2)
summary(model)
plot(model)
## 3. Make a PLS model for concentration of first component
## using only test set validation and show overview
data(simdata)
x = simdata$spectra.c
y = simdata$conc.c[, 1]
x.t = simdata$spectra.t
y.t = simdata$conc.t[, 1]
model = pls(x, y, ncomp = 6, x.test = x.t, y.test = y.t)
model = selectCompNum(model, 2)
summary(model)
plot(model)
## 4. Show variance and error plots for a PLS model
par(mfrow = c(2, 2))
plotXCumVariance(model, type = 'h')
plotYCumVariance(model, type = 'b', show.labels = TRUE, legend.position = 'bottomright')
plotRMSE(model)
plotRMSE(model, type = 'h', show.labels = TRUE)
par(mfrow = c(1, 1))
## 5. Show scores plots for a PLS model
par(mfrow = c(2, 2))
plotXScores(model)
plotXScores(model, comp = c(1, 3), show.labels = TRUE)
plotXYScores(model)
plotXYScores(model, comp = 2, show.labels = TRUE)
par(mfrow = c(1, 1))
## 6. Show loadings and coefficients plots for a PLS model
par(mfrow = c(2, 2))
plotXLoadings(model)
plotXLoadings(model, comp = c(1, 2), type = 'l')
plotXYLoadings(model, comp = c(1, 2), legend.position = 'topleft')
plotRegcoeffs(model)
par(mfrow = c(1, 1))
## 7. Show predictions and residuals plots for a PLS model
par(mfrow = c(2, 2))
plotXResiduals(model, show.labels = TRUE)
plotYResiduals(model, show.labels = TRUE)
plotPredictions(model)
plotPredictions(model, ncomp = 4, xlab = 'C, reference', ylab = 'C, predictions')
par(mfrow = c(1, 1))
## 8. Selectivity ratio and VIP scores plots
par(mfrow = c(2, 2))
plotSelectivityRatio(model)
plotSelectivityRatio(model, ncomp = 1)
plotVIPScores(model)
par(mfrow = c(1, 1))
## 9. Variable selection with selectivity ratio
selratio = getSelectivityRatio(model)
selvar = !(selratio < 8)
xsel = x[, selvar]
modelsel = pls(xsel, y, ncomp = 6, cv = 1)
modelsel = selectCompNum(modelsel, 3)
summary(model)
summary(modelsel)
## 10. Calculate average spectrum and show the selected variables
i = 1:ncol(x)
ms = apply(x, 2, mean)
par(mfrow = c(2, 2))
plot(i, ms, type = 'p', pch = 16, col = 'red', main = 'Original variables')
plotPredictions(model)
plot(i, ms, type = 'p', pch = 16, col = 'lightgray', main = 'Selected variables')
points(i[selvar], ms[selvar], col = 'red', pch = 16)
plotPredictions(modelsel)
par(mfrow = c(1, 1))