plsda: Partial Least Squares Discriminant Analysis


plsda R Documentation

Partial Least Squares Discriminant Analysis

Description

plsda is used to calibrate, validate and apply a partial least squares discriminant analysis (PLS-DA) model.

Usage

plsda(
  x,
  c,
  ncomp = min(nrow(x) - 1, ncol(x), 20),
  center = TRUE,
  scale = FALSE,
  cv = NULL,
  exclcols = NULL,
  exclrows = NULL,
  x.test = NULL,
  c.test = NULL,
  method = "simpls",
  lim.type = "ddmoments",
  alpha = 0.05,
  gamma = 0.01,
  info = "",
  ncomp.selcrit = "min",
  classname = NULL,
  cv.scope = "local"
)

Arguments

x

matrix with predictors.

c

vector with class membership (either a factor with class names/numbers for a multi-class model, or a vector with logical values for a one-class model).

ncomp

maximum number of components to calculate.

center

logical, whether to center predictors and response values.

scale

logical, whether to scale (standardize) predictors and response values.

cv

cross-validation settings (see details).

exclcols

columns of x to be excluded from calculations (numbers, names or vector with logical values).

exclrows

rows of x to be excluded from calculations (numbers, names or vector with logical values).

x.test

matrix with predictors for test set.

c.test

vector with reference class values for test set (same format as calibration values).

method

method for calculating PLS model.

lim.type

method to use for calculating critical limits for residual distances (see details).

alpha

significance level for extreme limits for T2 and Q distances.

gamma

significance level for outlier limits for T2 and Q distances.

info

short text with information about the model.

ncomp.selcrit

criterion for selecting the optimal number of components ('min' for the first local minimum of RMSECV, 'wold' for Wold's rule).

classname

name (label) of the class when PLS-DA is used as a one-class discrimination model. In this case parameter 'c' is expected to be a vector with logical values.

cv.scope

scope for center/scale operations inside the CV loop: 'global' uses the globally computed mean and std, 'local' recomputes them for each local calibration set.
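The one-class setup described for 'c' and 'classname' can be sketched as follows. This is a minimal illustration, assuming the iris calibration split used in the Examples section; the object names x.cal, c.vir and m.vir are chosen here for illustration only.

```r
library(mdatools)

# calibration set: every second row of iris, as in the Examples section
x.cal = iris[seq(1, nrow(iris), 2), 1:4]

# logical class vector: TRUE for the target class, FALSE for all other rows
c.vir = iris[seq(1, nrow(iris), 2), 5] == "virginica"

# classname provides the label for the class marked as TRUE
m.vir = plsda(x.cal, c.vir, ncomp = 3, cv = 1, classname = "virginica")
summary(m.vir)
```

With a logical 'c' the model has a single class, so there is no need to pick a class with the nc argument in summaries and plots.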

Details

The plsda class is based on pls, with extra functions and plots covering classification functionality. All plots for pls can be used. E.g., if you want to see the real predicted values (y in PLS) instead of classes, use plotPredictions.pls(model) instead of plotPredictions(model).

Cross-validation settings, cv, can be a number or a list. If cv is a number, it is used as the number of segments for random cross-validation (if cv = 1, full cross-validation is performed). If it is a list, the following syntax can be used: cv = list('rand', nseg, nrep) for random repeated cross-validation with nseg segments and nrep repetitions, or cv = list('ven', nseg) for systematic splits into nseg segments ('venetian blinds').
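The forms of the cv argument listed above can be compared side by side. This is a short sketch on the iris calibration split from the Examples section; the segment and repetition counts (5 and 4) and the object names are arbitrary choices for illustration.

```r
library(mdatools)

x.cal = iris[seq(1, nrow(iris), 2), 1:4]
c.cal = iris[seq(1, nrow(iris), 2), 5]

# fix the random seed so the random splits are reproducible
set.seed(42)

# full cross-validation
m.full = plsda(x.cal, c.cal, ncomp = 3, cv = 1)

# random cross-validation with 5 segments
m.rand = plsda(x.cal, c.cal, ncomp = 3, cv = 5)

# random cross-validation with 5 segments, repeated 4 times
m.rep = plsda(x.cal, c.cal, ncomp = 3, cv = list("rand", 5, 4))

# systematic splits into 5 segments (venetian blinds),
# reusing the globally computed mean and std inside the CV loop
m.ven = plsda(x.cal, c.cal, ncomp = 3, cv = list("ven", 5), cv.scope = "global")
```

Each call stores its cross-validation results in the cvres field of the returned model.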

Calculation of confidence intervals and p-values for regression coefficients is available only by jack-knifing so far. See help for regcoeffs objects for details.

Value

Returns an object of plsda class with the following fields (most inherited from class pls):

ncomp

number of components included in the model.

ncomp.selected

selected (optimal) number of components.

xloadings

matrix with loading values for x decomposition.

yloadings

matrix with loading values for y (c) decomposition.

weights

matrix with PLS weights.

coeffs

matrix with regression coefficients calculated for each component.

info

information about the model, provided by the user when building the model.

calres

an object of class plsdares with PLS-DA results for the calibration data.

testres

an object of class plsdares with PLS-DA results for the test data, if it was provided.

cvres

an object of class plsdares with PLS-DA results for cross-validation, if this option was chosen.
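The fields listed above are accessed with the $ operator on the returned object. A brief sketch, assuming the same iris model as in the Examples section:

```r
library(mdatools)

x.cal = iris[seq(1, nrow(iris), 2), 1:4]
c.cal = iris[seq(1, nrow(iris), 2), 5]
model = plsda(x.cal, c.cal, ncomp = 3, cv = 1, info = "IRIS data example")

# number of components in the model and the selected (optimal) number
model$ncomp
model$ncomp.selected

# text supplied via the info argument
model$info

# result objects for calibration and cross-validation
class(model$calres)
class(model$cvres)
```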

Author(s)

Sergey Kucheryavskiy (svkucheryavski@gmail.com)

See Also

Specific methods for plsda class:

print.plsda prints information about a plsda object.
summary.plsda shows performance statistics for the model.
plot.plsda shows an overview plot of the model.
predict.plsda applies a PLS-DA model to new data.

Methods inherited from the classmodel class:

plotPredictions.classmodel shows plot with predicted values.
plotSensitivity.classmodel shows sensitivity plot.
plotSpecificity.classmodel shows specificity plot.
plotMisclassified.classmodel shows misclassified ratio plot.

See also methods for class pls.
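A test set can be supplied either when the model is built (arguments x.test and c.test) or later through predict. The sketch below shows both routes, assuming iris is split into odd rows for calibration and even rows for testing; the split and the object names are illustrative only.

```r
library(mdatools)

# odd rows for calibration, even rows for testing
cal.ind = seq(1, nrow(iris), 2)
x.cal = iris[cal.ind, 1:4]
c.cal = iris[cal.ind, 5]
x.test = iris[-cal.ind, 1:4]
c.test = iris[-cal.ind, 5]

# route 1: pass the test set when building the model
model = plsda(x.cal, c.cal, ncomp = 3, cv = 1, x.test = x.test, c.test = c.test)
summary(model$testres)

# route 2: apply an existing model to new data, with reference classes
res = predict(model, x.test, c.test)
summary(res)
```

Reference classes are optional in predict; without them the result contains predictions but no performance statistics.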

Examples

### Examples for PLS-DA model class

library(mdatools)

## 1. Make a PLS-DA model with full cross-validation and show model overview

# make a calibration set from iris data (3 classes)
# use names of classes as class vector
x.cal = iris[seq(1, nrow(iris), 2), 1:4]
c.cal = iris[seq(1, nrow(iris), 2), 5]

model = plsda(x.cal, c.cal, ncomp = 3, cv = 1, info = 'IRIS data example')
model = selectCompNum(model, 1)

# show summary and basic model plots
# misclassification will be shown only for first class
summary(model)
plot(model)

# summary and model plots for second class
summary(model, nc = 2)
plot(model, nc = 2)

# summary and model plot for specific class and number of components
summary(model, nc = 3, ncomp = 3)
plot(model, nc = 3, ncomp = 3)

## 2. Show performance plots for a model
par(mfrow = c(2, 2))
plotSpecificity(model)
plotSensitivity(model)
plotMisclassified(model)
plotMisclassified(model, nc = 2)
par(mfrow = c(1, 1))

## 3. Show both class and y values predictions
par(mfrow = c(2, 2))
plotPredictions(model)
plotPredictions(model, res = "cal", ncomp = 2, nc = 2)
plotPredictions(structure(model, class = "regmodel"))
plotPredictions(structure(model, class = "regmodel"), ncomp = 2, ny = 2)
par(mfrow = c(1, 1))

## 4. All plots from ordinary PLS can be used, e.g.:
par(mfrow = c(2, 2))
plotXYScores(model)
plotYVariance(model)
plotXResiduals(model)
plotRegcoeffs(model, ny = 2)
par(mfrow = c(1, 1))


mdatools documentation built on Sept. 11, 2024, 7:59 p.m.