plsda: PLSDA models

plsrda R Documentation

PLSDA models

Description

Discrimination (DA) based on PLS.

The training variable y (univariate class membership) is first transformed to a dummy table containing nclas columns, where nclas is the number of classes present in y. Each column is a dummy variable (0/1). A PLS2 is then fitted on the X-data and the dummy table, returning latent variables (LVs) that are used as predictor (independent) variables in a DA model.
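As an illustration of the dummy-table step described above, here is a minimal base R sketch. The helper `dummy_table` is hypothetical and is not part of rchemo; the package's internal coding (for example the ordering of the columns) may differ.

```r
# Hypothetical helper (not an rchemo function): build a 0/1 dummy table from y.
dummy_table <- function(y) {
  y <- as.character(y)                 # class membership handled as character
  lev <- sort(unique(y))               # class labels present in y
  Y <- outer(y, lev, FUN = "==") * 1   # one 0/1 column per class
  colnames(Y) <- lev
  Y
}

y <- c("a", "b", "c", "b", "a")
Ydummy <- dummy_table(y)
Ydummy   # 5 rows (observations), nclas = 3 columns, exactly one 1 per row
```

Each row contains a single 1, in the column of the class the observation belongs to.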

- plsrda: Usual "PLSDA". A linear regression model predicts the Y-dummy table from the PLS2 LVs. This corresponds to the PLSR2 of the X-data and the Y-dummy table. For a given observation, the final prediction is the class corresponding to the dummy variable for which the prediction is the highest.

- plslda and plsqda: Probabilistic LDA (plslda) or QDA (plsqda) is run over the PLS2 LVs.
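The decision rule stated for plsrda (choose the class whose dummy variable has the highest prediction) can be sketched in base R as follows. The matrix `Ypred` holds made-up values for illustration, not output from plsrda, and the tie-breaking rule used here is an assumption.

```r
# Illustrative predicted dummy values for 3 observations and 3 classes
# (made-up numbers, not produced by rchemo).
Ypred <- matrix(c(0.7, 0.2, 0.1,
                  0.1, 0.3, 0.6,
                  0.4, 0.5, 0.1),
                nrow = 3, byrow = TRUE,
                dimnames = list(NULL, c("a", "b", "c")))

# For each row, keep the class whose predicted dummy value is the highest.
# ties.method = "first" is an arbitrary choice made for this sketch.
pred <- colnames(Ypred)[max.col(Ypred, ties.method = "first")]
pred
```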

Usage


plsrda(X, y, weights = NULL, nlv, 
Xscaling = c("none","pareto","sd")[1], Yscaling = c("none","pareto","sd")[1])

plslda(X, y, weights = NULL, nlv, prior = c("unif", "prop"), 
Xscaling = c("none","pareto","sd")[1], Yscaling = c("none","pareto","sd")[1])

plsqda(X, y, weights = NULL, nlv, prior = c("unif", "prop"), 
Xscaling = c("none","pareto","sd")[1], Yscaling = c("none","pareto","sd")[1])

## S3 method for class 'Plsrda'
predict(object, X, ..., nlv = NULL) 

## S3 method for class 'Plsprobda'
predict(object, X, ..., nlv = NULL) 

Arguments

X

For the main functions: Training X-data (n, p). For the auxiliary functions: New X-data (m, p) to consider.

y

Training class membership (n). Note: If y is a factor, it is replaced by a character vector.

weights

Weights (n) to apply to the training observations for the PLS2. Internally, weights are "normalized" to sum to 1. Defaults to NULL (weights are set to 1 / n).

nlv

The number(s) of LVs to calculate.

prior

The prior probabilities of the classes. Possible values are "unif" (default; probabilities are set equal for all the classes) or "prop" (probabilities are set equal to the observed proportions of the classes in y).

Xscaling

Scaling of the X variables, one of "none" (mean-centering only), "pareto" (mean-centering and pareto scaling) or "sd" (mean-centering and unit variance scaling). If "pareto" or "sd", the uncorrected standard deviation is used.

Yscaling

Scaling of the Y variables, applied once y has been converted to binary variables, one of "none" (mean-centering only), "pareto" (mean-centering and pareto scaling) or "sd" (mean-centering and unit variance scaling). If "pareto" or "sd", the uncorrected standard deviation is used.

object

For the auxiliary functions: A fitted model, output of a call to the main functions.

...

For the auxiliary functions: Optional arguments. Not used.
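The weights, prior and scaling arguments described above can be illustrated with a short base R sketch. All three helpers are hypothetical and are not rchemo functions. The sketch assumes that "uncorrected" means a divisor of n rather than n - 1, that pareto scaling divides by the square root of the standard deviation, and that observations are equally weighted when computing means and standard deviations; the package's internals may differ.

```r
# Hypothetical helpers for illustration only; rchemo's internals may differ.

# weights: NULL means equal weights 1 / n; otherwise rescale to sum to 1.
normalize_weights <- function(weights, n) {
  if (is.null(weights)) rep(1 / n, n) else weights / sum(weights)
}

# prior: "unif" gives equal probabilities for all classes,
# "prop" gives the observed class proportions in y.
class_prior <- function(y, prior = c("unif", "prop")) {
  prior <- match.arg(prior)
  tab <- table(as.character(y))
  if (prior == "unif") {
    setNames(rep(1 / length(tab), length(tab)), names(tab))
  } else {
    setNames(as.numeric(tab) / sum(tab), names(tab))
  }
}

# Xscaling / Yscaling: mean-centering, then optional division by the
# uncorrected standard deviation ("sd") or by its square root ("pareto").
scale_columns <- function(X, scaling = c("none", "pareto", "sd")) {
  scaling <- match.arg(scaling)
  Xc <- sweep(X, 2, colMeans(X), "-")
  sd_uncorrected <- sqrt(colMeans(Xc^2))   # divisor n, not n - 1
  xscales <- switch(scaling,
                    none = rep(1, ncol(X)),
                    pareto = sqrt(sd_uncorrected),
                    sd = sd_uncorrected)
  sweep(Xc, 2, xscales, "/")
}

w <- normalize_weights(c(2, 1, 1), n = 3)                  # 0.5, 0.25, 0.25
pr <- class_prior(c("a", "a", "b", "c"), prior = "prop")   # a = 0.5, b = c = 0.25
set.seed(1)
Xs <- scale_columns(matrix(rnorm(20), ncol = 4), scaling = "sd")
```

With scaling = "sd", each column of Xs has mean 0 and uncorrected variance 1.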

Value

For plsrda, plslda, plsqda:

fm

list with the model:
- T: X-scores matrix
- P: X-loadings matrix
- R: PLS projection matrix (p, nlv)
- W: X-loading weights matrix
- C: Y-loading weights matrix
- TT: X-score normalization factor
- xmeans: centering vector of X (p, 1)
- ymeans: centering vector of Y (q, 1)
- xscales: scaling vector of X (p, 1)
- yscales: scaling vector of Y (q, 1)
- weights: vector of observation weights
- U: intermediate output

lev

classes

ni

number of observations in each class

For predict.Plsrda, predict.Plsprobda:

pred

predicted class for each observation

posterior

calculated probability of belonging to a class for each observation

Note

The first example concerns PLSDA, and the second one concerns PLS LDA. fm are PLS1 models, and zfm are PLS2 models to predict the disjunctive matrix.

See Also

plsr_plsda_allsteps, a function that helps determine the optimal number of latent variables, perform a permutation test, calculate model parameters and predict new observations.

Examples


library(rchemo)

## EXAMPLE OF PLSDA

n <- 50 ; p <- 8
Xtrain <- matrix(rnorm(n * p), ncol = p)
ytrain <- sample(c(1, 4, 10), size = n, replace = TRUE)

Xtest <- Xtrain[1:5, ] ; ytest <- ytrain[1:5]

nlv <- 5
fm <- plsrda(Xtrain, ytrain, Xscaling = "sd", nlv = nlv)
names(fm)

predict(fm, Xtest)
predict(fm, Xtest, nlv = 0:2)$pred

pred <- predict(fm, Xtest)$pred
err(pred, ytest)

zfm <- fm$fm
transform(zfm, Xtest)
transform(zfm, Xtest, nlv = 1)
summary(zfm, Xtrain)
coef(zfm)
coef(zfm, nlv = 0)
coef(zfm, nlv = 2)

## EXAMPLE OF PLS LDA

n <- 50 ; p <- 8
Xtrain <- matrix(rnorm(n * p), ncol = p)
ytrain <- sample(c(1, 4, 10), size = n, replace = TRUE)
Xtest <- Xtrain[1:5, ] ; ytest <- ytrain[1:5]

nlv <- 5
fm <- plslda(Xtrain, ytrain, Xscaling = "sd", nlv = nlv)
predict(fm, Xtest)
predict(fm, Xtest, nlv = 1:2)$pred

zfm <- fm$fm[[1]]
class(zfm)
names(zfm)
summary(zfm, Xtrain)
transform(zfm, Xtest[1:2, ])
coef(zfm)


rchemo documentation built on Sept. 11, 2024, 8:05 p.m.