sgPLSda: Sparse Group Partial Least Squares Discriminant...
In sgPLS: Sparse Group Partial Least Square Methods

sgPLSda

R Documentation

Sparse Group Partial Least Squares Discriminant Analysis (sgPLS-DA)

Description

Function to perform sparse group Partial Least Squares to classify samples (supervised analysis) and select variables.

Usage

sgPLSda(X, Y, ncomp = 2, keepX = rep(ncol(X), ncomp),
        max.iter = 500, tol = 1e-06, ind.block.x,
        alpha.x, upper.lambda = 10 ^ 5)

Arguments

`X`	numeric matrix of predictors. `NA`s are allowed.
`Y`	a factor or a class vector for the discrete outcome.
`ncomp`	the number of components to include in the model (see Details).
`keepX`	numeric vector of length `ncomp`, the number of variables to keep in `X`-loadings. By default all variables are kept in the model.
`max.iter`	integer, the maximum number of iterations.
`tol`	a positive real, the tolerance used in the iterative algorithm.
`ind.block.x`	a vector of integers describing the grouping of the `X` variables (see the example in the Details section).
`alpha.x`	the mixing parameter (a value between 0 and 1) controlling the sparsity within groups for the `X` dataset.
`upper.lambda`	a large value giving the upper bound of the interval of lambda values searched for the tuning parameter (lambda) that corresponds to a non-zero group of variables. By default `upper.lambda = 10 ^ 5`.
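
To give a feel for what a mixing parameter such as `alpha.x` does, the sketch below evaluates a penalty of the usual sparse group lasso form, in which `alpha` balances a group-wise term against an element-wise (lasso) term. This is an illustration only: `sgl_penalty`, `u`, `group` and `lambda` are hypothetical names, and the exact penalty used internally by `sgPLSda` is assumed, not taken from the package source.

```r
# Hypothetical helper, not part of sgPLS: evaluates a penalty of the
# sparse group lasso form for a loading vector u.
sgl_penalty <- function(u, group, lambda, alpha) {
  # Group term: Euclidean norm of each group's loadings,
  # weighted by the square root of the group size
  group_term <- sum(tapply(u, group, function(v) sqrt(length(v)) * sqrt(sum(v^2))))
  # Element-wise term: sum of absolute loadings
  lasso_term <- sum(abs(u))
  lambda * ((1 - alpha) * group_term + alpha * lasso_term)
}

u     <- c(0.5, -0.5, 0, 0, 1)  # toy loading vector
group <- c(1, 1, 2, 2, 3)       # toy group labels

sgl_penalty(u, group, lambda = 1, alpha = 0)    # group term only
sgl_penalty(u, group, lambda = 1, alpha = 1)    # element-wise term only
sgl_penalty(u, group, lambda = 1, alpha = 0.5)  # equal mix of both
```

Under this form, values of `alpha` near 1 put most of the weight on sparsity of individual variables, while values near 0 put it on selecting whole groups.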

Details

The sgPLSda function fits sgPLS models with 1, ..., ncomp components to the factor or class vector Y. The appropriate indicator (dummy) matrix is created from Y.
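
As a rough illustration of what an indicator (dummy) matrix looks like, the base R sketch below builds one from a toy factor. The data are made up, and the construction shown here is an assumption for illustration; the package may build its matrix differently.

```r
# Toy class vector (hypothetical data, for illustration only)
Y <- factor(c("a", "b", "a", "c"))

# One column per class level; an entry is 1 when the sample
# belongs to that class and 0 otherwise
ind.mat <- sapply(levels(Y), function(l) as.numeric(Y == l))
rownames(ind.mat) <- seq_along(Y)

ind.mat
```

Each row contains exactly one 1, marking the class of the corresponding sample.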

ind.block.x <- c(3, 10, 15) means that X is structured into 4 groups: X1 to X3, X4 to X10, X11 to X15, and X16 to Xp, where p is the number of variables in the X matrix.
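
The grouping described above can be checked with a few lines of base R. This is a sketch only: `p = 20` is a hypothetical number of columns, and `group` and `group.size` are illustrative names, not objects returned by the package.

```r
ind.block.x <- c(3, 10, 15)
p <- 20  # hypothetical number of columns in X

# Group label of each variable: one plus the number of break points
# lying strictly below the variable's index
group <- findInterval(seq_len(p) - 1, ind.block.x) + 1

# Number of variables in each group
group.size <- diff(c(0, ind.block.x, p))

# Variable indices belonging to each of the 4 groups
split(seq_len(p), group)
```

With these settings the last group runs from X16 to X20, since each break point in `ind.block.x` is the index of the last variable of its group.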

Value

sgPLSda returns an object of class "sPLSda", a list that contains the following components:

`X`	the centered and standardized original predictor matrix.
`Y`	the centered and standardized indicator response vector or matrix.
`ind.mat`	the indicator matrix.
`ncomp`	the number of components included in the model.
`keepX`	number of `X` variables kept in the model on each component.
`mat.c`	matrix of coefficients to be used internally by `predict`.
`variates`	list containing the variates.
`loadings`	list containing the estimated loadings for the `X` and `Y` variates.
`names`	list containing the names to be used for individuals and variables.
`tol`	the tolerance used in the iterative algorithm, used for subsequent S3 methods.
`max.iter`	the maximum number of iterations, used for subsequent S3 methods.
`iter`	the number of iterations of the algorithm for each component.
`ind.block.x`	a vector of integers describing the grouping of the `X` variables.
`alpha.x`	the mixing parameter related to the sparsity within groups for the `X` dataset.
`upper.lambda`	the upper bound of the interval of lambda values searched for the tuning parameter (lambda) that corresponds to a non-zero group of variables.

Author(s)

Benoit Liquet and Pierre Lafaye de Micheaux.

References

Liquet, B., Lafaye de Micheaux, P., Hejblum, B. and Thiebaut, R. (2016). A group and Sparse Group Partial Least Square approach applied in Genomics context. Bioinformatics.

On sPLS-DA: Le Cao, K.-A., Boitard, S. and Besse, P. (2011). Sparse PLS Discriminant Analysis: biologically relevant feature selection and graphical displays for multiclass problems. BMC Bioinformatics 12:253.

Examples

library(sgPLS)
data(simuData)
X <- simuData$X
Y <- simuData$Y
ind.block.x <- seq(100, 900, 100)
# Move the second break point to add some noise in the second group
ind.block.x[2] <- 250
model <- sgPLSda(X, Y, ncomp = 3, ind.block.x = ind.block.x,
                 keepX = c(2, 2, 2), alpha.x = c(0.5, 0.5, 0.99))
result.sgPLSda <- select.sgpls(model)
result.sgPLSda$group.size.X
## res <- perf(model, criterion = "all", validation = "loo")
## res$error.rate

sgPLS documentation built on Nov. 9, 2025, 1:07 a.m.