sgPLSda: Sparse Group Sparse Partial Least Squares Discriminant...


sgPLSda    R Documentation

Sparse Group Sparse Partial Least Squares Discriminant Analysis (sPLS-DA)

Description

Function to perform sparse group Partial Least Squares to classify samples (supervised analysis) and select variables.

Usage

sgPLSda(X, Y, ncomp = 2, keepX = rep(ncol(X), ncomp),
        max.iter = 500, tol = 1e-06, ind.block.x,
        alpha.x, upper.lambda = 10 ^ 5)

Arguments

X

numeric matrix of predictors. NAs are allowed.

Y

a factor or a class vector for the discrete outcome.

ncomp

the number of components to include in the model (see Details).

keepX

numeric vector of length ncomp, the number of variables to keep in X-loadings. By default all variables are kept in the model.

max.iter

integer, the maximum number of iterations.

tol

a positive real, the tolerance used in the iterative algorithm.

ind.block.x

a vector of integers describing the grouping of the X variables (see the example in the Details section).

alpha.x

The mixing parameter (value between 0 and 1) related to the sparsity within group for the X dataset.

upper.lambda

a large value giving the upper bound of the interval of lambda values searched to find the tuning parameter (lambda) corresponding to a non-zero group of variables. By default upper.lambda = 10 ^ 5.
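To make the roles of alpha.x and lambda concrete, a sparse group penalty of the kind used in the sparse group lasso literature can be written as below. This is a sketch under stated assumptions, not a statement of the exact objective implemented by the package: u denotes the X-loading vector, u^{(k)} its sub-vector for group k, p_k the size of group k, and K the number of groups. All of these symbols are notation introduced here for illustration.

```latex
P(u) \;=\; (1 - \alpha_x)\,\lambda \sum_{k=1}^{K} \sqrt{p_k}\,\bigl\lVert u^{(k)} \bigr\rVert_2
      \;+\; \alpha_x\,\lambda\,\lVert u \rVert_1
```

Under this form, alpha.x = 0 leaves only the group-wise term (whole groups are kept or dropped), while values close to 1 put most of the weight on the element-wise term (sparsity within the selected groups).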

Details

The sgPLSda function fits sgPLS models with 1, ..., ncomp components to the factor or class vector Y. The appropriate indicator (dummy) matrix is created internally.
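As an illustration of what an indicator (dummy) matrix looks like, the base R sketch below builds one from a small factor. The labels are made up for this example, and the construction is only one possible way to do it; it is not taken from the package's internal code.

```r
# Hypothetical class labels, for illustration only
Y <- factor(c("a", "b", "a", "c"))

# One column per class level; an entry is 1 when the sample belongs to that class
ind.mat <- sapply(levels(Y), function(l) as.integer(Y == l))

ind.mat
```

Each row contains exactly one 1, in the column of the sample's class.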

ind.block.x <- c(3, 10, 15) means that X is structured into 4 groups: X1 to X3, X4 to X10, X11 to X15, and X16 to Xp, where p is the number of variables in the X matrix.
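The mapping from block boundaries to groups described above can be checked with a few lines of base R. This is a minimal sketch: p = 20 is an assumed number of variables chosen for illustration, and the helper variables group and group.size are hypothetical names that are not part of the package.

```r
p <- 20                        # assumed number of X variables (illustrative)
ind.block.x <- c(3, 10, 15)    # last column index of every group except the final one

# Assign each column index to its group: (0,3], (3,10], (10,15], (15,p]
group <- cut(seq_len(p), breaks = c(0, ind.block.x, p), labels = FALSE)

# Number of variables in each group
group.size <- diff(c(0, ind.block.x, p))

table(group)
```

With these values the four groups contain 3, 7, 5 and 5 variables respectively.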

Value

sgPLSda returns an object of class "sPLSda", a list that contains the following components:

X

the centered and standardized original predictor matrix.

Y

the centered and standardized indicator response vector or matrix.

ind.mat

the indicator matrix.

ncomp

the number of components included in the model.

keepX

number of X variables kept in the model on each component.

mat.c

matrix of coefficients to be used internally by predict.

variates

list containing the variates.

loadings

list containing the estimated loadings for the X and Y variates.

names

list containing the names to be used for individuals and variables.

tol

the tolerance used in the iterative algorithm, stored for use by subsequent S3 methods.

max.iter

the maximum number of iterations, stored for use by subsequent S3 methods.

iter

the number of iterations of the algorithm for each component.

ind.block.x

a vector of integers describing the grouping of the X variables.

alpha.x

The mixing parameter related to the sparsity within group for the X dataset.

upper.lambda

The upper bound of the interval of lambda values searched to find the tuning parameter (lambda) corresponding to a non-zero group of variables.

Author(s)

Benoit Liquet and Pierre Lafaye de Micheaux.

References

Liquet, B., Lafaye de Micheaux, P., Hejblum, B. and Thiebaut, R. (2016). A group and Sparse Group Partial Least Square approach applied in Genomics context. Bioinformatics.

On sPLS-DA: Le Cao, K.-A., Boitard, S. and Besse, P. (2011). Sparse PLS Discriminant Analysis: biologically relevant feature selection and graphical displays for multiclass problems. BMC Bioinformatics 12:253.

See Also

sPLS, summary, plotIndiv, plotVar, cim, network, predict, perf and http://www.mixOmics.org for more details.

Examples

library(sgPLS)

data(simuData)
X <- simuData$X
Y <- simuData$Y

# Group boundaries: the last column index of every group except the final one
ind.block.x <- seq(100, 900, 100)
# Move the second boundary to add some noise in the second group
ind.block.x[2] <- 250

model <- sgPLSda(X, Y, ncomp = 3, ind.block.x = ind.block.x,
                 keepX = c(2, 2, 2), alpha.x = c(0.5, 0.5, 0.99))

# Inspect the sizes of the selected groups
result.sgPLSda <- select.sgpls(model)
result.sgPLSda$group.size.X

## perf(model, criterion = "all", validation = "loo") -> res
## res$error.rate

sgPLS documentation built on Oct. 5, 2023, 5:06 p.m.