gPLSda: Group Sparse Partial Least Squares Discriminant Analysis...

In sgPLS: Sparse Group Partial Least Square Methods

Description

Function to perform group Partial Least Squares to classify samples (supervised analysis) and select variables.

Usage

```r
gPLSda(X, Y, ncomp = 2, keepX = rep(ncol(X), ncomp),
       max.iter = 500, tol = 1e-06, ind.block.x)
```

Arguments

| Argument | Description |
|---|---|
| `X` | numeric matrix of predictors. `NA`s are allowed. |
| `Y` | a factor or a class vector for the discrete outcome. |
| `ncomp` | the number of components to include in the model (see Details). |
| `keepX` | numeric vector of length `ncomp`, the number of variables to keep in X-loadings. By default all variables are kept in the model. |
| `max.iter` | integer, the maximum number of iterations. |
| `tol` | a positive real, the tolerance used in the iterative algorithm. |
| `ind.block.x` | a vector of integers describing the grouping of the X-variables (see the example in the Details section). |

Details

The `gPLSda` function fits gPLS models with 1, …, `ncomp` components to the factor or class vector `Y`. The appropriate indicator (dummy) matrix is created from `Y`.
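
To make the indicator (dummy) coding concrete, here is a minimal base R sketch. It is an illustration only, not the package's internal code; the helper name `make_indicator` is hypothetical and the toy `Y` is made up for the example.

```r
# Hypothetical helper (not part of sgPLS): build a 0/1 indicator matrix
# with one column per class level, to illustrate dummy coding of Y.
make_indicator <- function(Y) {
  Y <- as.factor(Y)
  # Compare every observation with every level: n x K logical matrix
  ind <- outer(as.character(Y), levels(Y), FUN = "==")
  ind <- ind * 1L                      # logical -> integer 0/1
  dimnames(ind) <- list(NULL, levels(Y))
  ind
}

# Toy outcome with three classes (illustrative data only)
Y <- factor(c("a", "b", "a", "c"))
ind.mat <- make_indicator(Y)
ind.mat
```

Each row contains a single 1 in the column of that sample's class, so a factor with K levels becomes an n × K response matrix.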

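`ind.block.x <- c(3,10,15)` means that `X` is structured into 4 groups: X1 to X3, X4 to X10, X11 to X15, and X16 to Xp, where p is the number of variables in the `X` matrix. Each entry gives the index of the last variable in a group; the final group runs to the last column.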
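
The grouping rule can be checked with a short base R sketch. The helper `block_ranges` is hypothetical (it is not part of sgPLS), `p = 20` is an arbitrary value chosen for illustration, and the input checks reflect an assumption that the indices are strictly increasing and smaller than p.

```r
# Hypothetical helper (not part of sgPLS): list the column range of each
# group implied by ind.block.x for an X matrix with p columns.
block_ranges <- function(ind.block.x, p) {
  # Assumed constraints: strictly increasing indices, all below p
  stopifnot(all(diff(ind.block.x) > 0), max(ind.block.x) < p)
  start <- c(1, ind.block.x + 1)   # each group starts after the previous end
  end   <- c(ind.block.x, p)       # the last group runs to column p
  data.frame(group = seq_along(start), start = start, end = end,
             size = end - start + 1)
}

# p = 20 is an illustrative assumption
ranges <- block_ranges(c(3, 10, 15), p = 20)
ranges
```

A vector of length k therefore always defines k + 1 groups.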

Value

`gPLSda` returns an object of class `"gPLSda"`, a list that contains the following components:

| Component | Description |
|---|---|
| `X` | the centered and standardized original predictor matrix. |
| `Y` | the centered and standardized indicator response vector or matrix. |
| `ind.mat` | the indicator matrix. |
| `ncomp` | the number of components included in the model. |
| `keepX` | number of X variables kept in the model on each component. |
| `mat.c` | matrix of coefficients to be used internally by `predict`. |
| `variates` | list containing the variates. |
| `loadings` | list containing the estimated loadings for the `X` and `Y` variates. |
| `names` | list containing the names to be used for individuals and variables. |
| `tol` | the tolerance used in the iterative algorithm, used for subsequent S3 methods. |
| `max.iter` | the maximum number of iterations, used for subsequent S3 methods. |
| `iter` | number of iterations of the algorithm for each component. |
| `ind.block.x` | a vector of integers describing the grouping of the X variables. |

Author(s)

Benoit Liquet and Pierre Lafaye de Micheaux.

References

Liquet, B., Lafaye de Micheaux, P., Hejblum, B. and Thiebaut, R. (2016). A group and Sparse Group Partial Least Square approach applied in Genomics context. Bioinformatics.

On sPLS-DA: Le Cao, K.-A., Boitard, S. and Besse, P. (2011). Sparse PLS Discriminant Analysis: biologically relevant feature selection and graphical displays for multiclass problems. BMC Bioinformatics 12:253.

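See Also

`sPLS`, `summary`, `plotIndiv`, `plotVar`, `cim`, `network`, `predict`, `perf` and http://www.mixOmics.org for more details.

Examples

```r
library(sgPLS)

# Load the simulated data shipped with the package
data(simuData)
X <- simuData$X
Y <- simuData$Y

# Group boundaries: last column index of each group except the final one
ind.block.x <- seq(100, 900, 100)

# Fit a group PLS-DA model with 3 components
model <- gPLSda(X, Y, ncomp = 3, ind.block.x = ind.block.x, keepX = c(2, 2, 2))

# Extract the selected groups and their sizes
result.gPLSda <- select.sgpls(model)
result.gPLSda$group.size.X

# Optional leave-one-out performance assessment
# perf(model, criterion = "all", validation = "loo") -> res
# res$error.rate
```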