spca: Sparse Principal Components Analysis


Description

Performs sparse principal components analysis (sPCA), which carries out variable selection through singular value decomposition.

The calculation applies singular value decomposition to the (centered and scaled) data matrix and uses LASSO to induce sparsity in the loading vectors.
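The combination of SVD and a LASSO-type penalty can be sketched in base R. The block below is a minimal illustration of one common approach (alternating updates with soft-thresholding, where the threshold is picked so that a fixed number of loadings stay non-zero). It is not the package's implementation, and soft_threshold and sparse_component are hypothetical helper names introduced only for this sketch.

```r
# Soft-thresholding: shrink every entry towards zero by `lambda`
# (this is how a LASSO penalty sets small loadings exactly to zero).
soft_threshold <- function(v, lambda) sign(v) * pmax(abs(v) - lambda, 0)

# Hypothetical helper: one sparse component with `keep` non-zero loadings.
sparse_component <- function(X, keep, max.iter = 500, tol = 1e-06) {
  p <- ncol(X)
  s <- svd(X, nu = 1, nv = 1)
  u <- s$u[, 1]                  # unit-norm left singular vector
  v <- s$d[1] * s$v[, 1]         # starting loading vector
  for (iter in seq_len(max.iter)) {
    v_old <- v
    z <- drop(crossprod(X, u))   # t(X) %*% u
    # threshold at the (p - keep)-th smallest |z| so that `keep` entries survive
    lambda <- if (keep < p) sort(abs(z))[p - keep] else 0
    v <- soft_threshold(z, lambda)
    u <- drop(X %*% v)
    u <- u / sqrt(sum(u^2))
    if (sum((v - v_old)^2) < tol) break
  }
  v <- v / sqrt(sum(v^2))        # unit-norm sparse loading vector
  list(loadings = v, scores = drop(X %*% v), iter = iter)
}

# Toy data (made up): 20 samples, 10 variables, centered and scaled
set.seed(1)
X <- scale(matrix(rnorm(200), nrow = 20, ncol = 10))
fit <- sparse_component(X, keep = 3)
sum(fit$loadings != 0)           # number of variables kept on this component
```

Further components would require deflating X before repeating the procedure; that step is left out of this sketch.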

scale = TRUE is highly recommended, as it helps to obtain orthogonal sparse loading vectors.

keepX is the number of variables to keep in each loading vector. The difference between the number of columns of X and keepX is the degree of sparsity, that is, the number of zeros in each loading vector.
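As a small worked illustration of this arithmetic, the base R snippet below uses made-up numbers and a made-up loading matrix (no call to spca is involved) to show that the count of zeros per loading vector equals the number of columns minus keepX.

```r
p <- 10                        # number of columns of X (toy value)
keepX <- c(5, 3)               # variables to keep on components 1 and 2
p - keepX                      # degree of sparsity per component

# Toy loading matrix with the matching pattern of zeros (values are made up)
loadings <- matrix(0, nrow = p, ncol = length(keepX))
loadings[1:5, 1] <- 0.4
loadings[c(2, 7, 9), 2] <- 0.5
colSums(loadings == 0)         # zeros in each loading vector
```

The same count could be taken on the rotation component of a fitted object, which the Value section describes as the matrix of sparse loading vectors.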

Note that spca cannot be applied to a data matrix with missing values. The biplot function is not available for spca.

According to Filzmoser et al., an ILR log-ratio transformation is more appropriate for PCA with compositional data. Both CLR and ILR are valid.

The log-ratio transformation and the multilevel analysis are performed sequentially as internal pre-processing steps, through logratio.transfo and withinVariation respectively.

A log-ratio transformation can only be applied if the data do not contain any 0 value (for count data, we thus advise normalising the raw data with an offset of 1). For the ILR transformation, an additional offset might be needed.
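The two pre-processing steps can be sketched in base R on a toy count table. This is only an illustration under simple assumptions (an offset of 1, a CLR transformation, and one repeated-measures factor); the package's own logratio.transfo and withinVariation functions may differ in detail, and all object names below are made up.

```r
# Toy count table: 4 samples (rows) x 3 features (columns), including zeros
counts <- matrix(c(10, 0, 5,
                    3, 7, 0,
                    8, 2, 4,
                    1, 9, 6), nrow = 4, byrow = TRUE)
subject <- factor(c("s1", "s1", "s2", "s2"))  # made-up repeated-measures design

# 1) an offset of 1 removes the zeros, then each sample is scaled to proportions
props <- (counts + 1) / rowSums(counts + 1)

# 2) CLR: log of each value minus the mean log of its own sample
clr <- log(props) - rowMeans(log(props))

# 3) within-subject part: subtract each subject's mean profile
subject_means <- apply(clr, 2, function(col) ave(col, subject))
within <- clr - subject_means
```

After step 2 every row of clr sums to zero, and after step 3 the rows belonging to the same subject sum to zero in every column.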

Usage

## Default S3 method:
spca(X, assay = NULL, ncomp = 2, center = TRUE, scale = TRUE,
     keepX = rep(ncol(X), ncomp), max.iter = 500, tol = 1e-06, logratio = 'none',
     multilevel = NULL)

## S4 method for signature 'ANY'
spca(X, ncomp = 2, center = TRUE, scale = TRUE,
     keepX = rep(ncol(X), ncomp), max.iter = 500, tol = 1e-06,
     logratio = c('none','CLR'), multilevel = NULL)

## S4 method for signature 'MultiAssayExperiment'
spca(X, ncomp = 2, ..., assay = NULL)

Arguments

X

A numeric matrix (or data frame) which provides the data for the sparse principal components analysis. It must not contain missing values (see Description). Alternatively, a MultiAssayExperiment object.

ncomp

Integer. If the data are complete, ncomp sets the number of components and associated eigenvalues to display from the pcasvd algorithm; if the data have missing values, ncomp gives the number of components to keep when reconstituting the data with the NIPALS algorithm. If NULL, the function sets ncomp = min(nrow(X), ncol(X)).

...

currently ignored.

assay

Name or index of an assay from X.

keepX

Numeric vector of length ncomp, the number of variables to keep in loading vectors. By default all variables are kept in the model. See details.

logratio

One of 'none' or 'CLR'. Specifies the log-ratio transformation to deal with compositional values that may arise from specific normalisation in sequencing data. Defaults to 'none'.

...

arguments passed to the generic.

Value

spca returns a list with class "spca" containing the following components:

ncomp

the number of components to keep in the calculation.

varX

the adjusted cumulative percentage of variances explained.

keepX

the number of variables kept in each loading vector.

iter

the number of iterations needed to reach convergence for each component.

rotation

the matrix containing the sparse loading vectors.

x

the matrix containing the principal components.

Author(s)

Ignacio Gonzalez, Kim-Anh Lê Cao, Fangzhou Yao, Al J Abadi

See Also

pca, ipca, selectVar, plotIndiv, plotVar and http://www.mixOmics.org for more details.

Examples

## Not run: 
library(mixOmics.data)
spca.rat <- spca(liver.toxicity$gene, ncomp = 3, keepX = rep(50, 3))
spca.rat

## variable representation
plotVar(spca.rat, cex = 2)

plotVar(spca.rat, style = "3d")
## End(Not run)

# example with MultiAssayExperiment class
# --------------------------------

spca.rat <- spca(liver.toxicity.mae, assay='gene', ncomp = 3, keepX = rep(50, 3))
spca.rat

## Not run: 
## samples representation
plotIndiv(spca.rat, ind.names = liver.toxicity$treatment[, 3],
          group = as.numeric(liver.toxicity$treatment[, 3]))
plotIndiv(spca.rat, cex = 0.01,
          col = as.numeric(liver.toxicity$treatment[, 3]), style = "3d")

# example with multilevel decomposition and CLR log ratio transformation
# ----------------

data("diverse.16S")
pca.res = pca(X = diverse.16S$data.TSS, ncomp = 5,
              logratio = 'CLR', multilevel = diverse.16S$sample)
plot(pca.res)
plotIndiv(pca.res, ind.names = FALSE, group = diverse.16S$bodysite,
          title = '16S diverse data', legend = TRUE)
## End(Not run)

ajabadi/mixOmics2 documentation built on Aug. 9, 2019, 1:08 a.m.