
Performs sparse principal component analysis (sPCA), which carries out variable selection by applying a LASSO penalty to the loading vectors obtained from singular value decomposition.


`X`
a numeric matrix (or data frame) which provides the data for the principal components analysis. It can contain missing values.

`ncomp`
Integer, if data is complete

`center`
(Default=TRUE) Logical, whether the variables should be shifted to be zero centered. Alternatively, a vector of length equal the number of columns of

`scale`
(Default=TRUE) Logical indicating whether the variables should be scaled to have unit variance before the analysis takes place.

`keepX`
Numeric vector of length `ncomp`, the number of variables to keep in each loading vector. By default all variables are kept in the model. See details.

`max.iter`
Integer, the maximum number of iterations in the NIPALS algorithm.

`tol`
Positive real, the tolerance used in the NIPALS algorithm.

`logratio`
One of ('none','CLR'). Specifies the log ratio transformation to deal with compositional values that may arise from specific normalisation in sequencing data. Defaults to 'none'.

`multilevel`
Sample information for multilevel decomposition for repeated measurements.

The calculation employs singular value decomposition of the (centered and scaled) data matrix and LASSO to generate sparsity on the loading vectors.

`scale = TRUE` is highly recommended as it helps to obtain orthogonal sparse loading vectors.

`keepX` is the number of variables to keep in loading vectors. The difference between the number of columns of `X` and `keepX` is the degree of sparsity, which refers to the number of zeros in each loading vector.
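To make the link between `keepX` and the zeros in a loading vector concrete, here is a minimal base-R sketch of a single sparse component obtained by alternating a soft-thresholded (LASSO-type) update of the loadings with an update of the component, in the spirit of Shen and Huang (2008). It is not the package's implementation: `soft_threshold` and `sparse_component` are hypothetical helper names, and picking the threshold from the ranked absolute values is an assumption made for illustration.

```r
# Soft-thresholding operator associated with a LASSO penalty.
soft_threshold <- function(z, lambda) sign(z) * pmax(abs(z) - lambda, 0)

# Hypothetical helper (not part of mixOmics): one sparse component that
# keeps `keep` non-zero loadings. Assumes 1 <= keep <= ncol(X), no ties.
sparse_component <- function(X, keep, max_iter = 500, tol = 1e-6) {
  p <- ncol(X)
  u <- svd(X, nu = 1, nv = 0)$u[, 1]      # start from the first left singular vector
  iter <- 0
  repeat {
    iter <- iter + 1
    z <- drop(crossprod(X, u))            # unpenalised loadings X^T u
    # Threshold at the (p - keep)-th smallest |z| so that `keep` entries survive.
    lambda <- if (keep < p) sort(abs(z))[p - keep] else 0
    v <- soft_threshold(z, lambda)
    u_new <- drop(X %*% v)
    u_new <- u_new / sqrt(sum(u_new^2))   # renormalise the component direction
    delta <- max(abs(u_new - u))
    u <- u_new
    if (delta < tol || iter >= max_iter) break
  }
  loading <- v / sqrt(sum(v^2))
  list(loading = loading, component = drop(X %*% loading), iter = iter)
}

set.seed(1)
X <- scale(matrix(rnorm(20 * 8), nrow = 20, ncol = 8))
fit <- sparse_component(X, keep = 3)
sum(fit$loading == 0)   # degree of sparsity: ncol(X) - keep
```

With 8 columns and `keep = 3`, five loadings are exactly zero, which is the degree of sparsity described above.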

Note that `spca` cannot be applied to a data matrix with missing values.

According to Filzmoser et al., an ILR log ratio transformation is more appropriate for PCA with compositional data. Both CLR and ILR are valid.

Logratio transform and multilevel analysis are performed sequentially as internal pre-processing steps, through `logratio.transfo` and `withinVariation` respectively.
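As a rough illustration of what a multilevel decomposition does for repeated measurements, the sketch below removes each subject's mean profile so that only within-subject variation remains. This is an assumption about the simplest one-factor case, not the code of `withinVariation`; `within_subject` is a hypothetical helper name.

```r
# Hypothetical helper (not part of mixOmics): subtract, for every sample,
# the mean profile of the subject it belongs to.
within_subject <- function(X, subject) {
  X <- as.matrix(X)
  subject <- as.character(subject)
  sums <- rowsum(X, subject)                  # per-subject column sums
  counts <- table(subject)[rownames(sums)]    # samples per subject, same order
  means <- sums / as.vector(counts)           # per-subject mean profiles
  X - means[subject, , drop = FALSE]
}

# Two subjects measured twice on two variables.
X <- matrix(c(1, 3, 10, 14,
              2, 6, 20, 28), ncol = 2)
subject <- c("a", "a", "b", "b")
Xw <- within_subject(X, subject)
Xw
```

After this step every subject's samples are centred on zero, so differences between subjects no longer dominate the components.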

Logratio can only be applied if the data do not contain any 0 value (for count data, we thus advise normalising the raw data with an offset of 1). For the ILR transformation, an additional offset might be needed.
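The zero-value restriction follows directly from the logarithm. A minimal base-R sketch of a centred log ratio (CLR) transform with an offset is shown here; `clr_transform` is a hypothetical helper written for illustration and is not the package's `logratio.transfo`.

```r
# Hypothetical helper (not part of mixOmics): CLR transform of the rows of X.
# Each row is log-transformed and centred on its own mean log value.
clr_transform <- function(X, offset = 0) {
  X <- as.matrix(X) + offset
  stopifnot(all(X > 0))        # log ratios are undefined for zeros
  logX <- log(X)
  logX - rowMeans(logX)        # subtract each row's mean log value
}

counts <- matrix(c(0, 5, 10,
                   3, 0, 7), nrow = 2, byrow = TRUE)

# The raw counts contain zeros, so an offset of 1 is added first.
clr_counts <- clr_transform(counts, offset = 1)
rowSums(clr_counts)            # each row sums to zero, up to rounding
```

Calling `clr_transform(counts)` without the offset stops with an error, which mirrors the restriction stated above.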

It is important to note that since the derived components are not guaranteed to be uncorrelated, adjustment is performed for the (cumulative) explained variance of each component in the output.
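One common way to adjust explained variance for correlated components is the projection approach of Shen and Huang (2008): project the data onto the space spanned by the first k loading vectors and measure the variance of that projection. The sketch below assumes this approach; whether `spca` computes its output in exactly this way is an assumption, and `adjusted_variance` is a hypothetical helper name.

```r
# Hypothetical helper (not part of mixOmics): adjusted proportion of variance
# for loading vectors V (one column per component) that need not be orthogonal.
adjusted_variance <- function(X, V) {
  total <- sum(X^2)                                    # total variance (up to a constant)
  cumulative <- vapply(seq_len(ncol(V)), function(k) {
    Vk <- V[, seq_len(k), drop = FALSE]
    # Projection of X onto the span of the first k loading vectors.
    Xk <- X %*% Vk %*% solve(crossprod(Vk), t(Vk))
    sum(Xk^2)
  }, numeric(1)) / total
  list(explained = diff(c(0, cumulative)), cumulative = cumulative)
}

set.seed(42)
X <- scale(matrix(rnorm(30 * 5), nrow = 30, ncol = 5), scale = FALSE)
sv <- svd(X)

# Sanity check: with orthogonal (ordinary PCA) loadings the adjustment
# changes nothing and the usual d^2 / sum(d^2) proportions are recovered.
av <- adjusted_variance(X, sv$v[, 1:2])
av$explained
```

Because each component is credited only with the variance it adds beyond the earlier ones, the cumulative values never exceed 1 even when the loading vectors overlap.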

`spca` returns a list with class `"spca"` containing the following components:

- ncomp
the number of components to keep in the calculation.

- explained_variance
the adjusted percentage of variance explained for each component.

- cum.var
the adjusted cumulative percentage of variances explained.

- keepX
the number of variables kept in each loading vector.

- iter
the number of iterations needed to reach convergence for each component.

- rotation
the matrix containing the sparse loading vectors.

- x
the matrix containing the principal components.

Kim-Anh Lê Cao, Fangzhou Yao, Leigh Coonan, Ignacio Gonzalez, Al J Abadi

Shen, H. and Huang, J. Z. (2008). Sparse principal component
analysis via regularized low rank matrix approximation. *Journal of
Multivariate Analysis* **99**, 1015-1034.

`pca` and http://www.mixOmics.org for more details.

```
library(mixOmics)

data(liver.toxicity)
spca.rat <- spca(liver.toxicity$gene, ncomp = 3, keepX = rep(50, 3))
spca.rat

## variable representation
plotVar(spca.rat, cex = 1)
## Not run:
plotVar(spca.rat, style = "3d")
## End(Not run)

## samples representation
plotIndiv(spca.rat, ind.names = liver.toxicity$treatment[, 3],
          group = as.numeric(liver.toxicity$treatment[, 3]))
## Not run:
plotIndiv(spca.rat, cex = 0.01,
          col = as.numeric(liver.toxicity$treatment[, 3]), style = "3d")
## End(Not run)

## example with multilevel decomposition and CLR log ratio transformation
data("diverse.16S")
spca.res <- spca(X = diverse.16S$data.TSS, ncomp = 5,
                 logratio = 'CLR', multilevel = diverse.16S$sample)
plot(spca.res)
plotIndiv(spca.res, ind.names = FALSE, group = diverse.16S$bodysite,
          title = '16S diverse data', legend = TRUE)
```
