Performs a sparse principal components analysis, using singular value decomposition, to carry out variable selection.
X
a numeric matrix (or data frame) which provides the data for the sparse principal components analysis.
ncomp
integer, the number of components to keep.
center
a logical value indicating whether the variables should be shifted to be zero centered.
Alternatively, a vector of length equal to the number of columns of X can be supplied.
scale
a logical value indicating whether the variables should be scaled to have
unit variance before the analysis takes place. The default is |
max.iter
integer, the maximum number of iterations to check convergence in each component.
tol
a positive real, the tolerance used in the iterative algorithm.
keepX
numeric vector of length ncomp, the number of variables to keep in loading vectors. By default all variables are kept in the model. See details.
logratio
one of ('none','CLR'). Specifies the log ratio transformation to deal with compositional values that may arise from specific normalisation in sequencing data. Defaults to 'none'.
multilevel
sample information for multilevel decomposition for repeated measurements.
The calculation employs singular value decomposition of the (centered and scaled) data matrix and LASSO to generate sparsity on the loading vectors.
Setting scale = TRUE is highly recommended, as it helps to obtain orthogonal sparse loading vectors.
keepX is the number of variables to keep in each loading vector. The difference between the number of columns of X and keepX is the degree of sparsity, that is, the number of zeros in each loading vector.
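The combination of singular value decomposition and LASSO-type shrinkage described above can be illustrated with a minimal base R sketch for a single component, in the spirit of the rank-one approximation of Shen and Huang (2008). This is not the mixOmics implementation: `soft_threshold_keep` and `sparse_pc1` are hypothetical helper names, and the threshold is chosen simply so that at most `keepX` entries of the loading vector stay non-zero.

```r
# Hypothetical helper: soft-threshold v so that at most `keep` entries remain non-zero.
soft_threshold_keep <- function(v, keep) {
  p <- length(v)
  if (keep >= p) return(v)
  abs_sorted <- sort(abs(v), decreasing = TRUE)
  lambda <- abs_sorted[keep + 1]        # largest magnitude that will be set to zero
  sign(v) * pmax(abs(v) - lambda, 0)
}

# Hypothetical helper: one sparse component via alternating updates (assumed sketch).
sparse_pc1 <- function(X, keepX, max.iter = 500, tol = 1e-06) {
  X <- scale(X, center = TRUE, scale = TRUE)
  u <- svd(X, nu = 1, nv = 1)$u[, 1]    # start from the first left singular vector
  for (i in seq_len(max.iter)) {
    v_new <- soft_threshold_keep(drop(crossprod(X, u)), keepX)  # sparse loading update
    u_new <- drop(X %*% v_new)
    u_new <- u_new / sqrt(sum(u_new^2))                         # keep u at unit norm
    converged <- sum((u_new - u)^2) < tol
    u <- u_new
    v <- v_new
    if (converged) break
  }
  list(loading = v / sqrt(sum(v^2)), iter = i)
}

set.seed(1)
X <- matrix(rnorm(20 * 10), nrow = 20, ncol = 10)
fit <- sparse_pc1(X, keepX = 3)
n_zero <- sum(fit$loading == 0)         # degree of sparsity: ncol(X) - keepX
```

With continuous data and no ties, the number of zeros in the loading vector equals ncol(X) minus keepX, which is the degree of sparsity described in the text.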
Note that spca cannot be applied to a data matrix with missing values. The biplot function is not available for spca.
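Because missing values are not supported, it can help to check the data before the analysis. The base R sketch below uses a small made-up matrix; dropping incomplete rows is only one possible way to handle them and is an assumption here, not a recommendation taken from this page.

```r
# Toy matrix with one missing value (illustrative data only).
X <- matrix(c(1, 2, NA, 4, 5, 6), nrow = 3)

has_missing <- anyNA(X)                              # TRUE if any entry is NA
X_complete <- X[complete.cases(X), , drop = FALSE]   # keep only rows without NA
```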
According to Filzmoser et al., an ILR log ratio transformation is more appropriate for PCA with compositional data. Both CLR and ILR are valid.
Logratio transformation and multilevel analysis are performed sequentially as internal pre-processing steps, through logratio.transfo and withinVariation respectively.
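To give an idea of what a multilevel decomposition for repeated measurements does, the following base R sketch removes each subject's mean from that subject's samples, leaving only the within-subject variation. It assumes a single repeated-measures factor; `within_variation_sketch` is a hypothetical helper written for illustration and is not the withinVariation function itself.

```r
# Hypothetical helper: subtract per-subject column means (one-factor case only).
within_variation_sketch <- function(X, subject) {
  X <- as.matrix(X)
  subject <- factor(subject)                                   # drops unused levels
  subject_means <- rowsum(X, subject) / as.vector(table(subject))
  X - subject_means[as.character(subject), , drop = FALSE]
}

# Two subjects ("a" and "b"), two samples each, two variables (toy data).
X <- matrix(c(1, 3, 10, 14,
              2, 4, 20, 22), ncol = 2)
subject <- c("a", "a", "b", "b")
Xw <- within_variation_sketch(X, subject)
```

After this step, the column means within each subject are zero, so differences between subjects no longer dominate the analysis.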
Logratio transformation can only be applied if the data do not contain any 0 value (for count data, we thus advise normalising raw data with an offset of 1). For the ILR transformation, an additional offset might be needed.
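The need for an offset can be seen in a small base R sketch of the centered log ratio (CLR) transformation, which takes the logarithm of each value and subtracts the mean log value of its sample (row). `clr_sketch` is a hypothetical helper for illustration, not the logratio.transfo function, and the counts are made up.

```r
# Hypothetical helper: centered log ratio per sample (row), with an optional offset.
clr_sketch <- function(X, offset = 0) {
  X <- as.matrix(X) + offset
  if (any(X <= 0)) {
    stop("CLR requires strictly positive values; consider adding an offset")
  }
  logX <- log(X)
  logX - rowMeans(logX)   # the row mean is recycled down each column
}

# Toy count data containing zeros.
counts <- matrix(c(0, 5, 10,
                   3, 0, 7), nrow = 2, byrow = TRUE)
clr_counts <- clr_sketch(counts, offset = 1)
```

Calling `clr_sketch(counts)` without the offset stops with an error, because the logarithm of 0 is not finite.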
spca returns a list with class "spca" containing the following components:
ncomp
the number of components to keep in the calculation.
varX
the adjusted cumulative percentage of variances explained.
keepX
the number of variables kept in each loading vector.
iter
the number of iterations needed to reach convergence for each component.
rotation
the matrix containing the sparse loading vectors.
x
the matrix containing the principal components.
Kim-Anh Lê Cao, Fangzhou Yao, Leigh Coonan
Shen, H. and Huang, J. Z. (2008). Sparse principal component analysis via regularized low rank matrix approximation. Journal of Multivariate Analysis 99, 1015-1034.
pca and http://www.mixOmics.org for more details.
library(mixOmics)

data(liver.toxicity)
spca.rat <- spca(liver.toxicity$gene, ncomp = 3, keepX = rep(50, 3))
spca.rat

## variable representation
plotVar(spca.rat, cex = 0.5)
## Not run:
plotVar(spca.rat, style = "3d")
## End(Not run)

## samples representation
plotIndiv(spca.rat, ind.names = liver.toxicity$treatment[, 3],
          group = as.numeric(liver.toxicity$treatment[, 3]))
## Not run:
plotIndiv(spca.rat, cex = 0.01,
          col = as.numeric(liver.toxicity$treatment[, 3]), style = "3d")
## End(Not run)

# example with multilevel decomposition and CLR log ratio transformation
# ----------------
## Not run:
data("diverse.16S")
pca.res <- pca(X = diverse.16S$data.TSS, ncomp = 5,
               logratio = 'CLR', multilevel = diverse.16S$sample)
plot(pca.res)
plotIndiv(pca.res, ind.names = FALSE, group = diverse.16S$bodysite,
          title = '16S diverse data', legend = TRUE)
## End(Not run)