cdm.pca: High Dimensional Principal Component Analysis
In bdsvd: Block Structure Detection Using Singular Vectors

View source: R/ispca.R

cdm.pca

R Documentation

High Dimensional Principal Component Analysis

Description

Performs a principal component analysis on the given data matrix using the methods of Yata and Aoshima (2009, 2010).

Usage

cdm.pca(X, K = 1, method = "CDM", scale = TRUE, orthogonal = FALSE)

Arguments

`X`	Data matrix of dimension `n`x`p` with possibly `p >> n`.
`K`	Number of principal components to be computed. If `K` is larger than the number of variables `p` contained in the data matrix, `K = p - 1` loadings are computed.
`method`	Which method should be used to calculate the eigenvectors (loadings) and eigenvalues. `method = "DM"` uses the method by Yata and Aoshima (2009) and `method = "CDM"` uses the method by Yata and Aoshima (2010).
`scale`	Should the variables be scaled to have unit variance before the analysis takes place? Default is `TRUE`.
`orthogonal`	The estimated eigenvectors (loadings) computed using `method = "CDM"` (Yata and Aoshima, 2010) are orthogonal in the limit and thus only approximately orthogonal in finite samples. Should the loadings be orthogonalized? Default is `FALSE`.
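
To illustrate what orthogonalizing a set of approximately orthogonal loadings can look like, the following base R sketch uses a QR decomposition. This is only one common way to obtain orthonormal columns; whether `cdm.pca` uses this or a different procedure for `orthogonal = TRUE` is not stated here, and the helper name `orthogonalize_sketch` is hypothetical.

```r
# Hypothetical helper, not part of bdsvd: orthonormalize the columns of V.
orthogonalize_sketch <- function(V) {
  Q <- qr.Q(qr(V))              # p x K matrix with orthonormal columns
  s <- sign(colSums(Q * V))     # align each column's sign with the input
  s[s == 0] <- 1
  sweep(Q, 2, s, "*")
}

# Simulated stand-in for a p x K loading matrix
set.seed(42)
V <- matrix(rnorm(50 * 3), nrow = 50, ncol = 3)
V_orth <- orthogonalize_sketch(V)

# Cross-products are now (numerically) the identity matrix
round(crossprod(V_orth), 10)
```

With QR, the first column keeps its direction and later columns are adjusted to be orthogonal to the earlier ones, so the result depends on column order.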

Details

This function performs principal component analysis using either the DM approach described in Yata and Aoshima (2009) or the CDM approach of Yata and Aoshima (2010). A code implementation of CDM is also available from the 'Aoshima Lab' (https://github.com/Aoshima-Lab/HDLSS-Tools/tree/main/CDM), provided by Makoto Aoshima.
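
As a rough illustration of the cross-data-matrix idea named in the title of Yata and Aoshima (2010), the base R sketch below splits the sample into two halves, centers each half, and takes the singular value decomposition of their cross-product. It is a simplified sketch under stated assumptions, not the implementation used by `cdm.pca`: the function name `cdm_sketch`, the first-half/second-half split, and the averaging and renormalizing of the two half-sample directions are assumptions made for illustration, and no variable scaling is applied.

```r
# Hypothetical sketch, not the bdsvd implementation.
# X is an n x p data matrix (rows are observations).
cdm_sketch <- function(X, K = 1) {
  n <- nrow(X)
  n1 <- ceiling(n / 2)
  idx1 <- seq_len(n1)
  # Center each half separately
  X1 <- scale(X[idx1, , drop = FALSE], center = TRUE, scale = FALSE)
  X2 <- scale(X[-idx1, , drop = FALSE], center = TRUE, scale = FALSE)
  n2 <- nrow(X2)
  # n1 x n2 cross data matrix
  S <- tcrossprod(X1, X2) / sqrt((n1 - 1) * (n2 - 1))
  sv <- svd(S, nu = K, nv = K)
  lambda <- sv$d[seq_len(K)]    # singular values as eigenvalue estimates
  # Map the singular vectors back to p-dimensional directions
  h1 <- sweep(crossprod(X1, sv$u), 2, sqrt((n1 - 1) * lambda), "/")
  h2 <- sweep(crossprod(X2, sv$v), 2, sqrt((n2 - 1) * lambda), "/")
  v <- (h1 + h2) / 2            # assumption: average the two halves
  v <- sweep(v, 2, sqrt(colSums(v^2)), "/")  # renormalize to unit length
  list(v = v, l = lambda, K = K)
}

# Simulated high-dimensional data with one strong shared component
set.seed(1)
n <- 20
p <- 200
X <- matrix(rnorm(n * p), nrow = n, ncol = p)
X[, 1:10] <- X[, 1:10] + 5 * rnorm(n)
res <- cdm_sketch(X, K = 2)
str(res)
```

The returned list mirrors the `v`, `l`, `K` layout described under Value, so the sketch can be compared side by side with the output of `cdm.pca(X, K = 2, scale = FALSE)` if the package is installed.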

Value

A list with the following components:

`v`	The first `K` estimated sparse singular vectors (loadings) of the data matrix `X`. The eigenvectors are orthogonalized if `orthogonal = TRUE`.
`l`	The corresponding first estimated eigenvalues of the identified block diagonal covariance matrix.
`K`	The number of sparse singular vectors (loadings) that have been computed.

References

Yata, K., Aoshima, M. (2009). PCA consistency for non-Gaussian data in high dimension, low sample size context, Commun. Stat. - Theory Methods 38, 2634–2652.

Yata, K., Aoshima, M. (2010). Effective PCA for high-dimension, low-sample-size data with singular value decomposition of cross data matrix, J. Multivar. Anal. 101, 2060–2077.

Examples

# Example: run cdm.pca on a gene expression data set with two tissue types

if (requireNamespace("dslabs", quietly = TRUE)) {
  data("tissue_gene_expression", package = "dslabs")

  # We only select the two tissue types kidney (6) and liver (7)
  Y <- as.numeric(tissue_gene_expression$y)
  # Center the selected rows column-wise (no rescaling at this step)
  X <- scale(tissue_gene_expression$x[Y %in% c(6, 7), ], scale = FALSE)
  Y <- Y[Y %in% c(6, 7)]

  # Run PCA with the default method = "CDM"
  pca.obj <- cdm.pca(X, K = 2)
  # Project the centered data onto the two estimated loadings
  PC <- X %*% pca.obj$v

  # Plot the first two principal components
  plot(PC, pch = Y - 5, xlab = "PC1", ylab = "PC2")
}

bdsvd documentation built on March 26, 2026, 5:10 p.m.