Implementation of SPCA, using variable projection as an optimization strategy.
X |
a numeric matrix or data.frame which provides the data for the sparse principal components analysis. |
k |
optional, a number specifying the maximal rank. |
alpha |
Sparsity-controlling parameter. Higher values mean sparser components. |
beta |
Amount of ridge shrinkage to apply in order to improve conditioning. |
center |
a logical value indicating whether the variables should be shifted to be zero centered. |
max_iter |
maximum number of iterations to perform. |
tol |
stopping tolerance for the convergence criterion. |
Sparse principal component analysis (SPCA) is a specialized variant of PCA. Specifically, SPCA promotes sparsity in the modes, i.e., the sparse modes have only a few active (nonzero) coefficients, while the majority of coefficients are constrained to be zero. This leads to improved localization and interpretability of the model compared to the global modes obtained from traditional PCA. In addition, SPCA helps to avoid overfitting in high-dimensional settings where the number of variables p is greater than the number of observations n.
Given an (n,p) data matrix X, SPCA attempts to minimize the following objective function:
minimize f(A,B) = 1/2⋅‖X - X⋅B⋅Aᵀ‖² + α⋅‖B‖₁ + 1/2⋅β‖B‖², subject to AᵀA = I.
where B is the sparse weight matrix and A is an orthonormal matrix. The principal components Z are formed as
Z = X * B
and the data can be approximately rotated back as
X = Z t(A)
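The formulas above can be checked numerically with a few lines of base R. The sketch below does not call the package; `spca_objective` is a hypothetical helper written only for illustration, and it assumes that ‖·‖² denotes the squared Frobenius norm and ‖B‖₁ the sum of absolute entries. As a stand-in for a fitted model it uses the dense special case A = B = first k right singular vectors of X (ordinary PCA), which satisfies the constraint AᵀA = I.

```r
# Hypothetical helper (not part of the package): evaluates
# f(A,B) = 1/2 ||X - X B t(A)||^2 + alpha ||B||_1 + 1/2 beta ||B||^2
spca_objective <- function(X, A, B, alpha, beta) {
  residual <- X - X %*% B %*% t(A)
  0.5 * sum(residual^2) + alpha * sum(abs(B)) + 0.5 * beta * sum(B^2)
}

set.seed(1)
# Small centered data matrix, n = 200 observations, p = 5 variables
X <- scale(matrix(rnorm(200 * 5), nrow = 200, ncol = 5),
           center = TRUE, scale = FALSE)
k <- 2

# Dense stand-in for a solution: first k right singular vectors (plain PCA)
V <- svd(X)$v[, 1:k, drop = FALSE]
A <- V
B <- V

Z <- X %*% B         # principal components, an (n,k) matrix
X_hat <- Z %*% t(A)  # approximate reconstruction, an (n,p) matrix

obj <- spca_objective(X, A, B, alpha = 1e-4, beta = 1e-4)
```

For a sparse B, as returned by an SPCA fit, the same helper applies unchanged; only the penalty terms and the reconstruction error shift relative to this dense case.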
The print method can be used to present the results in a nice format.
spcaRcpp returns a list containing the following six components:
loadings |
the matrix of variable loadings. |
standard deviations |
the approximated standard deviations. |
eigenvalues |
the approximated eigenvalues. |
center |
the centering used. |
var |
the variance. |
scores |
the principal component scores. |
Boya Jiang
[1] N. B. Erichson, P. Zheng, K. Manohar, S. Brunton, J. N. Kutz, A. Y. Aravkin. "Sparse Principal Component Analysis via Variable Projection." SIAM Journal on Applied Mathematics 2020 80:2, 977-1002 (available at arXiv: https://arxiv.org/abs/1804.00341).
[2] N. B. Erichson, P. Zheng, S. Aravkin, sparsepca, (2018), GitHub repository, https://github.com/erichson/spca.
# Create artificial data
m <- 10000
V1 <- rnorm(m, -100, 200)
V2 <- rnorm(m, -100, 300)
V3 <- -0.1 * V1 + 0.1 * V2 + rnorm(m, 0, 100)
X <- cbind(V1, V1, V1, V1, V2, V2, V2, V2, V3, V3)
X <- X + matrix( rnorm( length(X), 0, 1 ), ncol = ncol(X), nrow = nrow(X) )
# Compute SPCA
out <- spcaRcpp(X, k=3, alpha=1e-4, beta=1e-4, center = TRUE)
print(out)