spcaRcpp: Rcpp Integration for Sparse Principal Component Analysis...

View source: R/spcaRcpp.R

Description

Implementation of SPCA, using variable projection as an optimization strategy.

Usage

spcaRcpp(
  X,
  k = NULL,
  alpha = 1e-04,
  beta = 1e-04,
  center = TRUE,
  max_iter = 1000,
  tol = 1e-05
)

Arguments

X

a numeric matrix or data.frame which provides the data for the sparse principal components analysis.

k

optional, a number specifying the maximal rank.

alpha

Sparsity-controlling parameter. Higher values mean sparser components.

beta

Amount of ridge shrinkage to apply in order to improve conditioning.

center

a logical value indicating whether the variables should be shifted to be zero centered.

max_iter

maximum number of iterations to perform.

tol

convergence tolerance used as the stopping criterion.
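
To build intuition for how alpha drives sparsity, the sketch below applies elementwise soft-thresholding, the standard proximal operator of an L1 penalty, to a small coefficient vector. The helper soft_threshold is hypothetical and written only for illustration; it is not part of spcaRcpp, and this page does not state which update rule the package uses internally.

```r
# Hypothetical helper (not part of spcaRcpp): elementwise soft-thresholding,
# the proximal operator of the L1 penalty alpha * ||b||_1.
soft_threshold <- function(b, alpha) {
  sign(b) * pmax(abs(b) - alpha, 0)
}

# Made-up coefficients, chosen only for illustration.
b <- c(-0.8, -0.05, 0.02, 0.3, 1.2)

# A larger alpha pushes more coefficients to exactly zero.
small <- soft_threshold(b, alpha = 0.01)
large <- soft_threshold(b, alpha = 0.1)

sum(small == 0)  # count of zero coefficients under the small penalty
sum(large == 0)  # count of zero coefficients under the large penalty
```

Coefficients whose magnitude falls below alpha are set to zero, and the remaining ones are shrunk toward zero by alpha, which is why a higher alpha would be expected to yield sparser components.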

Details

Sparse principal component analysis (SPCA) is a specialized variant of PCA. Specifically, SPCA promotes sparsity in the modes, i.e., the sparse modes have only a few active (nonzero) coefficients, while the majority of coefficients are constrained to be zero. This approach leads to improved localization and interpretability of the model compared to the global modes obtained from traditional PCA. In addition, SPCA helps to avoid overfitting in a high-dimensional data setting where the number of variables p is greater than the number of observations n.

Given an (n,p) data matrix X, SPCA attempts to minimize the following objective function:

minimize f(A,B) = 1/2⋅‖X - X⋅B⋅Aᵀ‖² + α⋅‖B‖₁ + 1/2⋅β‖B‖², subject to AᵀA = I.

where B is the sparse weight matrix and A is an orthonormal matrix. The principal components Z are formed as

Z = X %*% B

and the data can be approximately rotated back as

X ≈ Z %*% t(A)
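
As a rough illustration of the objective, the scores, and the back-rotation, the base R sketch below evaluates f(A,B) for a dense starting point in which both A and B are set to the leading right singular vectors of X (ordinary PCA). The helper spca_objective is hypothetical and not part of spcaRcpp. It assumes the unsubscripted norms are Frobenius norms and that ‖B‖₁ is the elementwise L1 norm, which this page does not state explicitly.

```r
# Hypothetical helper (not part of spcaRcpp): evaluate the SPCA objective.
# Assumes Frobenius norms and an elementwise L1 norm on B.
spca_objective <- function(X, A, B, alpha = 1e-4, beta = 1e-4) {
  residual <- X - X %*% B %*% t(A)
  0.5 * sum(residual^2) + alpha * sum(abs(B)) + 0.5 * beta * sum(B^2)
}

set.seed(1)
# Small centered test matrix with n = 200 observations and p = 5 variables.
X <- scale(matrix(rnorm(200 * 5), nrow = 200), center = TRUE, scale = FALSE)

# Dense starting point: A = B = first k right singular vectors of X.
k <- 2
V <- svd(X)$v[, 1:k]
A <- V
B <- V

Z <- X %*% B          # principal component scores, an (n, k) matrix
X_hat <- Z %*% t(A)   # approximate reconstruction of X, an (n, p) matrix

spca_objective(X, A, B)
```

With alpha and beta both set to zero, this starting point reduces the objective to half the sum of the squared discarded singular values, i.e. the usual PCA reconstruction error. A sparse B would trade some of that reconstruction accuracy for a smaller L1 term.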

The print method can be used to present the results in a nice format.

Value

spcaRcpp returns a list containing the following six components:

loadings

the matrix of variable loadings.

standard deviations

the approximated standard deviations.

eigenvalues

the approximated eigenvalues.

center

the centering used.

var

the variance.

scores

the principal component scores.

Author(s)

Boya Jiang

Examples

# Create artificial data
m <- 10000
V1 <- rnorm(m, -100, 200)
V2 <- rnorm(m, -100, 300)
V3 <- -0.1 * V1 + 0.1 * V2 + rnorm(m, 0, 100)

X <- cbind(V1, V1, V1, V1, V2, V2, V2, V2, V3, V3)
X <- X + matrix( rnorm( length(X), 0, 1 ), ncol = ncol(X), nrow = nrow(X) )

# Compute SPCA
out <- spcaRcpp(X, k=3, alpha=1e-4, beta=1e-4, center = TRUE)
print(out)

BoyaJiang/spcaRcpp documentation built on Dec. 17, 2021, 11:53 a.m.