Rcpp Integration of SPCA via Variable Projection

Sparse principal component analysis (SPCA) is a specialized variant of PCA. Specifically, SPCA promotes sparsity in the modes, i.e., the sparse modes have only a few active (nonzero) coefficients, while the majority of coefficients are constrained to be zero. This leads to improved localization and interpretability of the model compared to the global modes obtained from traditional PCA. In addition, SPCA helps avoid overfitting in high-dimensional settings where the number of variables p is greater than the number of observations n.

This package provides SPCA routines implemented in R and Rcpp.

Problem Formulation

Given a data matrix X with shape (n, p), SPCA attempts to solve the following optimization problem:

minimize f(A,B) = 1/2⋅‖X - X⋅B⋅Aᵀ‖² + α⋅‖B‖₁ + 1/2⋅β‖B‖², subject to AᵀA = I.

The matrix B is the sparse weight (loadings) matrix and A is an orthonormal matrix.
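
To make the objective concrete, the following base R sketch evaluates f(A, B) for given matrices. It is an illustration only, not the package's internal code: the helper name `spca_objective` is hypothetical, and it assumes the unsubscripted norms in the formula are Frobenius norms. Here α weights the ℓ1 penalty on B and β weights the squared ℓ2 penalty.

```r
# Hypothetical helper: evaluate the SPCA objective f(A, B).
# Assumption: the unsubscripted norms are squared Frobenius norms.
spca_objective <- function(X, A, B, alpha = 1e-4, beta = 1e-4) {
  residual <- X - X %*% B %*% t(A)
  0.5 * sum(residual^2) + alpha * sum(abs(B)) + 0.5 * beta * sum(B^2)
}

set.seed(1)
n <- 20; p <- 5; k <- 2
X <- scale(matrix(rnorm(n * p), n, p), center = TRUE, scale = FALSE)

# Ordinary PCA loadings give a feasible point: with A = B = V_k,
# t(A) %*% A is the k x k identity, so the constraint holds.
V <- svd(X)$v[, 1:k]
A <- V
B <- V

f_pca  <- spca_objective(X, A, B)                 # dense PCA solution
f_zero <- spca_objective(X, A, matrix(0, p, k))   # all-zero loadings
```

With all-zero loadings the reconstruction term equals half the squared Frobenius norm of X and both penalties vanish, so `f_zero` serves as a simple upper reference for `f_pca`.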

The principal components Z are then formed as

Z = X %*% B.
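
As a small illustration of this projection, the sketch below forms Z from a hand-written sparse loadings matrix. The values in `B` are hypothetical and chosen only to show that each component depends on just the variables with nonzero loadings; they are not output from this package.

```r
set.seed(42)
X <- matrix(rnorm(6 * 4), nrow = 6, ncol = 4)  # n = 6 observations, p = 4 variables

# Hypothetical sparse loadings: each component uses two of the four variables.
B <- cbind(c(0.8, 0.6, 0, 0),
           c(0, 0, 0.6, -0.8))

Z <- X %*% B   # n x k matrix of principal components
dim(Z)
```

The first column of `Z` is a weighted sum of the first two variables only, which is what makes sparse components easier to interpret than dense ones.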

Specifically, the interface of the SPCA function is:

spca(X, k, alpha=1e-4, beta=1e-4, center=TRUE, max_iter=1000, tol=1e-4)
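
A call following this signature might look like the sketch below. It assumes the package is installed and that `spca` is exported under that name; the data are random and the parameter values are arbitrary. Because the components of the returned list are not enumerated here, the example only inspects the result with `str()` rather than assuming component names.

```r
set.seed(123)
X <- matrix(rnorm(50 * 10), nrow = 50, ncol = 10)  # n = 50, p = 10

# Run only if the package is available, so the snippet is safe to source.
if (requireNamespace("spcaRcpp", quietly = TRUE)) {
  fit <- spcaRcpp::spca(X, k = 3, alpha = 1e-3, beta = 1e-4,
                        center = TRUE, max_iter = 1000, tol = 1e-4)
  str(fit)  # inspect the structure of the returned list
}
```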

The description of the arguments is listed in the following:

A list with the following components is returned:

Installation

Install the development version of the spcaRcpp package from GitHub:

#install.packages("devtools")
library(devtools)
devtools::install_github("BoyaJiang/spcaRcpp")
library(spcaRcpp)


BoyaJiang/spcaRcpp documentation built on Dec. 17, 2021, 11:53 a.m.