sparsePCA: Sparse PCA


sparsePCA R Documentation

Sparse PCA

Description

This function performs sparse principal component analysis (PCA) using a block optimisation algorithm or an iterative deflation algorithm.

Usage

sparsePCA(
  A,
  m,
  lambda,
  block = 1,
  mu = 1/1:m,
  center = TRUE,
  scale = TRUE,
  iter_max = 1000,
  epsilon = 1e-04
)

Arguments

A

a numerical data matrix of size n by p (observations by variables)

m

the number of components

lambda

a numerical vector of size m giving the reduced sparsity parameters, one per component. Each reduced sparsity parameter is expressed relative to the theoretical upper bound and takes a value between 0 and 1.

block

either 0 or 1. With block=0, deflation is used when more than one component is requested. With block=1 (the default), a block optimisation algorithm computes the m components at once.

mu

a numerical vector of size m with the mu parameters (used by the block algorithm only). By default, mu_j=1/j.

center

a logical value indicating whether the variables should be shifted to be zero centered.

scale

a logical value indicating whether the variables should be scaled to have unit variance.

iter_max

maximum number of admissible iterations.

epsilon

accuracy of the stopping criterion.

Details

This function implements an optimal projected variance sparse block PCA algorithm, together with a deflation algorithm that iteratively applies the block algorithm with a single component to each deflated data matrix.
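The deflation strategy described above can be sketched in base R as follows. This is a minimal sketch under stated assumptions, not the package's implementation: the one-component step is replaced by ordinary (non-sparse) PCA via svd(), and the deflation rule (projecting the data onto the orthogonal complement of each loading vector) is one common choice that the package may or may not use. The names one_component and deflation_sketch are hypothetical.

```r
# Stand-in for the one-component step (ASSUMPTION): ordinary PCA, i.e. the
# leading right singular vector of B. The package uses a sparse algorithm here.
one_component <- function(B) {
  svd(B, nu = 0, nv = 1)$v[, 1]
}

# Hypothetical deflation loop: extract one loading vector, deflate, repeat.
deflation_sketch <- function(B, m) {
  Z <- matrix(0, nrow = ncol(B), ncol = m)
  for (j in seq_len(m)) {
    z <- one_component(B)
    Z[, j] <- z
    # Projection deflation (ASSUMPTION): remove the direction z from B,
    # so that the next component is computed on the deflated matrix.
    B <- B - (B %*% z) %*% t(z) / sum(z^2)
  }
  Z
}

set.seed(1)
B0 <- scale(matrix(rnorm(40 * 5), nrow = 40, ncol = 5))
Z_sketch <- deflation_sketch(B0, m = 2)
dim(Z_sketch)
```

With this non-sparse stand-in, the loop recovers the first two ordinary PCA loading vectors up to sign, which makes the structure of the iteration easy to check.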

The block algorithm uses a numerical vector of parameters mu, usually chosen either strictly decreasing (mu_j=1/j) or all equal (mu_j=1 for all j). Strictly decreasing parameters remove the underdetermination that arises in some situations and lead to a solution close to the PCA solution.
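As a small illustration, the two usual choices of mu can be built in base R as shown below. The value m = 3 is arbitrary and chosen only for the example. Note that the default expression 1/1:m relies on R's operator precedence: `:` binds more tightly than `/`, so it evaluates as 1/(1:m).

```r
m <- 3  # number of components (arbitrary value, for illustration only)

# Strictly decreasing parameters, as in the default mu = 1/1:m.
# In R, `:` is evaluated before `/`, so 1/1:m is the same as 1/(1:m).
mu_decreasing <- 1/1:m

# All-equal parameters, mu_j = 1 for all j.
mu_equal <- rep(1, m)

print(mu_decreasing)
print(mu_equal)
```

Either vector could then be passed as the mu argument when block=1.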

The principal components are defined by Y=BZ where B is the centered (if center=TRUE) and scaled (if scale=TRUE) data matrix and where Z is the sparse loading matrix.
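The relation Y=BZ can be sketched in base R without calling sparsePCA. In the sketch below, the data matrix and the sparse loading matrix Z are made up for illustration. Centering and scaling use base R's scale(), which divides by the sample standard deviation; the package may normalise differently, so this is an assumption rather than a description of its internals.

```r
set.seed(1)
n <- 6; p <- 4; m <- 2

# Made-up data matrix (n observations by p variables).
A <- matrix(rnorm(n * p), nrow = n, ncol = p)

# Centered and scaled data matrix (ASSUMPTION: base R scale() conventions).
B <- scale(A, center = TRUE, scale = TRUE)

# Hypothetical sparse loading matrix (p by m): each column has two zeros.
Z <- cbind(c(1, 1, 0, 0), c(0, 0, 1, 1)) / sqrt(2)

# Principal components: one column per component, one row per observation.
Y <- B %*% Z
dim(Y)
```

Because each column of Z has zero entries, each component in Y depends only on the variables with nonzero loadings.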

Value

Z

a p by m numerical matrix with the m sparse loading vectors

Y

an n by m numerical matrix with the m principal components

B

the numerical data matrix centered (if center=TRUE) and scaled (if scale=TRUE)

References

M. Chavent and G. Chavent, Optimal projected variance group-sparse block PCA, submitted, 2020.

M. Journee, Y. Nesterov, P. Richtarik, and R. Sepulchre. Generalized power method for sparse principal component analysis. Journal of Machine Learning Research, 11:517-553, 2010.

See Also

groupsparsePCA, pev, explainedVar

Examples

library(sparsePCA)

# Simulated data: two underlying loading vectors and ten eigenvalues
v1 <- c(1, 1, 1, 1, 0, 0, 0, 0, 0.9, 0.9)
v2 <- c(0, 0, 0, 0, 1, 1, 1, 1, -0.3, 0.3)
valp <- c(200, 100, 50, 50, 6, 5, 4, 3, 2, 1)
A <- simuPCA(50, cbind(v1, v2), valp, seed = 1)

# Three sparse PCA algorithms (Z is overwritten by each call)
Z <- sparsePCA(A, 2, c(0.5, 0.5), block = 0)$Z                # deflation
Z <- sparsePCA(A, 2, c(0.5, 0.5), block = 1)$Z                # block, different mu
Z <- sparsePCA(A, 2, c(0.5, 0.5), block = 1, mu = c(1, 1))$Z  # block, same mu

# Example with the protein data
data("protein")
Z <- sparsePCA(protein, 2, c(0.5, 0.5))$Z  # block, different mu
 


chavent/sparsePCA documentation built on Feb. 2, 2023, 1:12 p.m.