groupsparsePCA: Group-sparse PCA

groupsparsePCA    R Documentation

Group-sparse PCA

Description

This function implements group-sparse principal component analysis (PCA) using either a block optimisation algorithm or an iterative deflation algorithm. It generalizes the block optimisation approach of Journee et al. (2010) to group sparsity.

Usage

groupsparsePCA(
  A,
  m,
  lambda,
  index = 1:ncol(A),
  block = 1,
  mu = 1/1:m,
  groupsize = FALSE,
  center = TRUE,
  scale = TRUE,
  init = NULL
)

Arguments

A

a numerical data matrix of size n by p (observations by variables)

m

number of components

lambda

a numerical vector of size m giving the reduced sparsity parameters, one per component. Each reduced sparsity parameter is a value between 0 and 1, expressed relative to the theoretical upper bound of the sparsity parameter

index

a vector of integers of size p giving the group membership of each variable. By default, index=1:ncol(A) corresponds to one variable in each group

block

either 0 or 1. If block=0, deflation is used when more than one component is requested. Otherwise, a block optimisation algorithm computes the m components at once. By default, block=1

mu

numerical vector of size m with the mu parameters (required for the block algorithms only). By default, mu_j=1/j

groupsize

a logical value indicating whether the size of the groups should be taken into account. By default, groupsize=FALSE

center

a logical value indicating whether the variables should be shifted to be zero centered. By default, center=TRUE

scale

a logical value indicating whether the variables should be scaled to have unit variance. By default, scale=TRUE

init

a matrix of size p by m to initialize the loadings matrix in the block optimisation algorithm.
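
The index, center and scale arguments described above can be illustrated in base R. The following is only a sketch under stated assumptions: the data matrix and the group sizes are made up, and base R's scale() (which divides by the sample standard deviation) stands in for whatever normalisation groupsparsePCA applies internally, which may differ.

```r
# Hypothetical data: 50 observations of 10 variables
set.seed(1)
A <- matrix(rnorm(50 * 10, mean = 5, sd = 2), nrow = 50, ncol = 10)

# Default grouping: one variable per group
index_default <- 1:ncol(A)

# Hypothetical grouping: three contiguous groups of sizes 4, 4 and 2
group_sizes <- c(4, 4, 2)
index_grouped <- rep(seq_along(group_sizes), times = group_sizes)
table(index_grouped)  # number of variables in each group

# Stand-in for center=TRUE, scale=TRUE (assumption: base R scale())
B <- scale(A, center = TRUE, scale = TRUE)
round(colMeans(B), 10)  # column means are numerically zero
apply(B, 2, sd)         # column standard deviations are one
```

The index vector must have one entry per column of A, so its length should always equal ncol(A).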

Details

This function implements an optimal projected variance group-sparse block PCA algorithm, together with a deflation algorithm that applies the single-component block algorithm iteratively to each deflated data matrix.
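
The deflation idea can be sketched in base R. This is not the package's implementation: the one-component solver below is plain (non-sparse) PCA via svd(), used as a stand-in for the single-component block algorithm, and the projection deflation step is an assumption, since the deflation scheme actually used is not specified here. The helper name one_component is hypothetical.

```r
set.seed(1)
n <- 30; p <- 6; m <- 2
A <- matrix(rnorm(n * p), n, p)
B <- scale(A, center = TRUE, scale = TRUE)

# Stand-in one-component solver (assumption): leading right singular
# vector of X, i.e. the first ordinary PCA loading, with no sparsity
one_component <- function(X) svd(X, nu = 0, nv = 1)$v[, 1]

Z <- matrix(0, p, m)
Bj <- B
for (j in seq_len(m)) {
  z <- one_component(Bj)
  Z[, j] <- z
  # Projection deflation (assumption): remove the direction z from the data
  Bj <- Bj - Bj %*% tcrossprod(z)
}

# Components computed on the original (non-deflated) matrix
Y <- B %*% Z
```

With this non-sparse stand-in, the loop recovers the first m ordinary PCA loadings up to sign. With a sparse solver the loadings would generally differ from those of a block algorithm computing all m components at once.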

The block algorithm uses a numerical vector of parameters mu, usually chosen either strictly decreasing (mu_j=1/j) or all equal (mu_j=1 for all j). Strictly decreasing parameters relieve the underdetermination that occurs in some situations and lead to a solution close to the PCA solution.
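
The two usual choices of mu can be written as follows, here for a hypothetical m = 3. Note that in R the default expression 1/1:m evaluates as 1/(1:m), because the ':' operator has higher precedence than '/'.

```r
m <- 3  # hypothetical number of components

# Strictly decreasing parameters, mu_j = 1/j (same as the default 1/1:m)
mu_decreasing <- 1 / 1:m

# All-equal parameters, mu_j = 1 for all j
mu_equal <- rep(1, m)

mu_decreasing
mu_equal
```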

The principal components are defined by Y=BZ where B is the centered (if center=TRUE) and scaled (if scale=TRUE) data matrix and where Z is the group-sparse loading matrix.
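
The relation Y=BZ can be checked on a toy case in base R. In this sketch the data matrix is simulated, scale() is assumed as the centering and scaling step, and the loading matrix Z is written by hand with a group-sparse pattern (two groups, variables 1-2 and 3-5) rather than estimated by groupsparsePCA.

```r
set.seed(1)
n <- 20; p <- 5; m <- 2
A <- matrix(rnorm(n * p), n, p)

# Centered and scaled data matrix (assumption: base R scale())
B <- scale(A, center = TRUE, scale = TRUE)

# Hand-written group-sparse loadings: each column is zero on one whole group
Z <- cbind(c(1, 1, 0, 0, 0),
           c(0, 0, 1, 1, 1))
Z <- sweep(Z, 2, sqrt(colSums(Z^2)), "/")  # rescale columns to unit norm

# Principal components: one row per observation, one column per component
Y <- B %*% Z
dim(Y)
```

Because B is centered, each column of Y also has mean zero.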

Value

Z

a p by m numerical matrix with the m group-sparse loading vectors

Y

an n by m numerical matrix with the m principal components

B

the numerical data matrix centered (if center=TRUE) and scaled (if scale=TRUE)

coef

a numerical vector of size p with the coefficients to predict principal component scores of new observations

References

M. Chavent and G. Chavent, Optimal projected variance group-sparse block PCA, submitted, 2020.

M. Journee, Y. Nesterov, P. Richtarik, and R. Sepulchre. Generalized power method for sparse principal component analysis. Journal of Machine Learning Research, 11:517-553, 2010.

See Also

sparsePCA, pev, explainedVar

Examples

library(sparsePCA)

# Simulated data
v1 <- c(1, 1, 1, 1, 0, 0, 0, 0, 0.9, 0.9)
v2 <- c(0, 0, 0, 0, 1, 1, 1, 1, -0.3, 0.3)
valp <- c(200, 100, 50, 50, 6, 5, 4, 3, 2, 1)
A <- simuPCA(50, cbind(v1, v2), valp, seed = 1)

# Three groups: variables 1-4, 5-8 and 9-10
index <- rep(c(1, 2, 3), c(4, 4, 2))

# Three group-sparse PCA algorithms, each returning a 10 by 2 loading matrix
Z_deflation <- groupsparsePCA(A, 2, c(0.5, 0.5), index, block = 0)$Z     # deflation
Z_block <- groupsparsePCA(A, 2, c(0.5, 0.5), index)$Z                    # block, different mu
Z_block_same_mu <- groupsparsePCA(A, 2, c(0.5, 0.5), index, mu = c(1, 1))$Z  # block, same mu

chavent/sparsePCA documentation built on Feb. 2, 2023, 1:12 p.m.