gpower: Compute Sparse PCA using GPower method


View source: R/gpower.R

Description


GPower uses four different optimization procedures, one for each combination of $l_0$ or $l_1$ regularisation with single-unit or block computation. The function searches for a weights matrix $W \in R^{n \times k}$ that maximizes the explained variance of the data matrix $X \in R^{p \times n}$ under the regularisation constraints of the chosen case. Some of the methods use the matrix $Z \in R^{p \times k}$ as an intermediate solution. $\lambda$ is calculated by multiplying rho by the maximum possible value of $\lambda$.
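As a rough illustration of the last sentence, the sketch below scales rho by an upper bound on the penalty. The bounds used here (the largest column norm of the data for $l_1$, and its square for $l_0$) are the ones Journée et al. (2010) give for the single-unit cases; whether gpower() uses exactly these values internally is an assumption, and the variable names are made up for this example.

```r
set.seed(1)
X <- matrix(stats::rnorm(20 * 50), nrow = 20, ncol = 50)  # p = 20, n = 50
rho <- 0.1

# Euclidean norm of each of the n columns of X
col_norms <- sqrt(colSums(X^2))

# Assumed upper bounds on the penalty (single-unit cases in the paper)
lambda_max_l1 <- max(col_norms)
lambda_max_l0 <- max(col_norms)^2

# The penalty actually used is rho times the upper bound
lambda_l1 <- rho * lambda_max_l1
lambda_l0 <- rho * lambda_max_l0
```

With 0 < rho < 1 the resulting penalty always stays strictly below the bound, so at least one weight can remain non-zero.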

The objective function of the single-unit case with $l_1$ regularisation is
$$\hat{w} = \underset{\| w \| = 1}{\textrm{argmax}} \| Xw \| - \lambda \| w \|_1$$
For the single-unit case with $l_0$ regularisation, the objective function is
$$\hat{w} = \underset{\| z \| = 1}{\textrm{argmax}} \; \underset{\| w \| = 1}{\textrm{argmax}} (z^\top X w)^2 - \lambda \| w \|_0,$$
where the result is squared before the penalty $\lambda \| w \|_0$ is subtracted rather than after. To compute more than one component, the matrix $X$ is adjusted after each new component.
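The text does not say how the data matrix is adjusted between components. One common choice, assumed here purely for illustration, is projection deflation: the part of the data explained by the unit-norm weight vector w is subtracted, so the next component cannot reuse it. The weight vector below is a made-up example, not output of gpower().

```r
set.seed(1)
X <- matrix(stats::rnorm(20 * 50), nrow = 20, ncol = 50)  # p = 20, n = 50

# Hypothetical sparse weight vector of length n with unit norm
w <- c(3, 0, 4, rep(0, 47)) / 5

# Projection deflation (an assumption): remove the direction w from X
X_deflated <- X - (X %*% w) %*% t(w)

# After deflation the data has no component left along w
max(abs(X_deflated %*% w))  # zero up to floating-point rounding
```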

For the block cases, the following functions are used. For the case with $l_1$ regularisation,
$$\hat{W} = \underset{Z \in M^p_k}{\textrm{argmax}} \sum_{j=1}^k \underset{\| W_j \| = 1}{\textrm{argmax}} \mu_j Z_j^\top X W_j - \lambda_j \| W_j \|_1$$
and for the $l_0$ regularisation case,
$$\hat{W} = \underset{Z \in M^p_k}{\textrm{argmax}} \sum_{j=1}^k \underset{\| W_j \| = 1}{\textrm{argmax}} (\mu_j Z_j^\top X W_j)^2 - \lambda_j \| W_j \|_0$$

All of these objective functions are optimized with the generalized power method described by Journée et al. (2010).
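To give a feel for what a generalized power iteration looks like, here is a minimal sketch of the single-unit $l_1$ case, written from the description in Journée et al. (2010) and not taken from the package source. The function name gpower_l1_single_sketch, its stopping rule, and the starting point (the normalised column of X with the largest norm) are assumptions made for this example.

```r
# Sketch of a single-unit l1 power iteration; X is p x n, lambda is the penalty.
gpower_l1_single_sketch <- function(X, lambda, iter_max = 1000, epsilon = 1e-4) {
  col_norms <- sqrt(colSums(X^2))
  # The penalty must stay below the largest column norm, otherwise w is all zero
  stopifnot(lambda >= 0, lambda < max(col_norms))

  # Soft-thresholding: shrink towards zero by lambda, clipping at zero
  soft <- function(v) sign(v) * pmax(abs(v) - lambda, 0)

  # Start from the normalised column of X with the largest norm
  i_max <- which.max(col_norms)
  z <- X[, i_max] / col_norms[i_max]

  for (iter in seq_len(iter_max)) {
    # Project z on every column, threshold, map back and renormalise
    z_new <- as.vector(X %*% soft(as.vector(crossprod(X, z))))
    z_new <- z_new / sqrt(sum(z_new^2))
    converged <- sqrt(sum((z_new - z)^2)) < epsilon
    z <- z_new
    if (converged) break
  }

  # Recover the sparse, unit-norm weight vector from the final z
  w <- soft(as.vector(crossprod(X, z)))
  w / sqrt(sum(w^2))
}

set.seed(360)
X <- matrix(stats::rnorm(20 * 50), nrow = 20, ncol = 50)
lambda <- 0.1 * max(sqrt(colSums(X^2)))
w <- gpower_l1_single_sketch(X, lambda)
sum(w^2)      # unit norm up to rounding
mean(w == 0)  # share of weights that are exactly zero
```

The intermediate vector z has length p and plays the role of a column of $Z$, while the returned w has length n and plays the role of a column of $W$.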

Usage

gpower(
  data,
  k,
  rho,
  reg = c("l0", "l1"),
  center = c(TRUE, FALSE),
  block = c(TRUE, FALSE),
  mu = 1,
  iter_max = 1000,
  epsilon = 1e-04
)

Arguments

data

Input matrix of size (p x n) with p < n.

k

Number of components, 0 < k < p.

rho

Relative sparsity weight factor of the optimization. Either a vector of floats of size k or a single float, which is repeated k times. 0 < rho < 1.

reg

Regularisation type to use in the optimization. Either 'l0' or 'l1'. The default is 'l1' since it performed best in experiments.

center

Centers the data. Either TRUE or FALSE. The default is TRUE.

block

Optimization method. If FALSE, the components are calculated individually. If TRUE, all components are calculated at the same time. The default is FALSE.

mu

Weight $\mu_j$ applied to each component in the block. Either a vector of floats of size k or a single float, which is repeated k times. Only used if block is TRUE. The default is 1.

iter_max

Maximum number of iterations when adjusting components with gradient descent. The default is 1000.

epsilon

Tolerance (epsilon) of the gradient descent stopping criterion. The default is 1e-4.
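Both rho and mu accept either a single value or a vector of length k. The helper below is a small sketch of how such recycling could work; recycle_to_k is a hypothetical name introduced for this example and is not part of the package.

```r
# Expand a scalar to length k, or check that a vector already has length k.
recycle_to_k <- function(x, k, name) {
  if (length(x) == 1) {
    x <- rep(x, k)
  }
  if (length(x) != k) {
    stop(sprintf("`%s` must have length 1 or k = %d", name, k))
  }
  x
}

k <- 5
rho <- recycle_to_k(0.1, k, "rho")                 # scalar repeated k times
mu <- recycle_to_k(c(1, 1.5, 0.5, 2, 1), k, "mu")  # vector kept as is

# rho is documented as lying strictly between 0 and 1
stopifnot(all(rho > 0 & rho < 1))
```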

Details

Generalized power method for sparse principal component analysis. Implements the method developed by Journée et al. (2010), with a choice between l1 and l0 regularisation and between a single-unit (column-by-column) and a block approach.

Value

List containing:

weights

The PCA components

scores

Scores of the components on data

a_approx

Reconstructed version of data using the components

prop_sparse

Proportion of sparsity of the components

exp_var

Proportion of variance explained by the components

centers

Centers of the matrix data, returned if center == TRUE
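To make two of these entries more concrete, the sketch below computes scores and a sparsity proportion from a hand-made weights matrix. The formulas (scores as the data times the weights, sparsity as the overall share of zero weights) are plausible readings of the descriptions above and are assumptions, not the package's confirmed definitions.

```r
set.seed(1)
X <- matrix(stats::rnorm(20 * 50), nrow = 20, ncol = 50)  # p = 20, n = 50

# Hypothetical sparse weights matrix (n x k) with k = 2 unit-norm columns
W <- matrix(0, nrow = 50, ncol = 2)
W[1:3, 1] <- c(2, -1, 2) / 3
W[4:5, 2] <- c(3, 4) / 5

# Assumed definition: scores are the data projected on the weights (p x k)
scores <- X %*% W

# Assumed definition: share of weights that are exactly zero
prop_sparse <- mean(W == 0)
prop_sparse  # 95 of the 100 weights are zero, so this is 0.95
```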

References

Journée, M., Nesterov, Y., Richtárik, P. and Sepulchre, R. (2010) Generalized Power Method for Sparse Principal Component Analysis. Journal of Machine Learning Research. 11, 517-553.

Examples

set.seed(360)
p <- 20
n <- 50
k <- 5
data <- matrix(stats::rnorm(p * n), nrow = p, ncol = n)
rho <- 0.1
# rho <- c(0.1, 0.2, 0.1, 0.2, 0.1)
mu <- 1
# mu <- c(1, 1.5, 0.5, 2, 1)

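# Single unit with l1 regularisation (components computed one at a time)
gpower(data, k, rho, reg = 'l1', center = TRUE, block = FALSE)

# Single unit with l0 regularisation
gpower(data, k, rho, reg = 'l0', center = TRUE, block = FALSE)

# Block with l1 regularisation (all components computed together, weighted by mu)
gpower(data, k, rho, reg = 'l1', center = TRUE, block = TRUE, mu = mu)

# Block with l0 regularisation
gpower(data, k, rho, reg = 'l0', center = TRUE, block = TRUE, mu = mu)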

plofknaapje/gpowerr documentation built on Dec. 22, 2021, 8:48 a.m.