gpower: Compute Sparse PCA using GPower method
In plofknaapje/gpowerr: Sparse PCA using the GPower method


View source: R/gpower.R


GPower uses four different optimization procedures, one for each combination of $l_0$ or $l_1$ regularisation with single-unit or block computation. The function searches for a weights matrix $W \in \mathbb{R}^{n \times k}$ that maximises the explained variance of the data matrix $X \in \mathbb{R}^{p \times n}$ under the regularisation constraints of the chosen case. The matrix $Z \in \mathbb{R}^{p \times k}$ is used by some of the methods as an intermediate solution. Lambda is calculated by multiplying rho by the maximum possible value of lambda.
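The scaling of lambda by rho can be sketched as follows. The upper bounds used here (the largest column norm of $X$ for $l_1$, and its square for $l_0$) are taken from Journée et al. (2010) and are an assumption about what this package uses internally; the variable names are illustrative only.

```r
# Toy data in the documented layout: p rows, n columns.
set.seed(1)
p <- 20
n <- 50
X <- matrix(stats::rnorm(p * n), nrow = p, ncol = n)

# Euclidean norm of each column of X.
col_norms <- sqrt(colSums(X^2))

# Assumed upper bounds on lambda (Journee et al., 2010): at or above
# these values the penalised problem only has the all-zero solution.
lambda_max_l1 <- max(col_norms)
lambda_max_l0 <- max(col_norms)^2

# rho in (0, 1) rescales the bound into the lambda actually used.
rho <- 0.1
lambda_l1 <- rho * lambda_max_l1
lambda_l0 <- rho * lambda_max_l0
```

Expressing the penalty through rho keeps it on a 0 to 1 scale that does not depend on the magnitude of the data.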

The objective function of the single-unit case with $l_1$ regularisation is
$$\hat{w} = \underset{\| w \| = 1}{\textrm{argmax}} \; \| Xw \| - \lambda \| w \|_1.$$
For the single-unit case with $l_0$ regularisation, the objective function is
$$\hat{w} = \underset{\| z \| = 1}{\textrm{argmax}} \; \underset{\| w \| = 1}{\textrm{argmax}} \; (z^\top X w)^2 - \lambda \| w \|_0,$$
where the result is squared before $\lambda$ is subtracted rather than after. To compute more than one component, the matrix $X$ is adjusted after each new component.
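The text does not say how $X$ is adjusted between components. One common choice is projection deflation, which removes from $X$ the part already captured by the new weight vector. The sketch below assumes that choice; `deflate` is a hypothetical helper, not a function exported by the package.

```r
# Hypothetical helper: remove the direction w from the row space of X.
deflate <- function(X, w) {
  w <- w / sqrt(sum(w^2))      # make sure w has unit norm
  scores <- X %*% w            # p x 1 scores of the component
  X - scores %*% t(w)          # subtract the rank-one part explained by w
}

set.seed(2)
X <- matrix(stats::rnorm(6 * 4), nrow = 6, ncol = 4)
w <- c(1, 0, -1, 0)
X_deflated <- deflate(X, w)

# After deflation, X no longer has any variance along w:
# this quantity should be zero up to rounding error.
max(abs(X_deflated %*% (w / sqrt(sum(w^2)))))
```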

For the block cases, the following functions are used. For the case with $l_1$ regularisation,
$$\hat{W} = \underset{Z \in M^p_k}{\textrm{argmax}} \sum_{j=1}^k \underset{\| W_j \| = 1}{\textrm{argmax}} \; \mu_j Z_j^\top X W_j - \lambda_j \| W_j \|_1$$
and for the $l_0$ regularisation case,
$$\hat{W} = \underset{Z \in M^p_k}{\textrm{argmax}} \sum_{j=1}^k \underset{\| W_j \| = 1}{\textrm{argmax}} \; (\mu_j Z_j^\top X W_j)^2 - \lambda_j \| W_j \|_0.$$

All of these functions are optimized using the generalized power approach as described in the paper by Journée et al. (2010).
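As an illustration of the generalized power approach, here is a minimal sketch of the single-unit $l_1$ iteration, written from the algorithm in Journée et al. (2010). It is not the package's source code: `gpower_l1_single` is a hypothetical helper, and the initialisation and stopping rule are assumptions that follow the paper.

```r
# Hypothetical helper: one sparse component via the single-unit l1 iteration.
# X is p x n, lambda must be below the largest column norm of X.
gpower_l1_single <- function(X, lambda, iter_max = 1000, epsilon = 1e-4) {
  col_norms <- sqrt(colSums(X^2))
  stopifnot(lambda < max(col_norms))

  # Start from the column of X with the largest norm, scaled to unit length.
  i_max <- which.max(col_norms)
  z <- X[, i_max] / col_norms[i_max]

  # Objective in z after w has been maximised out analytically.
  objective <- function(z) {
    sum(pmax(abs(crossprod(X, z)) - lambda, 0)^2)
  }

  f_old <- objective(z)
  for (iter in seq_len(iter_max)) {
    xz <- as.vector(crossprod(X, z))                 # X^T z, length n
    thresh <- sign(xz) * pmax(abs(xz) - lambda, 0)   # soft thresholding
    grad <- as.vector(X %*% thresh)                  # ascent direction
    z <- grad / sqrt(sum(grad^2))                    # back to the unit sphere
    f_new <- objective(z)
    # Stop when the relative improvement falls below epsilon.
    if ((f_new - f_old) / f_old < epsilon) break
    f_old <- f_new
  }

  # Recover the sparse unit-norm weight vector w from z.
  xz <- as.vector(crossprod(X, z))
  w <- sign(xz) * pmax(abs(xz) - lambda, 0)
  w / sqrt(sum(w^2))
}

set.seed(360)
X <- matrix(stats::rnorm(20 * 50), nrow = 20, ncol = 50)
lambda <- 0.5 * max(sqrt(colSums(X^2)))
w <- gpower_l1_single(X, lambda)
sum(w^2)       # unit norm up to rounding
mean(w == 0)   # share of weights that are exactly zero
```

The soft-thresholding step is what produces exact zeros in `w`; larger values of `lambda` zero out more entries.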

gpower(
  data,
  k,
  rho,
  reg = c("l0", "l1"),
  center = c(TRUE, FALSE),
  block = c(TRUE, FALSE),
  mu = 1,
  iter_max = 1000,
  epsilon = 1e-04
)

`data`	Input matrix of size (p x n) with p < n.
`k`	Number of components, 0 < k < p.
`rho`	Relative sparsity weight factor of the optimization. Either a vector of floats of size k or a single float, which is repeated k times. 0 < rho < 1.
`reg`	Regularisation type to use in the optimization. Either 'l0' or 'l1'. The default is 'l1' since it performed best in experiments.
`center`	Centers the data. Either TRUE or FALSE. The default is TRUE.
`block`	Optimization method. If FALSE, the components are calculated individually. If TRUE, all components are calculated at the same time. The default is FALSE.
`mu`	Weight applied to each component in the block objective. Either a vector of floats of size k or a single float, which is repeated k times. Only used if block is TRUE. The default is 1.
`iter_max`	Maximum iterations when adjusting components with gradient descent. The default is 1000.
`epsilon`	Epsilon of the gradient descent stopping function. The default is 1e-4.

Generalized power method for sparse principal component analysis. Implements the method developed by Journee et al. (2010), with a choice between L1 and L0 regularisation and between a single-unit (column-based) and a block approach.

List containing:

weights: The PCA components
scores: Scores of the components on data
a_approx: Reconstructed version of data using the components
prop_sparse: Proportion of sparsity of the components
exp_var: Proportion of variance explained by the components
centers: Centers of matrix data if center == TRUE
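To make two of the returned quantities concrete, the sketch below computes scores and a sparsity proportion from a toy weights matrix. The formulas are assumptions about how these fields could be derived (scores as the product of data and weights, sparsity as the share of zero weights), not a statement of the package's exact computation; the numbers are illustrative only.

```r
# Toy sparse weights matrix W (n x k) with unit-norm columns.
n <- 4
k <- 2
W <- cbind(c(1, 0, 0, 0),
           c(0, 0.6, 0.8, 0))

# Toy data matrix X (p x n).
set.seed(3)
X <- matrix(stats::rnorm(5 * n), nrow = 5, ncol = n)

# Assumed definition: scores are the data projected onto the weights (p x k).
scores <- X %*% W

# Assumed definition: share of weights that are exactly zero.
prop_sparse <- mean(W == 0)   # 5 of the 8 entries are zero, so 0.625
```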

Journee, M., Nesterov, Y., Richtarik, P. and Sepulchre, R. (2010) Generalized Power Method for Sparse Principal Component Analysis. Journal of Machine Learning Research. 11, 517-553.

set.seed(360)
p <- 20
n <- 50
k <- 5
data <- matrix(stats::rnorm(p * n), nrow = p, ncol = n)
rho <- 0.1
# rho <- c(0.1, 0.2, 0.1, 0.2, 0.1)
mu <- 1
# mu <- c(1, 1.5, 0.5, 2, 1)

# Single unit with l1 regularisation
gpower(data, k, rho, reg = "l1", center = TRUE, block = FALSE)

# Single unit with l0 regularisation
gpower(data, k, rho, reg = "l0", center = TRUE, block = FALSE)

# Block with l1 regularisation
gpower(data, k, rho, reg = "l1", center = TRUE, block = TRUE, mu = mu)

# Block with l0 regularisation
gpower(data, k, rho, reg = "l0", center = TRUE, block = TRUE, mu = mu)