msPCA: Compute Sparse Multi-Source Principal Components

View source: R/SRPCAMS.R

msPCA    R Documentation

Description

Estimates sparse principal components from multiple covariance or correlation matrices using an ADMM-based optimization routine (see Puchhammer, Wilms and Filzmoser, 2024).

Usage

msPCA(
  eta,
  gamma,
  COVS,
  k = ncol(COVS[[1]]),
  adjust_eta = TRUE,
  convergence_plot = FALSE,
  n_max = 200,
  rho = list(NA, TRUE, 100, 1),
  eps = c(1e-05, 1e-04, 0.1, 50),
  show_progress = FALSE
)

Arguments

eta

Numeric or numeric vector. Controls the overall sparsity level. If a single value is provided, it will be used directly. If a vector is given, the optimal value will be selected via internal model selection.

gamma

Numeric or numeric vector. Controls the distribution of sparsity across components. If a single value is provided together with a vector of eta values, the optimal eta is selected automatically for that gamma.

COVS

A list of covariance or correlation matrices (one per data source or group).
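
A minimal sketch of how such a list might be assembled, assuming the raw observations sit in one numeric matrix with a grouping vector that marks the data source of each row. The names X, groups and COVS_demo are hypothetical, and plain cov() is used only for illustration; any covariance or correlation estimator could be substituted.

```r
# Hypothetical data: 90 observations on 4 variables, split into 3 sources.
set.seed(1)
X <- matrix(rnorm(90 * 4), nrow = 90, ncol = 4)
groups <- rep(c("A", "B", "C"), each = 30)

# One covariance matrix per source, collected in a plain list.
COVS_demo <- lapply(split(seq_len(nrow(X)), groups),
                    function(idx) cov(X[idx, , drop = FALSE]))

# All matrices should be symmetric and share the same dimension.
stopifnot(length(unique(lapply(COVS_demo, dim))) == 1)
stopifnot(all(vapply(COVS_demo, isSymmetric, logical(1))))
```

The resulting list has one p x p matrix per source and can be passed as the COVS argument.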

k

Integer. Number of principal components to compute. Defaults to ncol(COVS[[1]]), so all components are estimated if k is not specified.

adjust_eta

Logical. If TRUE (default), the sparsity parameter eta is adjusted based on the variance structure.

convergence_plot

Logical. If TRUE, a convergence diagnostic plot of either the residuals or the loading entries is displayed (default: FALSE).

n_max

Integer. Maximum number of ADMM iterations (default: 200).

rho

List of parameters controlling the ADMM penalty parameter rho, with the following elements:

1

Initial value for rho (default: NA).

2

Logical; whether to increase rho if convergence is not reached (default: TRUE).

3

Maximum value for rho (default: 100). You may need to increase this for high-dimensional problems.

4

Step size for increasing rho (default: 1).

eps

Numeric vector of tolerance parameters used in optimization. Includes:

1

Tolerance for soft-thresholding (default: 1e-5).

2

Tolerance for ADMM convergence (default: 1e-4).

3

Tolerance for convergence of the internal root-finding step (default: 1e-1).

4

Maximum number of iterations for the root finder (default: 50).
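
Because rho and eps are positional, it can help to annotate each entry when deviating from the defaults. The following is a sketch only: rho_custom, eps_custom and covs are hypothetical names, the raised maximum of 500 is an arbitrary illustration, and the fit is attempted only if the ssMRCD package is installed.

```r
# Positions follow the argument descriptions; the values are illustrative.
rho_custom <- list(
  NA,    # 1: initial rho (NA is the documented default)
  TRUE,  # 2: increase rho if convergence is not reached
  500,   # 3: maximum rho, raised here from the default of 100
  1      # 4: step size for increasing rho
)
eps_custom <- c(
  1e-5,  # 1: tolerance for soft-thresholding
  1e-4,  # 2: tolerance for ADMM convergence
  1e-1,  # 3: tolerance for the internal root-finding step
  50     # 4: maximum number of root-finder iterations
)

# Run a small fit only when the package is available.
if (requireNamespace("ssMRCD", quietly = TRUE)) {
  set.seed(1)
  covs <- list(diag(c(1.1, 0.9, 0.6, 0.5, 2)),
               crossprod(matrix(runif(25, -1, 1), 5, 5)))
  fit <- ssMRCD::msPCA(eta = 1, gamma = 0.5, COVS = covs, k = 3,
                       rho = rho_custom, eps = eps_custom)
  print(fit$converged)  # documented convergence flag
}
```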

show_progress

Logical. If TRUE, progress bars are displayed (default: FALSE).

Value

An object of class "msPCA" containing the following elements:

PC          Array of dimension p x k x N containing the loading vectors.
p           Number of variables.
N           Number of neighborhoods (data sources).
k           Number of components.
COVS        List of covariance matrices, sorted by neighborhood.
gamma       Sparsity distribution parameter.
eta         Sparsity level parameter.
converged   Logical; whether ADMM converged under the given specifications.
n_steps     Number of ADMM steps used.
residuals   Primary and secondary residuals.
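
The p x k x N layout of PC can be illustrated without running the estimator. The sketch below builds a mock array with that shape (PC_mock and COVS_mock are hypothetical stand-ins filled with ordinary eigenvectors, not msPCA output) and shows how the loadings of one source might be extracted and summarised.

```r
# Mock objects with the documented layout: p = 4 variables, k = 2 components,
# N = 3 sources. Real loadings would come from msPCA(); these are placeholders.
set.seed(1)
p <- 4; k <- 2; N <- 3
PC_mock <- array(0, dim = c(p, k, N))
COVS_mock <- vector("list", N)
for (n in seq_len(N)) {
  A <- matrix(rnorm(p * p), p, p)
  COVS_mock[[n]] <- crossprod(A)                           # symmetric PSD matrix
  PC_mock[, , n] <- eigen(COVS_mock[[n]])$vectors[, 1:k]   # orthonormal columns
}

# Loadings of all components for source 2 (a p x k matrix).
load_2 <- PC_mock[, , 2]

# Variance captured by each component in source 2: v' S v for each column v.
expl_var_2 <- diag(t(load_2) %*% COVS_mock[[2]] %*% load_2)

# Share of exactly-zero loadings per source, a simple sparsity summary.
sparsity <- apply(PC_mock == 0, 3, mean)
```

With a fitted object, the same indexing would presumably apply to its PC and COVS elements; the dense mock loadings here give a sparsity share of zero.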

References

Puchhammer, P., Wilms, I., & Filzmoser, P. (2024). Sparse outlier-robust PCA for multi-source data. arXiv preprint. doi:10.48550/arXiv.2407.16299

See Also

ssMRCD, plot.msPCA, summary.msPCA, biplot.msPCA, screeplot.msPCA, scores, align

Examples


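library(ssMRCD)

# Fix the seed so the random covariance matrices are reproducible.
set.seed(1)

# Three 5 x 5 covariance matrices, one per data source.
C1 = diag(c(1.1, 0.9, 0.6, 0.5, 2))
C2 = matrix(runif(25, -1, 1), 5, 5)
C2 = t(C2) %*% C2   # symmetric positive semi-definite
C3 = matrix(runif(25, -1, 1), 5, 5)
C3 = t(C3) %*% C3

# Single eta: the value is used directly.
pca1 = msPCA(eta = 1, gamma = 0.5, COVS = list(C1, C2, C3), k = 3,
             n_max = 100, rho = list(NA, TRUE, 100, 1), show_progress = FALSE)
summary(pca1)

# Vector of eta values: the optimal eta is selected internally.
pca2 = msPCA(eta = seq(0, 3, 0.25), gamma = 1, COVS = list(C1, C2, C3), k = 3,
             n_max = 100, rho = list(NA, TRUE, 100, 1), show_progress = FALSE)
summary(pca2)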

ssMRCD documentation built on Nov. 5, 2025, 7:44 p.m.