Dpca: Distributed Principal Component Analysis (DPCA)


Dpca    R Documentation

Distributed Principal Component Analysis (DPCA)

Description

Performs distributed PCA on a data matrix partitioned into subsets.

Usage

Dpca(data, K, nk)

Arguments

data

A numeric matrix or data frame containing the data, where rows are observations and columns are variables.

K

Integer, the number of subsets to partition the data into.

nk

Integer, the size of each subset (number of rows per subset).

Details

The function splits the input data matrix into K subsets of size nk each. The parameters n (number of rows) and p (number of columns) are automatically derived from the input data matrix as n = nrow(data) and p = ncol(data).
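
The partitioning described above can be sketched in base R. This is a minimal illustration, not the package's implementation: the helper name split_rows is hypothetical, and it assumes the subsets are consecutive, non-overlapping blocks of rows with K * nk <= n. Dpca() may assign rows to subsets differently.

```r
# Hypothetical helper: split the rows of `data` into K consecutive blocks of nk rows.
# Assumption: blocks are consecutive and non-overlapping; Dpca() may partition differently.
split_rows <- function(data, K, nk) {
  data <- as.matrix(data)
  n <- nrow(data)                 # number of observations, derived from the input
  stopifnot(K * nk <= n)          # the sketch needs enough rows for K full blocks
  lapply(seq_len(K), function(k) {
    rows <- ((k - 1) * nk + 1):(k * nk)
    data[rows, , drop = FALSE]    # keep matrix shape even when nk == 1
  })
}

set.seed(1)
X <- matrix(rnorm(1000 * 8), ncol = 8)
subsets <- split_rows(X, K = 20, nk = 50)
length(subsets)      # 20
dim(subsets[[1]])    # 50 8
```

Each element of subsets is an nk-by-p matrix, so any per-subset computation can be applied with lapply() or vapply().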

Value

A list containing:

  • MSEXp: Minimum squared reconstruction error.

  • MSEvp: MSE of eigenvectors.

  • MSESp: MSE of covariance matrix.

  • kopt: Optimal subset index.
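
To give a feel for how a minimum reconstruction error and an optimal subset index relate, the sketch below scores each subset and picks the one with the smallest score. It rests on assumptions that this page does not confirm: the helper recon_mse is hypothetical, the number of retained components d is chosen here for illustration (it is not an argument of Dpca()), and the error is taken to be the mean squared error of reconstructing a centred subset from its own leading principal components. The values returned by Dpca() may be defined differently.

```r
# Hypothetical helper: mean squared error of reconstructing a subset from its
# first d principal components. Assumption: not necessarily the criterion used by Dpca().
recon_mse <- function(Xk, d) {
  Xc <- sweep(Xk, 2, colMeans(Xk))                  # centre each column
  V <- eigen(cov(Xk), symmetric = TRUE)$vectors     # eigenvectors of the sample covariance
  V <- V[, seq_len(d), drop = FALSE]                # keep the d leading directions
  Xhat <- Xc %*% V %*% t(V)                         # project onto their span
  mean((Xc - Xhat)^2)
}

set.seed(1)
K <- 20; nk <- 50; p <- 8
d <- 6                                              # illustrative choice only
X <- matrix(rnorm(K * nk * p), ncol = p)

# Score every subset of consecutive rows (assumed partitioning scheme)
errs <- vapply(seq_len(K), function(k) {
  rows <- ((k - 1) * nk + 1):(k * nk)
  recon_mse(X[rows, , drop = FALSE], d)
}, numeric(1))

kopt_sketch <- which.min(errs)   # index of the subset with the smallest error
min_err <- errs[kopt_sketch]     # the corresponding minimum error
```

In a real call, the corresponding quantities would be read from the returned list, for example res$kopt and res$MSEXp for res <- Dpca(data, K, nk).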

Examples

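library(FPCdpca)

K <- 20        # number of subsets
nk <- 50       # rows per subset
nr <- 10       # sets how many Poisson-distributed (contaminating) entries are generated: nr * p
p <- 8         # number of variables (columns)
n <- K * nk    # total number of rows
# n * p values in total: standard-normal entries followed by nr * p Poisson(100) entries
data <- matrix(c(rnorm((n - nr) * p, 0, 1), rpois(nr * p, 100)), ncol = p)
Dpca(data = data, K = K, nk = nk)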

FPCdpca documentation built on Jan. 21, 2026, 9:08 a.m.
