prcomp.HDF5Matrix: Principal Component Analysis of an HDF5Matrix

View source: R/S3_decompositions.R

prcomp.HDF5Matrix    R Documentation

Principal Component Analysis of an HDF5Matrix

Description

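Performs block-wise principal component analysis (PCA) on a matrix stored in an HDF5 file, without loading the full data set into RAM. The interface follows stats::prcomp(), with additional arguments that control the block-wise computation; tol and rank. are accepted for compatibility but ignored.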
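For orientation, the in-memory computation that this interface is modelled on can be sketched in base R: centre the columns, take the SVD, and rescale the singular values. The sketch uses only base R on a small matrix held in RAM; it says nothing about how the HDF5 method organises its on-disk blocks.

```r
# In-memory reference: PCA as the SVD of the column-centred matrix.
set.seed(1)
M  <- matrix(rnorm(1000), nrow = 100, ncol = 10)

Mc <- scale(M, center = TRUE, scale = FALSE)   # subtract column means
s  <- svd(Mc)

sdev     <- s$d / sqrt(nrow(M) - 1)            # standard deviations of the PCs
rotation <- s$v                                # variable loadings
scores   <- Mc %*% s$v                         # individual coordinates

# Same fit via stats::prcomp(), for comparison
ref <- stats::prcomp(M, center = TRUE, scale. = FALSE)
```

The values in sdev, rotation and scores correspond to ref$sdev, ref$rotation and ref$x, with the columns of the last two determined only up to sign.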

Usage

## S3 method for class 'HDF5Matrix'
prcomp(
  x,
  retx = TRUE,
  center = TRUE,
  scale. = FALSE,
  tol = NULL,
  rank. = NULL,
  ncomponents = 0L,
  k = 2L,
  q = 1L,
  method = "auto",
  rankthreshold = 0,
  svdgroup = "SVD/",
  overwrite = FALSE,
  threads = -1L,
  ...
)

Arguments

x

An HDF5Matrix object.

retx

Logical. If TRUE (default), the individual coordinates are returned in the x element. If FALSE, the x element of the returned object is NULL.

center

Logical. Subtract column means before PCA (default TRUE).

scale.

Logical. Divide by column SDs before PCA (default FALSE).

tol

Ignored. Present for compatibility with stats::prcomp().

rank.

Ignored. Present for compatibility with stats::prcomp().

ncomponents

Integer. Number of PCs to compute (0 = all, default).

k

Number of local SVDs per incremental level (default 2).

q

Number of incremental levels (default 1).

method

Computation method: "auto" (default), "blocks", or "full".

rankthreshold

Numeric in [0, 0.1]. Rank approximation threshold (default 0).

svdgroup

HDF5 group for intermediate SVD storage (default "SVD/").

overwrite

Logical. If TRUE, recompute the PCA even when results already exist (default FALSE).

threads

Integer. Number of OpenMP threads to use (default -1 = auto-detect).

...

Ignored (S3 compatibility).
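The k and q arguments suggest a hierarchical scheme in which the matrix is split into blocks, each block gets its own local SVD, and the partial results are merged. The base R sketch below shows a single level of such a scheme with k = 2 column blocks. It is an assumption about the general idea, not a description of the package's implementation, and the helper name block_svd_one_level is hypothetical.

```r
# One level of a block-wise SVD (illustrative only; helper name is hypothetical).
block_svd_one_level <- function(X, k = 2L) {
  # Assign columns to k contiguous blocks.
  block_id <- cut(seq_len(ncol(X)), breaks = k, labels = FALSE)
  # Local SVD per block; keep U %*% diag(d), which preserves X_i %*% t(X_i).
  parts <- lapply(split(seq_len(ncol(X)), block_id), function(cols) {
    s <- svd(X[, cols, drop = FALSE])
    s$u %*% diag(s$d, nrow = length(s$d))
  })
  # Merge: the concatenated partial factors share their singular values
  # and left singular vectors with X itself.
  svd(do.call(cbind, parts))
}

set.seed(1)
X    <- scale(matrix(rnorm(1000), 100, 10), center = TRUE, scale = FALSE)
bs   <- block_svd_one_level(X, k = 2L)
full <- svd(X)
```

The singular values in bs$d agree with full$d up to floating-point error, because the sum of the per-block products X_i %*% t(X_i) equals X %*% t(X). Only one block needs to be decomposed at a time, which is what makes this kind of scheme attractive for data that do not fit in RAM.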

Value

An object of class c("HDF5PCA", "list") with elements:

sdev

Numeric vector. Standard deviations of the PCs.

rotation

HDF5Matrix. Variable loadings (rotation matrix).

x

HDF5Matrix or NULL. Individual coordinates.

center

Logical. Whether columns were centered.

scale

Logical. Whether columns were scaled.

cumvar

Numeric vector. Cumulative variance explained (percent).

lambda

Numeric vector. Eigenvalues.

var.cos2

HDF5Matrix. Squared cosines for variables.

ind.cos2

HDF5Matrix. Squared cosines for individuals.

ind.contrib

HDF5Matrix. Contributions of individuals to PCs.

file

Character. Path to the HDF5 file with all results.
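The elements beyond those of a standard prcomp object (cumvar, lambda and the cos2/contrib matrices) correspond to quantities commonly reported by PCA software. The base R sketch below computes them from an in-memory stats::prcomp() fit using the usual textbook definitions. Whether the HDF5 method uses exactly these normalisations (for example n versus n - 1 in the eigenvalues) is an assumption that should be checked against the package; the variable names here are illustrative.

```r
set.seed(1)
M   <- matrix(rnorm(1000), 100, 10)
fit <- stats::prcomp(M, center = TRUE, scale. = TRUE)

lambda <- fit$sdev^2                           # eigenvalues (variance per PC)
cumvar <- 100 * cumsum(lambda) / sum(lambda)   # cumulative variance explained, percent

sq <- fit$x^2                                  # squared individual coordinates

# Quality of representation: share of each individual's squared distance
# from the origin that falls on each PC (rows sum to 1).
ind_cos2 <- sq / rowSums(sq)

# Contribution of each individual to each PC, in percent (columns sum to 100).
ind_contrib <- 100 * sweep(sq, 2, colSums(sq), "/")

# For standardised variables: squared correlation between variable and PC.
var_cos2 <- sweep(fit$rotation, 2, fit$sdev, "*")^2
```

With all components retained, each row of ind_cos2 and of var_cos2 sums to 1, and each column of ind_contrib sums to 100.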

Examples


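library(BigDataStatMeth)

# Write a 100 x 10 random matrix to a temporary HDF5 file
tmp <- tempfile(fileext = ".h5")
X   <- hdf5_create_matrix(tmp, "data/M", data = matrix(rnorm(1000), 100, 10))

# Block-wise PCA on centred, unscaled columns
pca <- prcomp(X, center = TRUE, scale. = FALSE)
cat("Cumulative variance explained (PC1-3, %):", pca$cumvar[1:3], "\n")
dim(pca$rotation)   # 10 x nPC
dim(pca$x)          # 100 x nPC

# Close open HDF5 objects and remove the temporary file
hdf5_close_all()
unlink(tmp)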



BigDataStatMeth documentation built on May 15, 2026, 1:07 a.m.