ComDim_PCA: ComDim_PCA

View source: R/ComDim_PCA.R

ComDim_PCA    R Documentation

ComDim_PCA

Description

Finding common dimensions in multi-block datasets.

Usage

ComDim_PCA(
  MB = MB,
  ndim = NULL,
  normalise = FALSE,
  threshold = 1e-10,
  loquace = FALSE,
  CompMethod = "Normal",
  Partitions = 1
)

Arguments

MB

A MultiBlock object.

ndim

Number of Common Dimensions.

normalise

Whether to apply normalisation: FALSE (default) or TRUE.

threshold

Convergence threshold. The iterations stop when the "difference of fit" falls below this value (default 1e-10).

loquace

Whether to display the calculation times: FALSE (default) or TRUE.

CompMethod

Computation method, used to speed up the analysis of very large MultiBlocks: 'Normal' (default), 'Kernel', 'PCT', 'Tall' or 'Wide'.

Partitions

To speed up the analysis of very large MultiBlocks. Only used when CompMethod is 'Tall' or 'Wide'.
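
The role of threshold can be illustrated with a minimal base-R sketch of the iteration for a single common dimension. It is not the package's internal code. It assumes the commonly described ComDim scheme of alternating between a unit-norm consensus score and the block saliences, and it assumes one possible definition of the "difference of fit" (the change in the residual of each X_b X_b' after removing lambda_b q q'). The block names, sizes, and the cap of 500 iterations are illustrative only.

```r
set.seed(42)
n <- 10
# Hypothetical mean-centred blocks sharing the same 10 samples
blocks <- list(
  b1 = scale(matrix(rnorm(n * 5), n, 5), center = TRUE, scale = FALSE),
  b2 = scale(matrix(rnorm(n * 8), n, 8), center = TRUE, scale = FALSE)
)
threshold <- 1e-10
lambda  <- rep(1, length(blocks))  # start from equal saliences
old_fit <- Inf
n_iter  <- 0

repeat {
  n_iter <- n_iter + 1
  # Salience-weighted concatenation W = [sqrt(l1) X1 | sqrt(l2) X2]
  W <- do.call(cbind, unname(Map(function(X, l) sqrt(l) * X, blocks, lambda)))
  # Unit-norm consensus score: dominant left singular vector of W
  q <- svd(W, nu = 1, nv = 0)$u[, 1]
  # Salience update: lambda_b = q' X_b X_b' q
  lambda <- sapply(blocks, function(X) sum(crossprod(X, q)^2))
  # Assumed fit criterion: sum_b || X_b X_b' - lambda_b q q' ||_F^2
  fit <- sum(sapply(seq_along(blocks), function(b) {
    sum((tcrossprod(blocks[[b]]) - lambda[b] * tcrossprod(q))^2)
  }))
  # Stop when the change in fit drops below threshold (or at the safety cap)
  if (abs(old_fit - fit) < threshold || n_iter >= 500) break
  old_fit <- fit
}
```

A smaller threshold demands more iterations before the loop stops; the safety cap is only there so that the sketch always terminates.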

Value

A ComDim object. Slots for supervised analysis (R2Y, Q2, DQ2, VIP, VIP.block, PLS.model, cv, Prediction) are empty. The populated slots are:

Method

"PCA".

ndim

Number of Common Dimensions extracted.

Q.scores

Global scores matrix (n \times ndim). Column names are CC1, CC2, etc.; row names are sample names. Each column \mathbf{q}_a is a unit-norm consensus score, the dominant left singular vector of the salience-weighted concatenated blocks \mathbf{W} = [\sqrt{\lambda_1}\mathbf{X}_1 \mid \cdots \mid \sqrt{\lambda_B}\mathbf{X}_B].

T.scores

Named list of block-specific local scores matrices (n \times ndim each). For block b and component a: local loading \mathbf{p}_{ba} = \mathbf{X}_b'\mathbf{q}_a and local score \mathbf{t}_{ba} = \mathbf{X}_b\,\mathbf{p}_{ba}(\mathbf{p}_{ba}'\mathbf{p}_{ba})^{-1}.

P.loadings

Global loadings matrix (p_{tot} \times ndim). Column a is \mathbf{P}_a = \mathbf{X}'\mathbf{q}_a, where \mathbf{X} is the mean-centred (and optionally normalised) concatenated blocks.

Saliences

Block salience (weight) matrix (ntable \times ndim, row names = block names). Entry (b,a) is \lambda_{ba} = \mathbf{q}_a'\mathbf{X}_b\mathbf{X}_b'\mathbf{q}_a, the variance of block b captured by global score a.

R2X

Proportion of multi-block inertia captured by each component (named vector, length ndim). Let d_a be the leading singular value of \mathbf{W} for component a (stored as Singular_a = d_a^2); then

R2X_a = Singular_a^2 \big/ \sum_k Singular_k^2 = d_a^4 \big/ \sum_k d_k^4.

Singular

Squared leading singular values of \mathbf{W}, one per component: Singular_a = d_a^2.

Mean

List with MeanMB: named list of column-mean vectors per block, used for mean-centring.

Norm

List with NormMB: Frobenius norms used for block normalisation (all ones when normalise = FALSE).

variable.block

Character vector (length p_{tot}) indicating the block name of each row in P.loadings.

runtime

Total computation time in seconds.
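
The relationships between the Q.scores, Saliences, T.scores, P.loadings, variable.block and R2X slots described above can be checked numerically with a short base-R sketch. It does not call the package. It takes a single pass with equal saliences as a stand-in for the converged global score, and the block names, sizes, and the three Singular values are hypothetical.

```r
set.seed(7)
n <- 10
raw <- list(b1 = matrix(rnorm(n * 5), n, 5),
            b2 = matrix(rnorm(n * 8), n, 8))

# Mean-centre each block and keep the column means (cf. Mean$MeanMB)
means  <- lapply(raw, colMeans)
blocks <- Map(function(X, m) sweep(X, 2, m), raw, means)

# Stand-in global score: with all saliences equal to 1, W is just X
X  <- do.call(cbind, unname(blocks))   # concatenated blocks, n x p_tot
sv <- svd(X, nu = 1, nv = 0)
q  <- sv$u[, 1]                        # unit-norm global score

# Saliences: lambda_b = q' X_b X_b' q
saliences <- sapply(blocks, function(Xb) sum(crossprod(Xb, q)^2))

# Local loadings p_b = X_b' q and local scores t_b = X_b p_b (p_b' p_b)^{-1}
p_local <- lapply(blocks, function(Xb) crossprod(Xb, q))
t_local <- Map(function(Xb, p) Xb %*% p / drop(crossprod(p)), blocks, p_local)

# Global loadings P = X' q, with one block label per variable
P <- crossprod(X, q)
variable_block <- rep(names(blocks), times = sapply(blocks, ncol))

# R2X from hypothetical Singular values (Singular_a = d_a^2) of three components
Singular <- c(3, 2, 1)
R2X <- Singular^2 / sum(Singular^2)    # 9/14, 4/14, 1/14
```

In this single-pass setting the saliences sum to the squared leading singular value of X, which matches the definition Singular_a = d_a^2, and each local score satisfies q' t_b = 1.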

References

Jouan-Rimbaud Bouveresse D, Rutledge DN (2024). A synthetic review of some recent extensions of ComDim. Journal of Chemometrics, 38(5), e3454. doi:10.1002/cem.3454

Original MATLAB implementation: https://github.com/DNRutledge/ComDim/

Examples

# Example 1: two data blocks.
b1 <- matrix(rnorm(500), 10, 50) # 10 samples, 50 variables
b2 <- matrix(rnorm(800), 10, 80) # 10 samples, 80 variables
mb <- MultiBlock(Data = list(b1 = b1, b2 = b2))
results <- ComDim_PCA(mb, 2)

# Example 2: two data blocks with different numbers of replicates
b1 <- matrix(rnorm(500), 10, 50)
batch_b1 <- rep(1, 10)
b2 <- matrix(rnorm(2400), 30, 80)
batch_b2 <- c(rep(1, 10), rep(2, 10), rep(3, 10))
mb <- MultiBlock(
  Samples = list(
    b1 = paste0("samples_", 1:10),
    b2 = rep(paste0("samples_", 1:10), 3)
  ),
  Data = list(b1 = b1, b2 = b2),
  Batch = list(b1 = batch_b1, b2 = batch_b2),
  ignore.size = TRUE
)
rw <- SplitRW(mb)
results <- ComDim_PCA(rw, 2)

R.ComDim documentation built on May 13, 2026, 9:07 a.m.