SimulateMultiBlock: SimulateMultiBlock

View source: R/SimulateMultiBlock.R

SimulateMultiBlock    R Documentation

SimulateMultiBlock

Description

Generate a synthetic MultiBlock dataset built from a known number of orthogonal latent sources plus Gaussian noise. Useful for benchmarking and testing ComDim functions.

Usage

SimulateMultiBlock(
  n = 500L,
  p = 2000L,
  n_sources = 4L,
  noise = 0.05,
  n_blocks = 2L
)

Arguments

n

Number of samples. Default: 500.

p

Total number of variables (split evenly across blocks). Must be divisible by n_blocks and by n_sources. Default: 2000.

n_sources

Number of orthogonal latent sources. Default: 4.

noise

Fraction of the total variance attributed to noise; must lie in the open interval (0, 1). Default: 0.05 (5% noise).

n_blocks

Number of blocks to split the variables into. Default: 2.
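
The constraints on these arguments can be checked up front. The snippet below is a minimal sketch only: check_sim_args is a hypothetical helper, not part of R.ComDim, and this page does not state whether SimulateMultiBlock itself raises an error on invalid input.

```r
# Hypothetical helper (not part of R.ComDim): checks the documented constraints
# and returns the resulting block and segment widths.
check_sim_args <- function(p, n_sources, noise, n_blocks) {
  stopifnot(
    p %% n_blocks == 0,      # blocks must have equal width
    p %% n_sources == 0,     # source segments must have equal width
    noise > 0, noise < 1     # noise fraction lies in the open interval (0, 1)
  )
  c(block_width = p / n_blocks, segment_width = p / n_sources)
}

# Widths implied by the default arguments
widths <- check_sim_args(p = 2000, n_sources = 4, noise = 0.05, n_blocks = 2)
```

With the defaults, each block holds 1000 variables and each source loads primarily on a segment of 500 variables.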

Details

The dataset is constructed as follows:

  1. n_sources score vectors (n × n_sources) are drawn from a standard normal distribution and orthonormalised by QR decomposition.

  2. Loading vectors (n_sources × p) are built so that each source loads primarily (SD = 1) on one equal-sized variable segment, with small cross-loadings (SD = 0.10) on the remaining variables.

  3. The true (noise-free) signal X = TP is computed, where T is the score matrix from step 1 and P is the loading matrix from step 2.

  4. Gaussian noise is added with variance noise_var = signal_var × noise / (1 - noise), so that noise_var / (signal_var + noise_var) = noise.

  5. The p variables are split into n_blocks equal-width blocks, each assembled as a named element of the returned MultiBlock.
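
The five steps above can be sketched in base R as follows. This is an illustration under stated assumptions, not the package source: the variable names (T_scores, P_load, X_signal, blocks) are made up for the example, signal_var is assumed to be the variance over all entries of the signal matrix, and the result is left as a plain named list rather than wrapped in a MultiBlock object.

```r
# Illustrative base-R sketch of the construction; not the R.ComDim implementation.
set.seed(1)  # assumption: fixed seed so the sketch is reproducible
n <- 100; p <- 200; n_sources <- 4; noise <- 0.05; n_blocks <- 2

# 1. Random normal scores, orthonormalised by QR (columns become orthonormal).
T_scores <- qr.Q(qr(matrix(rnorm(n * n_sources), n, n_sources)))

# 2. Loadings: SD 0.10 everywhere, then SD 1 on each source's own segment.
seg <- p / n_sources
P_load <- matrix(rnorm(n_sources * p, sd = 0.10), n_sources, p)
for (k in seq_len(n_sources)) {
  idx <- ((k - 1) * seg + 1):(k * seg)
  P_load[k, idx] <- rnorm(seg, sd = 1)
}

# 3. Noise-free signal X = TP.
X_signal <- T_scores %*% P_load

# 4. Gaussian noise scaled to make up a fraction `noise` of the total variance.
#    Assumption: signal_var is the variance over all entries of X_signal.
signal_var <- var(as.vector(X_signal))
noise_var  <- signal_var * noise / (1 - noise)
X <- X_signal + matrix(rnorm(n * p, sd = sqrt(noise_var)), n, p)

# 5. Split the columns into equal-width blocks named Block1, Block2, ...
width <- p / n_blocks
blocks <- lapply(seq_len(n_blocks), function(b) {
  X[, ((b - 1) * width + 1):(b * width), drop = FALSE]
})
names(blocks) <- paste0("Block", seq_len(n_blocks))
```

Because the score columns are orthonormal, crossprod(T_scores) is the identity matrix up to floating-point error, which gives a quick check that the latent sources are orthogonal.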

Value

A MultiBlock object with n_blocks blocks, each of size n × (p / n_blocks), named "Block1", "Block2", etc.

See Also

ComDim_PCA, MultiBlock

Examples

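library(R.ComDim)

# Simulate 100 samples x 200 variables in 2 blocks, with 4 latent sources and 5% noise
mb <- SimulateMultiBlock(n = 100, p = 200, n_sources = 4,
                         noise = 0.05, n_blocks = 2)
# Normalise the blocks, then fit ComDim-PCA with as many components as sources
mb <- NormalizeMultiBlock(mb, method = 'norm')
res <- ComDim_PCA(mb, ndim = 4)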

R.ComDim documentation built on May 13, 2026, 9:07 a.m.