View source: R/SimulateMultiBlock.R
| SimulateMultiBlock | R Documentation |
Generate a synthetic MultiBlock dataset built from a known number of
orthogonal latent sources plus Gaussian noise. Useful for benchmarking and
testing ComDim functions.
SimulateMultiBlock(
  n = 500L,
  p = 2000L,
  n_sources = 4L,
  noise = 0.05,
  n_blocks = 2L
)
n |
Number of samples. Default: 500L |
p |
Total number of variables (split evenly across blocks). Must be
divisible by |
n_sources |
Number of orthogonal latent sources. Default: 4L |
noise |
Fraction of total variance attributed to noise, in (0, 1).
Default: 0.05 |
n_blocks |
Number of blocks to split the variables into. Default: 2L |
The dataset is constructed as follows:
1. n_sources score vectors (T, of size n × n_sources) are drawn from a standard normal distribution and orthonormalised by QR decomposition.
2. Loading vectors (P, of size n_sources × p) are built so that each source loads primarily (SD = 1) on one equal-sized variable segment, with small cross-loadings (SD = 0.10) on the remaining variables.
3. The true signal X = TP is computed.
4. Gaussian noise is added such that noise_var = signal_var × noise / (1 − noise), so that noise accounts for the fraction `noise` of the total variance.
5. The p variables are split into n_blocks equal-width blocks, each assembled as a named element of the returned MultiBlock.
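The construction steps can be sketched in base R as follows. This is an illustrative reimplementation, not the package source: the seed, the variable names, the use of the overall variance of the signal matrix as signal_var, the assumption that p divides evenly by both n_sources and n_blocks, and the plain named list standing in for a MultiBlock object are all assumptions made for the example.

```r
# Base-R sketch of the five construction steps (not the package source).
set.seed(1)  # assumed seed, for reproducibility only
n <- 100; p <- 200; n_sources <- 4; noise <- 0.05; n_blocks <- 2

# Step 1: random scores, orthonormalised by QR (T is n x n_sources).
scores <- qr.Q(qr(matrix(rnorm(n * n_sources), n, n_sources)))

# Step 2: loadings (P is n_sources x p). Start with small cross-loadings
# (SD = 0.10) everywhere, then overwrite one segment per source (SD = 1).
loadings <- matrix(rnorm(n_sources * p, sd = 0.10), n_sources, p)
seg <- p / n_sources  # assumes p is divisible by n_sources
for (k in seq_len(n_sources)) {
  cols <- ((k - 1) * seg + 1):(k * seg)
  loadings[k, cols] <- rnorm(seg, sd = 1)
}

# Step 3: true signal X = T P.
signal <- scores %*% loadings

# Step 4: noise variance set so that noise is the requested fraction
# of the total (signal + noise) variance.
signal_var <- var(as.vector(signal))  # assumed definition of signal_var
noise_var <- signal_var * noise / (1 - noise)
X <- signal + matrix(rnorm(n * p, sd = sqrt(noise_var)), n, p)

# Step 5: split columns into equal-width named blocks.
# A plain list is used here as a stand-in for a MultiBlock object.
width <- p / n_blocks  # assumes p is divisible by n_blocks
block_id <- rep(seq_len(n_blocks), each = width)
blocks <- lapply(seq_len(n_blocks),
                 function(b) X[, block_id == b, drop = FALSE])
names(blocks) <- paste0("Block", seq_len(n_blocks))

# Sanity checks: orthonormal scores and the target noise fraction.
ortho_err <- max(abs(crossprod(scores) - diag(n_sources)))
noise_frac <- noise_var / (signal_var + noise_var)
```

Here ortho_err should be near machine precision, and noise_frac equals noise by construction, since signal_var × noise / (1 − noise) divided by the total variance simplifies to noise.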
A MultiBlock object with n_blocks blocks, each of size n × (p / n_blocks), named "Block1", "Block2", etc.
ComDim_PCA, MultiBlock
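# Simulate 100 samples and 200 variables from 4 latent sources, in 2 blocks
mb <- SimulateMultiBlock(n = 100, p = 200, n_sources = 4,
                         noise = 0.05, n_blocks = 2)
# Normalise the blocks before analysis
mb <- NormalizeMultiBlock(mb, method = 'norm')
# Fit ComDim-PCA with as many dimensions as simulated sources
res <- ComDim_PCA(mb, ndim = 4)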