simulate_block_data: Simulate correlated blocks of variables

View source: R/simulate_block.R

simulate_block_data    R Documentation

Simulate correlated blocks of variables

Description

simulate_block_data() creates a dataset made up of blocks of variables, where the variables within each block are correlated with one another. The correlation for each pair of variables within a block is sampled uniformly between lower_corr and upper_corr, and the values are then drawn from a multivariate normal distribution using MASS::mvrnorm().
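
The procedure described above can be sketched for a single block as follows. This is a minimal illustration under stated assumptions, not the package's actual source: the helper name sketch_one_block() is hypothetical, the means are assumed to be zero, and the variances are assumed to be one. Note that not every choice of bounds yields a positive definite matrix, in which case MASS::mvrnorm() stops with an error.

```r
# Hypothetical helper for illustration only; not part of the partition package.
# Assumes zero means and unit variances for every variable in the block.
sketch_one_block <- function(block_size, lower_corr, upper_corr, n) {
  # Start from the identity so each variable has variance 1
  sigma <- diag(block_size)

  # Draw one correlation per pair of variables, uniformly between the bounds
  n_pairs <- block_size * (block_size - 1) / 2
  sigma[upper.tri(sigma)] <- runif(n_pairs, lower_corr, upper_corr)

  # Mirror the upper triangle into the lower triangle to make sigma symmetric
  sigma[lower.tri(sigma)] <- t(sigma)[lower.tri(sigma)]

  # Draw n observations from a multivariate normal with this correlation matrix
  MASS::mvrnorm(n, mu = rep(0, block_size), Sigma = sigma)
}

set.seed(1)
block <- sketch_one_block(block_size = 5, lower_corr = .4, upper_corr = .6, n = 100)
dim(block)
round(cor(block), 2)
```

With bounds of .4 and .6, the sample correlations printed by cor() should scatter around that range, with the usual sampling noise for 100 observations.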

Usage

simulate_block_data(
  block_sizes,
  lower_corr,
  upper_corr,
  n,
  block_name = "block",
  sep = "_",
  var_name = "x"
)

Arguments

block_sizes

a vector of block sizes. The size of each block is the number of variables within it.

lower_corr

the lower bound of the correlation within each block

upper_corr

the upper bound of the correlation within each block

n

the number of observations or rows

block_name

description prepended to the variable to indicate the block it belongs to

sep

a character string used to separate the parts of each variable name

var_name

the name of the variable within the block
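
To show how block_name, sep, and var_name might combine into column names, here is a small sketch. The exact scheme is an assumption (block name plus block index, then the separator, then variable name plus variable index), and sketch_names() is a hypothetical helper, not a function from the package; check names() on real output to confirm.

```r
# Hypothetical helper; the naming pattern below is an assumption, not the
# package's confirmed behaviour.
sketch_names <- function(block_sizes, block_name = "block", sep = "_", var_name = "x") {
  unlist(lapply(seq_along(block_sizes), function(i) {
    # e.g. "block1" + "_" + "x1", "x2", ... for block i
    paste0(block_name, i, sep, var_name, seq_len(block_sizes[i]))
  }))
}

sketch_names(c(2, 3))
# [1] "block1_x1" "block1_x2" "block2_x1" "block2_x2" "block2_x3"
```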

Value

a tibble with sum(block_sizes) columns and n rows.

Examples

# create a 100 x 15 data set with 3 blocks
simulate_block_data(
  block_sizes = rep(5, 3),
  lower_corr = .4,
  upper_corr = .6,
  n = 100
)

USCbiostats/partition documentation built on Feb. 3, 2024, 3:38 a.m.