generateBatchDataLogPoisson: Generate batch data

View source: R/generateBatchDataLogPoisson.R

generateBatchDataLogPoisson    R Documentation

Generate batch data

Description

Generate data from K multivariate normal or multivariate t distributions with additional noise from batches. Assumes independence across columns. In each column, the parameters are randomly permuted for both the groups and the batches.

Usage

generateBatchDataLogPoisson(
  N,
  P,
  group_rates,
  batch_rates,
  group_weights,
  batch_weights,
  frac_known = 0.2,
  permute_variables = TRUE,
  scale_data = FALSE
)

Arguments

N

The number of items (rows) to generate.

P

The number of columns in the generated dataset.

group_rates

A vector of the group rates for the classes within a column.

batch_rates

A vector of the batch rates for the classes within a column. This is used to create a variable which is the sum of the appropriate batch and class rates; it might be better interpreted as the batch effect on the observed rate.

group_weights

Either a K x B matrix of the expected proportion of each batch in each group, or a K-vector of the expected proportion of the entire dataset in each group.

batch_weights

A vector of the expected proportion of N in each batch.

frac_known

The fraction of items with known labels (defaults to 0.2).

permute_variables

Logical indicating if group and batch means and standard deviations should be permuted in each column or not (defaults to TRUE).

scale_data

Logical indicating if data should be mean centred and standardised (defaults to FALSE).
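
The rate and weight arguments can be easier to read with a small base R sketch. The code below is an illustration only, not the package's implementation: it assumes that labels are drawn using the weights as sampling probabilities, that the observed rate is the sum of the group and batch rates (as described for batch_rates), and that counts are Poisson draws placed on a log(1 + x) scale, which is a guess based on the function name. All values and the names group_IDs, batch_IDs, observed_rate, counts and log_counts are hypothetical.

```r
set.seed(1)

N <- 200                            # number of items
K <- 2                              # number of groups
B <- 3                              # number of batches

group_rates   <- c(5, 20)           # one rate per group (length K)
batch_rates   <- c(1, 4, 8)         # one rate per batch (length B)
group_weights <- c(0.6, 0.4)        # K-vector form of group_weights
batch_weights <- c(0.5, 0.3, 0.2)   # expected proportion of N in each batch

# Assumed: labels are sampled with the weights as probabilities
group_IDs <- sample(seq_len(K), N, replace = TRUE, prob = group_weights)
batch_IDs <- sample(seq_len(B), N, replace = TRUE, prob = batch_weights)

# Sum of the appropriate group and batch rate for each item
observed_rate <- group_rates[group_IDs] + batch_rates[batch_IDs]

# Assumed: Poisson counts, then a log(1 + x) transform
counts     <- rpois(N, lambda = observed_rate)
log_counts <- log(1 + counts)
```

Under these assumptions, an item in group 2 and batch 3 has an observed rate of 20 + 8 = 28, which is why batch_rates reads most naturally as the batch effect on the observed rate.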

Value

A list of 5 objects: the data generated from the groups with and without batch effects, the label indicating the generating group, the batch label, and the vector indicating training versus test.
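
Examples

A minimal call might look like the sketch below. It assumes the batchmix package is installed, and the argument values are arbitrary choices for illustration (two groups, three batches). Because this page does not give the names of the returned list elements, the example inspects the result with str() instead of assuming any.

```r
library(batchmix)

set.seed(1)

# Illustrative inputs (hypothetical values): 2 groups, 3 batches
sim <- generateBatchDataLogPoisson(
  N = 100,
  P = 4,
  group_rates = c(5, 20),
  batch_rates = c(1, 4, 8),
  group_weights = c(0.6, 0.4),
  batch_weights = c(0.5, 0.3, 0.2),
  frac_known = 0.2,
  permute_variables = TRUE,
  scale_data = FALSE
)

# Inspect the top-level structure of the returned list
str(sim, max.level = 1)
```

Here group_weights is given in its K-vector form; a K x B matrix could be supplied instead when the group proportions should differ between batches.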


batchmix documentation built on May 29, 2024, 2:14 a.m.