simulateMPRA: Simulate an MPRA dataset


View source: R/simulateMPRA.R

Description

Simulate an MPRA dataset

Usage

simulateMPRA(
  tr = rep(2, 100),
  da = NULL,
  dna.noise.sd = 0.2,
  rna.noise.sd = 0.3,
  dna.inter = 5,
  dna.inter.sd = 0.5,
  nbc = 100,
  coef.bc.sd = 0.5,
  nbatch = 3,
  coef.batch.sd = 0.5
)

Arguments

tr

a vector of the true transcription rates, in log scale. The length of the vector determines the number of enhancers included in the dataset. The default is 100 enhancers, each with a transcription rate of 2.

da

a vector determining differential activity. Values are assumed to be in log scale, and are used in the model as log fold-change values. If NULL (default), a single condition is simulated.

dna.noise.sd

level of noise to add to the DNA library

rna.noise.sd

level of noise to add to the RNA library

dna.inter

the baseline DNA levels (intercept term), controlling the true mean abundance of plasmids

dna.inter.sd

the true variation of the plasmid levels

nbc

number of unique barcodes to include per enhancer

coef.bc.sd

true variation between barcodes

nbatch

number of batches to simulate

coef.batch.sd

the level of true variation that distinguishes batches (the size of the batch effects)

Details

The data is generated using the same nested-GLM construct that MPRAnalyze uses, but with non-standard log-normal noise models (by default, MPRAnalyze uses a Gamma-Poisson model). The generated data can have multiple batches and either 1 or 2 conditions, and it is always paired (DNA and RNA extracted from the same library). The user can control both the true and the observed (noise) variation levels, the expected number of plasmids per barcode, the true transcription rate, and the size of the batch and barcode effects.
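To make the nested log-normal construct more concrete, the following base R sketch simulates a single enhancer in a single condition. It is an illustration under stated assumptions, not the MPRAnalyze implementation: the function name `sketch_enhancer`, the additive combination of intercept, barcode, and batch effects, and the placement of the noise terms are all hypothetical. Only the argument names and default values are taken from the usage above.

```r
# Hypothetical sketch of a nested log-normal model; NOT the package's code.
set.seed(1)

sketch_enhancer <- function(tr = 2, nbc = 100, nbatch = 3,
                            dna.inter = 5, dna.inter.sd = 0.5,
                            coef.bc.sd = 0.5, coef.batch.sd = 0.5,
                            dna.noise.sd = 0.2, rna.noise.sd = 0.3) {
  # Enhancer-level intercept: true mean plasmid abundance (log scale)
  inter <- rnorm(1, mean = dna.inter, sd = dna.inter.sd)
  # True barcode and batch effects (log scale)
  bc.eff <- rnorm(nbc, mean = 0, sd = coef.bc.sd)
  batch.eff <- rnorm(nbatch, mean = 0, sd = coef.batch.sd)
  # True log DNA level, one row per batch and one column per barcode
  true.log.dna <- inter + outer(batch.eff, bc.eff, "+")
  # Nested step: true log RNA is true log DNA plus the log transcription rate
  true.log.rna <- true.log.dna + tr
  # Observed values: Gaussian noise on the log scale, then exponentiate
  noise <- function(sd) matrix(rnorm(nbatch * nbc, sd = sd), nbatch, nbc)
  list(dna = exp(true.log.dna + noise(dna.noise.sd)),
       rna = exp(true.log.rna + noise(rna.noise.sd)))
}

sim <- sketch_enhancer()
dim(sim$dna)                        # batches x barcodes
mean(log(sim$rna) - log(sim$dna))   # should be close to tr
```

In this sketch, a second condition could be modeled by adding the corresponding `da` value to `tr` on the log scale, which matches the description of `da` as a log fold-change.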

Value

a list:

Examples

# single condition
data <- simulateMPRA()
# two conditions
data <- simulateMPRA(da=c(rep(-0.5, 50), rep(0.5, 50)))
# more observed noise
data <- simulateMPRA(dna.noise.sd = 0.75, rna.noise.sd = 0.75)
# gradually increasing transcription rates (101 enhancers, from 2 to 3)
data <- simulateMPRA(tr = seq(2,3,0.01), da=NULL)
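Because the elements of the returned list are not enumerated above, a quick way to see them is to inspect the result of a small run. The snippet below is a minimal sketch that assumes the MPRAnalyze package is installed and is skipped otherwise. The variable name `sim.small` is arbitrary, and the assumption that `set.seed()` makes the output reproducible holds only if the simulation relies on R's standard random number generator.

```r
# Assumes the MPRAnalyze package is installed; the block is skipped otherwise.
if (requireNamespace("MPRAnalyze", quietly = TRUE)) {
  set.seed(42)  # assumed to make the simulated values reproducible
  # a small dataset: 10 enhancers, 20 barcodes each, 2 batches
  sim.small <- MPRAnalyze::simulateMPRA(tr = rep(2, 10), nbc = 20, nbatch = 2)
  # show the top-level elements of the returned list without assuming their names
  str(sim.small, max.level = 1)
}
```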

YosefLab/MPRAnalyze documentation built on Nov. 14, 2020, 2:35 a.m.