simulate_read_count_multi: Simulate read count (multi-SNP model)
In liangyy/mixqtl: Combine total and allele-specific counts for cis-QTL analysis

Description Usage Arguments Value About sampling Y^h Examples

Given gene configuration, genotype (mutliple SNP), range of variation due to genetic effect, and other parameters, simulate read count. Essentially, the function first samples the true effect according to the number of causal variants and variation due to genetic effect. And then, it follows the sampling scheme mostly resembles the single-SNP scenario.

1	simulate_read_count_multi(gene, genotype, betas, L_read, y_dist)

`gene`	gene instance generated from `create_gene`
`genotype`	genotype generated from `create_genotype` (the 'genotype' object in the list)
`betas`	log(aFC) vector
`L_read`	length of read
`y_dist`	distribution of Y^h \| library size, relative abundance. It specifies the shape of the distribution whereas the mean is given by library size and relative abundance.

read counts, where observed count include y1, y2 (AS count of each haplotypes) and ystar (total count - y1 - y2) along with library size Ti_lib. Also, it includes unobserved count as 'hidden' (y1star and y2star) and the true effect size vector.

To specify read count of each haplotype (Y^h) given library size and relative abundance. It is set in y_dist. Currently it supports three distribution types: 'poisson', 'lognormal', and 'negbinom'. Specifically, y_dist is a list with distribution name in 'type'. For y_dist$type = 'poisson', no other parameter is required. For y_dist$type = 'lognormal', 'sigma' needs to specify, and Y^h = round(library_size * relative_abundance * rlnorm(1, 0, y_dist$sigma)) For y_dist$type = 'negbinom', 'prob' and 'size_factor' need to specify, and Y^h = rnbinom(1, size = y_dist$size_factor * library_size * relative_abundance, prob = y_dist$prob)

gene = create_gene()
genotype = list(
  h1 = matrix(
    sample(c(0, 1), 1000, replace = TRUE),
    ncol = 100,
    nrow = 10
  ),
  h2 = matrix(
    sample(c(0, 1), 1000, replace = TRUE),
    ncol = 100,
    nrow = 10
  )
)
betas = create_betas(
  maf = colMeans(genotype$h1 + genotype$h2) / 2,
  genetic_var = c(0.015, 0.075),
  ncausal = c(1, 3)
)
simulate_read_count_multi(
  gene = gene,
  genotype = genotype,
  betas = betas,
  L_read = 75,
  y_dist = list(
    type = 'negbinom',
    prob = 2/3,
    size_factor = 2
  )
)