simulate_read_count_multi: Simulate read count (multi-SNP model)

Description Usage Arguments Value About sampling Y^h Examples

View source: R/simulation_multi.R

Description

Given gene configuration, genotype (mutliple SNP), range of variation due to genetic effect, and other parameters, simulate read count. Essentially, the function first samples the true effect according to the number of causal variants and variation due to genetic effect. And then, it follows the sampling scheme mostly resembles the single-SNP scenario.

Usage

1
simulate_read_count_multi(gene, genotype, betas, L_read, y_dist)

Arguments

gene

gene instance generated from create_gene

genotype

genotype generated from create_genotype (the 'genotype' object in the list)

betas

log(aFC) vector

L_read

length of read

y_dist

distribution of Y^h | library size, relative abundance. It specifies the shape of the distribution whereas the mean is given by library size and relative abundance.

Value

read counts, where observed count include y1, y2 (AS count of each haplotypes) and ystar (total count - y1 - y2) along with library size Ti_lib. Also, it includes unobserved count as 'hidden' (y1star and y2star) and the true effect size vector.

About sampling Y^h

To specify read count of each haplotype (Y^h) given library size and relative abundance. It is set in y_dist. Currently it supports three distribution types: 'poisson', 'lognormal', and 'negbinom'. Specifically, y_dist is a list with distribution name in 'type'. For y_dist$type = 'poisson', no other parameter is required. For y_dist$type = 'lognormal', 'sigma' needs to specify, and Y^h = round(library_size * relative_abundance * rlnorm(1, 0, y_dist$sigma)) For y_dist$type = 'negbinom', 'prob' and 'size_factor' need to specify, and Y^h = rnbinom(1, size = y_dist$size_factor * library_size * relative_abundance, prob = y_dist$prob)

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
gene = create_gene()
genotype = list(
  h1 = matrix(
    sample(c(0, 1), 1000, replace = TRUE),
    ncol = 100,
    nrow = 10
  ),
  h2 = matrix(
    sample(c(0, 1), 1000, replace = TRUE),
    ncol = 100,
    nrow = 10
  )
)
betas = create_betas(
  maf = colMeans(genotype$h1 + genotype$h2) / 2,
  genetic_var = c(0.015, 0.075),
  ncausal = c(1, 3)
)
simulate_read_count_multi(
  gene = gene,
  genotype = genotype,
  betas = betas,
  L_read = 75,
  y_dist = list(
    type = 'negbinom',
    prob = 2/3,
    size_factor = 2
  )
)

liangyy/mixqtl documentation built on Sept. 17, 2020, 11:36 a.m.