This function simulates an MPRA dataset with specified input distribution and total depth across tags/barcodes.
inputProp |
A numeric vector giving the input DNA count distribution, i.e. the relative abundance of reads across tags. It should contain 2*ntag*nsim proportions, one for each tag. |
ntag |
An integer indicating the number of tags/barcodes for each oligonucleotide (oligo) or each allele. |
nsim |
An integer indicating the number of simulations or number of SNPs included in the dataset. |
nrepIn |
An integer indicating the number of replicates for the DNA input. |
nrepOut |
An integer indicating the number of replicates for the RNA output. |
slope |
A numeric vector of length 2* |
inputDispFunc |
Optional parameter that provides a dispersion function estimated for the input replicates using DESeq2. |
outputDispFunc |
Optional parameter that provides a dispersion function estimated for the output replicates using DESeq2. |
sampleDepth |
An integer vector specifying the total read depth over all tags. It could be of length 1 or length nrepIn+nrepOut. If it is of length 1, the same total depth is used for all DNA and RNA replicates. If this is specified, values for |
inputDispParam |
This parameter is required if inputDispFunc is not provided. It should give three parameter estimates for the dispersion function of the DNA input replicates. The three parameters are a0, a1, and d2: the dispersion parameter follows a lognormal distribution with mean log(a0+a1/mu) and sd d2, where mu is the mean DNA count across the replicates. |
outputDispParam |
This parameter is required if outputDispFunc is not provided. It should give three parameter estimates for the dispersion function of the RNA output replicates. The three parameters are a0, a1, and d2: the dispersion parameter follows a lognormal distribution with mean log(a0+a1/mu) and sd d2, where mu is the mean RNA count across the replicates. |
meanDepth |
An integer vector specifying the mean read depth over all tags. It could be of length 1 or length nrepIn+nrepOut. If it is of length 1, the same mean depth is used for all DNA and RNA replicates. |
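The lognormal description given for inputDispParam and outputDispParam can be illustrated with base R. This is a minimal sketch, not the package's implementation. The values of a0, a1, d2 and mu are made up for illustration. It assumes that "mean" and "sd" refer to the log scale (the meanlog and sdlog arguments of rlnorm), and that counts are then drawn from a negative binomial with a DESeq2-style dispersion, where variance = mu + dispersion * mu^2.

```r
set.seed(1)

# Hypothetical parameter values, chosen only for illustration
a0 <- 0.01
a1 <- 2
d2 <- 0.3
mu <- c(50, 200, 1000)  # hypothetical mean counts for three tags

# Assumption: "mean" and "sd" are on the log scale (meanlog / sdlog)
disp <- rlnorm(length(mu), meanlog = log(a0 + a1 / mu), sdlog = d2)

# Assumption: counts are negative binomial with DESeq2-style dispersion,
# i.e. variance = mu + disp * mu^2, which corresponds to size = 1 / disp
counts <- rnbinom(length(mu), mu = mu, size = 1 / disp)

print(disp)
print(counts)
```

Under this form the median dispersion is a0 + a1/mu, so it shrinks toward a0 as the mean count grows, and d2 controls how widely individual tags scatter around that trend.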
datt |
A simulated data frame with ntag*nsim*2 rows and 2+nrep*2 columns. The first two columns are the allele and SNP name for each tag. The remaining columns are the generated DNA and RNA counts for the nrep replicates. |
Dandi Qiao
Qiao, D., Zigler, C., Cho, M.H., Silverman, E.K., Zhou, X., et al. (2018). Statistical considerations for the analysis of massively parallel reporter assays data.
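data(GSE70531_params)
# Dispersion functions for the DNA input and RNA output replicates
inputDispFunc <- getParam[[1]]
outputDispFunc <- getParam[[2]]
# Total read depth over all tags, shared by every replicate
totalDepth <- 200000
ntag <- 10     # tags per allele
nsim <- 10     # simulated SNPs
nrepIn <- 5    # DNA input replicates
nrepOut <- 5   # RNA output replicates
# One input proportion per tag (2 alleles x ntag x nsim)
inputProp <- getParam[[3]](runif(ntag*nsim*2))
# One slope per allele, repeated so that all tags of an allele share it
slopel <- getParam[[4]](runif(nsim*2))
slope <- rep(slopel, each=ntag)
datt <- sim_fixDepth(inputProp, ntag, nsim, nrepIn, nrepOut, slope, inputDispFunc=inputDispFunc, outputDispFunc=outputDispFunc, sampleDepth=totalDepth)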
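Before calling sim_fixDepth it can help to check that the argument lengths match what the argument descriptions require. The following is a rough, self-contained sketch in base R and does not call the package. The proportions are drawn from runif purely as a stand-in, and the step that turns proportions and depth into expected counts (proportion times replicate depth) is an assumption about how the two arguments relate, not a statement of what the function does internally.

```r
ntag <- 10
nsim <- 10
nrepIn <- 5
nrepOut <- 5

# Hypothetical proportions, used here only as a stand-in for real estimates
set.seed(42)
inputProp <- runif(2 * ntag * nsim)
inputProp <- inputProp / sum(inputProp)  # normalise so they sum to 1

sampleDepth <- 200000

# Length checks taken from the argument descriptions
stopifnot(length(inputProp) == 2 * ntag * nsim)
stopifnot(length(sampleDepth) %in% c(1, nrepIn + nrepOut))

# A single depth is reused for every DNA and RNA replicate
depthPerRep <- rep(sampleDepth, length.out = nrepIn + nrepOut)

# Assumption: expected count of a tag = its proportion * replicate depth
expectedCounts <- outer(inputProp, depthPerRep)
print(dim(expectedCounts))
```

Each column of expectedCounts corresponds to one replicate and sums to that replicate's total depth, which is a quick way to confirm that the proportions were normalised.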