analyzeMPRA: Analyze MPRA data using specified tests


View source: R/analyzeMPRA.R

Description

This function pre-processes the given MPRA data, and analyzes it using specified tests.

Usage

analyzeMPRA(datt, nrepIn, rnaCol, nrepOut, nsim, ntag, method = c("MW", "Matching", "Adaptive", "Fisher", "QuASAR", "T-test", "mpralm", "edgeR", "DESeq2"), cutoff = -1, cutoffo = -1, matched = FALSE)

Arguments

datt

A data frame containing the MPRA dataset. It should have nsim*ntag*2 rows and 2+nrepIn+nrepOut columns. The first column should be named 'allele', and the second column should be named 'simN'. The 'allele' column should contain only two possible values, 'Ref' and 'Mut', which refer to the two allele versions of each SNP.
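
The layout described above can be sketched with a toy data frame built in base R. This is only an illustration under stated assumptions: the replicate column names (DNA1, RNA1, ...), the row ordering, and the use of a SNP index in 'simN' are placeholders that the documentation does not specify. The DNA-before-RNA column order follows the rnaCol value used in the Examples section.

```r
# Toy dimensions (hypothetical values chosen for illustration)
nsim    <- 3   # number of SNPs/comparisons
ntag    <- 4   # tags per allele
nrepIn  <- 2   # DNA replicates
nrepOut <- 2   # RNA replicates

set.seed(1)
n_rows <- nsim * ntag * 2

# Assumed row order: for each SNP, ntag 'Ref' rows followed by ntag 'Mut' rows.
# Assumed content of simN: an index identifying the SNP/comparison.
toy <- data.frame(
  allele = rep(rep(c("Ref", "Mut"), each = ntag), times = nsim),
  simN   = rep(seq_len(nsim), each = 2 * ntag)
)

# Count columns; the names DNA1, RNA1, ... are placeholders
for (i in seq_len(nrepIn))  toy[[paste0("DNA", i)]] <- rpois(n_rows, lambda = 50)
for (i in seq_len(nrepOut)) toy[[paste0("RNA", i)]] <- rpois(n_rows, lambda = 80)

# Checks mirroring the documented requirements
stopifnot(
  nrow(toy) == nsim * ntag * 2,
  ncol(toy) == 2 + nrepIn + nrepOut,
  identical(names(toy)[1:2], c("allele", "simN")),
  all(toy$allele %in% c("Ref", "Mut"))
)

# First RNA column when the DNA replicate columns come first
rnaCol <- 2 + nrepIn + 1
```

Running checks like these before calling analyzeMPRA may catch a mis-shaped input early.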

nrepIn

An integer indicating the number of DNA replicates.

rnaCol

An integer indicating the starting column of the RNA replicates in datt.

nrepOut

An integer indicating the number of RNA replicates.

nsim

An integer indicating the number of SNPs/comparisons in the MPRA data. A comparison refers to the unit with two alleles that is tested for allele-specific expression.

ntag

An integer indicating the number of tags/barcodes for each allele.

method

A vector of characters specifying the tests to be used. The possible options are: MW, Matching, Adaptive, Fisher, QuASAR, T-test, mpralm, edgeR and DESeq2.

cutoff

A numeric or integer value. Tags with DNA count less than or equal to cutoff in any of the DNA replicates will be removed.

cutoffo

A numeric or integer value. Tags with RNA count less than or equal to cutoffo in any of the RNA replicates will be removed.
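
The two cutoff rules can be illustrated with a small base R sketch. The helper filter_tags and the column names are hypothetical and are not part of the package; the sketch only mirrors the documented rule that a tag is dropped when its count is less than or equal to the cutoff in any replicate. It does not address how the package treats the paired tag on the other allele.

```r
# Toy table: two DNA and two RNA replicate columns (names are placeholders)
counts <- data.frame(
  DNA1 = c(10, 0, 5, 7),
  DNA2 = c(12, 3, 0, 9),
  RNA1 = c(20, 8, 6, 0),
  RNA2 = c(25, 9, 4, 11)
)
dna_cols <- c("DNA1", "DNA2")
rna_cols <- c("RNA1", "RNA2")

# Hypothetical helper, not part of the package
filter_tags <- function(x, dna_cols, rna_cols, cutoff = -1, cutoffo = -1) {
  # TRUE for rows where any DNA replicate is at or below the DNA cutoff
  low_dna <- rowSums(x[, dna_cols, drop = FALSE] <= cutoff) > 0
  # TRUE for rows where any RNA replicate is at or below the RNA cutoff
  low_rna <- rowSums(x[, rna_cols, drop = FALSE] <= cutoffo) > 0
  x[!(low_dna | low_rna), , drop = FALSE]
}

# With the default of -1, no non-negative count can fail the test
kept_default <- filter_tags(counts, dna_cols, rna_cols)

# With cutoffs of 0, any tag with a zero count in any replicate is dropped
kept_zero <- filter_tags(counts, dna_cols, rna_cols, cutoff = 0, cutoffo = 0)
```

Because counts are non-negative, the default value of -1 effectively disables filtering, while a value of 0 removes tags with any zero count.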

matched

A logical value indicating whether the DNA samples are matched to the RNA samples, in the same order, in the data frame datt.

Details

This function first normalizes the replicates using the maximum depth across all replicates, and filters the tags according to the specified cutoffs. Then it analyzes the processed MPRA data using the specified tests.
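
One way to read "normalizes the replicates using the maximum depth" is to rescale every replicate so that its total count equals that of the deepest replicate. The sketch below shows that reading in base R. It is an assumption, not the package's confirmed implementation: whether DNA and RNA replicates are pooled when taking the maximum, and whether the result is rounded, is not stated here.

```r
# Toy counts: columns are replicates (names are placeholders)
counts <- data.frame(
  DNA1 = c(100, 200, 300),
  DNA2 = c(50, 100, 150),
  RNA1 = c(400, 400, 400)
)

# Sequencing depth of each replicate (column total)
depth <- colSums(counts)

# Factor that brings each replicate up to the deepest one
scale_factor <- max(depth) / depth

# Multiply each column by its own factor
normalized <- sweep(as.matrix(counts), 2, scale_factor, "*")
```

After this step every column of normalized has the same total, so counts are comparable across replicates before the cutoffs are applied.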

Value

results

The p-values for all SNPs from each of the specified tests, computed on the given dataset.

Author(s)

Dandi Qiao

References

Qiao, D., Zigler, C., Cho, M.H., Silverman, E.K., Zhou, X., et al. (2018). Statistical considerations for the analysis of massively parallel reporter assays data.

See Also

atMPRA

Examples

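library(atMPRA)
# Simulation settings: 10 SNPs, 10 tags per allele, 5 replicates each of DNA and RNA
nsim = 10
ntag = 10
nrep = 5
# First ntag*nsim entries are 1, the remaining ntag*nsim entries are 1.5
slope = c(rep(1, ntag*nsim), rep(1.5, ntag*nsim))
data(datMean)
simData = sim_fixTotalD(datMean=datMean, totalDepth=200000, sigma2_DNA_a0=0.001, sigma2_DNA_a1=0.23, sigma2_RNA_a0=0.18, sigma2_RNA_a1=35, ntag=ntag, nsim=nsim, nrep=nrep, slope=slope)
# RNA replicates start after the 2 label columns and the nrep DNA columns
results = analyzeMPRA(simData, nrepIn=nrep, rnaCol=2+nrep+1, nrepOut=nrep, nsim=nsim, ntag=ntag, method=c("MW", "mpralm", "edgeR", "DESeq2"), cutoff=0, cutoffo=0)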

redaq/atMPRA documentation built on July 24, 2020, 2:40 a.m.