MBASEDVectorizedMetaprop: Vectorized wrapper around metaprop() function from R package...

Description Usage Arguments Details Value Examples

Description

Vectorized wrapper around metaprop() function from R package "meta" with some modifications and extensions to beta-binomial count models.

Usage

1
2
MBASEDVectorizedMetaprop(countsMat, totalsMat, probsMat, rhosMat,
  alternative = "two.sided", checkArgs = FALSE)

Arguments

countsMat

matrix of observed major allele counts. Each row represents a specific genomic locus, while each column represents a set of observed major allele counts across loci (in practice, multiple columns represent different outcomes of count simulations).

totalsMat

matrix of total read counts across both alleles. The interpretation of rows and columns is the same as for countsMat.

probsMat

matrix of probabilities of success (means of beta distributions in case of beta-binomial extensions). The interpretation of rows and columns is the same as for countsMat.

rhosMat

matrix of dispersion parameters of beta distribution in case of beta-binomial counts. The interpretation of rows and columns is the same as for countsMat.

alternative

one of 'two.sided', 'greater', 'less'. DEFAULT: 'two.sided'.

checkArgs

single boolean specifying whether arguments should be checked for adherence to specifications. DEFAULT: FALSE

Details

MBASEDVectorizedMetaprop performs computations similar to metaprop() with default options (fixed-effects only), but with less overhang, in a vectorized fashion, and accomodating extensions to beta-binomial distribution. It also allows the input counts to come from loci with different underlying binomial probabilities (means of beta distribution, in cases of beta-binomial extensions). One technical difference is the way the value of 'n' is calculated for Freeman-Tukey back-transformation of average 'z' into a proportion. While metaprop() uses harmonic mean of n's at individual loci (which puts more weight toward loci with small read counts), we use the weighted mean of n's with weights proportional to n's, by analogy with how the value of average 'z' is calculated from individual z's. Input matrices countsMat and totalsMat have one column for each set of loci ('independent studies') to be combined, with each row corresponding to an individual locus. Matrix probsMat provides the underlying binomial probabilities (means of beta distributions, in case of beta-binomial extensions), while matrix rhosMat gives the values of the dispersion at the loci. MBASEDVectorizedMetaprop uses meta analysis approach with Freeman-Tukey transformation to report for each set of loci (each column) its estimated overall proportion on both transformed and untransformed scale, corresponding standard errors (on transformed scale), z-values (based on expected value of 2*asin(sqrt(0.5)) on transformed scale under the null hypothesis of common underlying proportion (binomial probability or mean of beta distribution) of 0.5), and corresponding p-values based on imposing normal distribution assumption on z-values, where alternative hypothesis of 'two.sided', 'greater', and 'less' can be specified, with the latter two specified w.r.t. 2*asin(sqrt(0.5)). If some of the supplied entries in probsMat are different from 0.5, then the corresponding transformed proportions are shifted, so that the new mean for the resulting z-values is still approximately 2*asin(sqrt(0.5)). Extensions to beta-binomial counts are accomodated by increasing the variance of each individual z from 1/(n+0.5) to rho+1/(n+0.5), where rho is the dispersion parameter of the beta distribution. The function is used to cacluate p-values in ASE settings, where countsMat represents major allele counts, totalsMat represents total allele counts, probsMat represents the underlying binomial probabilities of observing major allele-supporting read (means of beta distributions in case of beta-binomial extensions), which may be different, e.g., for major allele counts coming from reference and alternative alleles in case of pre-existing allelic bias, and rhosMat provides values of dispersion parameter for beta-binomial counts (0, in case of binomial model) for individual loci.

Value

a list with 7 elements:

hetPVal

a 1-row marix of heterogeneity p-values.

hetQ

a 1-row matrix of heterogeneity statistics.

TEFinal

a 1-row matrix of estimated proportions on transformed scale.

seTEFinal

a 1-row matrix of estimated SEs of proportion estimates on transformed scale.

propFinal

a 1-row matrix of estimated proportions on 0-1 scale.

pValue

a 1-row matrix of corresponding p-values.

propLoci

a matrix of same dimension as original input matrices giving estimated proportions on transformed scale at each individual locus.

Examples

1
2
3
4
5
SNVCoverage=rep(sample(10:100,5),2) ## 2 genes with 5 loci each
SNVAllele1Counts=rbinom(length(SNVCoverage), SNVCoverage, 0.5)
SNVMajorAlleleCounts=pmax(SNVAllele1Counts, SNVCoverage-SNVAllele1Counts)
MBASED:::MBASEDVectorizedMetaprop(countsMat=matrix(SNVAllele1Counts, ncol=2), totalsMat=matrix(SNVCoverage, ncol=2), probsMat=matrix(rep(0.5, length(SNVCoverage)), ncol=2), rhosMat=matrix(rep(0, length(SNVCoverage)), ncol=2), alternative='two.sided') ## ideal situation when phasing is known
MBASED:::MBASEDVectorizedMetaprop(countsMat=matrix(SNVMajorAlleleCounts, ncol=2), totalsMat=matrix(SNVCoverage, ncol=2), probsMat=matrix(rep(0.5, length(SNVCoverage)), ncol=2), rhosMat=matrix(rep(0, length(SNVCoverage)), ncol=2), alternative='two.sided') ## what happens if we put all major alleles together into a single haplotype and obtain nominal p-value

MBASED documentation built on Nov. 8, 2020, 5:53 p.m.