runMBASED2s1aseID: Function that runs between-sample (differential) ASE calling...

Description Usage Arguments Value Examples

Description

Function that runs between-sample (differential) ASE calling using data from loci (SNVs) within a single unit of ASE (gene). The i-th entry of each of vector arguments 'lociAllele1CountsSample1', 'lociAllele2CountsSample1', 'lociAllele1NoASEProbsSample1', 'lociRhosSample1', 'lociAllele1CountsSample2', 'lociAllele2CountsSample2', 'lociAllele1NoASEProbsSample2', and 'lociRhosSample2' should correspond to the i-th locus. If argument 'isPhased' (see below) is true, then entries corresponding to allele1 at each locus must represent the same haplotype. Note: for each locus in each sample, at least one allele should have >0 supporting reads.

Usage

1
2
3
4
5
runMBASED2s1aseID(lociAllele1CountsSample1, lociAllele2CountsSample1,
  lociAllele1CountsSample2, lociAllele2CountsSample2,
  lociAllele1NoASEProbsSample1, lociAllele1NoASEProbsSample2, lociRhosSample1,
  lociRhosSample2, numSim = 0, isPhased = FALSE, tieBreakRandom = FALSE,
  checkArgs = FALSE)

Arguments

lociAllele1CountsSample1,lociAllele2CountsSample1,lociAllele1CountsSample2,lociAllele2CountsSample2

vectors of counts of allele1 (e.g. reference) and allele2 (e.g. alternative) at individiual loci in sample1 and sample2. Allele counts are not necessarily phased (see argument 'isPhased'), so allele1 counts may not represent the same haplotype. However, the two alleles (allele1 and allele2) must be defined identically for both samples at each locus. All 4 arguments must be vectors of non-negative integers.

lociAllele1NoASEProbsSample1,lociAllele1NoASEProbsSample2

probabilities of observing allele1-supporting reads at individual loci under conditions of no ASE (e.g., vector with all entries set to 0.5, if there is no pre-existing allelic bias at any locus) in sample1 and sample2, respectively. Note that these probabilities are allowed to be sample-specific. Each argument must be a vector with entries >0 and <1.

lociRhosSample1,lociRhosSample2

dispersion parameters of beta distribution at individual loci (set to 0 if the read count-generating distribution at the locus is binomial). Note that the dispersions are allowed to be sample-specific. Each argument must be a vector with entries >=0 and <1.

numSim

number of simulations to perform. Must be a non-negative integer. If 0 (DEFAULT), no simulations are performed.

isPhased

single boolean specifying whether the phasing has already been performed, in which case the lociAllele1CountsSample1 (and, therefore, lociAllele1CountsSample2) represent the same haplotype. DEFAULT: FALSE.

tieBreakRandom

single boolean specifying how ties should be broken during pseudo-phasing in cases of unphased data (isPhased=FALSE). If TRUE, each of the two allele will be assigned to major haplotype with probability=0.5. If FALSE (DEFAULT), allele1 will be assigned to major haplotype and allele2 to minor haplotype.

checkArgs

single boolean specifying whether arguments should be checked for adherence to specifications. DEFAULT: FALSE

Value

list with 7 elements

majorAlleleFrequencyDifference

Estimate of major allele frequency difference for this unit of ASE (gene). 'Major' here refers to the allelic imbalance within sample1, and the difference is defined as Frequency(major, sample1)-Frequency(major, sample2).

pValueASE

Estimate of p-value for observed extent of ASE (nominal if no simulations are performed, simulations-based otherwise).

heterogeneityQ

Statistic summarizing variability of locus-specific estimates of major allele frequency difference if >1 locus is present. Set to NA for single-locus cases.

pValueHeterogeneity

Estimate of p-value for observed extent of variability of locus-specific estimates of major allele frequency difference if >1 locus is present. Set to NA for single-locus cases.

lociAllele1IsMajor

Vector of booleans, specifying for each locus whether allele1 is assigned to major (TRUE) or minor (FALSE) haplotype (where 'major' and 'minor' refer to abundances in sample1). If the data is phased (isPhased=TRUE), then all elements of the vector are TRUE if haplotype 1 is found to be major in sample1, and are all FALSE if haplotype 1 is found to be minor. In cases of unphased data (isPhased=FALSE), the assignment is provided by the pseudo-phasing procedure within sample1.

nullHypothesisMAF

Estimate of major allele frequency under the null hypothesis that allelic frequencies are the same in both samples. This estimate is obtained by maximum likelihood, and, in case of unphased data (isPhased=FALSE), the likelihood is further maximized over all possible assignments of alleles to haplotypes.

lociMAFDifference

Estimate of the difference of major allele (haplotype) frequency at individual loci. Note that 'major' and 'minor' distinction is made at the level of gene haplotype in sample1.

Examples

1
2
3
4
5
SNVCoverageTumor=sample(10:100, 5) ## gene with 5 loci
SNVCoverageNormal=sample(10:100, 5)
SNVAllele1CountsTumor=rbinom(length(SNVCoverageTumor), SNVCoverageTumor, 0.5)
SNVAllele1CountsNormal=rbinom(length(SNVCoverageNormal), SNVCoverageNormal, 0.5)
MBASED:::runMBASED2s1aseID(lociAllele1CountsSample1=SNVAllele1CountsTumor, lociAllele2CountsSample1=SNVCoverageTumor-SNVAllele1CountsTumor, lociAllele1CountsSample2=SNVAllele1CountsNormal, lociAllele2CountsSample2=SNVCoverageNormal-SNVAllele1CountsNormal, lociAllele1NoASEProbsSample1=rep(0.5, length(SNVCoverageTumor)), lociAllele1NoASEProbsSample2=rep(0.5, length(SNVCoverageNormal)), lociRhosSample1=rep(0, length(SNVCoverageTumor)), lociRhosSample2=rep(0, length(SNVCoverageNormal)), numSim=10^6, isPhased=FALSE)

MBASED documentation built on Nov. 8, 2020, 5:53 p.m.