negbinomsig: assess significance of sliding-window read counts
In girafe: Genome Intervals and Read Alignments for Functional Exploration


This function assesses the significance of sliding-window read counts. The background distribution of read counts in windows is assumed to be Negative-Binomial (NB). The two parameters of the NB distribution, mean ‘mu’ and dispersion ‘size’, are estimated using one of the methods described below (see Details). The estimated NB distribution is then used to assign a p-value to each window based on the number of aligned reads in that window. The p-values can be corrected for multiple testing using any of the correction methods implemented in p.adjust.
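To make the p-value step concrete, here is a minimal base-R sketch. The counts and the parameter values `mu` and `size` are made up for illustration, and using the upper tail P(X >= k) as the p-value is an assumption; the text above does not state the exact tail definition used by the function.

```r
# Hypothetical read counts per window and hypothetical, already-estimated
# parameters of the NB background distribution.
counts <- c(0, 1, 0, 2, 7, 0, 1, 15)
mu     <- 0.8   # assumed background mean
size   <- 1.5   # assumed dispersion parameter

# Upper-tail probability P(X >= k) under NB(mu, size).
# Assumption: this is one plausible p-value definition, not confirmed by the source.
p.raw <- pnbinom(counts - 1, size = size, mu = mu, lower.tail = FALSE)

# Multiple-testing correction with any method known to p.adjust.
p.adj <- p.adjust(p.raw, method = "bonferroni")

data.frame(counts, p.raw, p.adj)
```

Windows with many reads receive small p-values, and the adjusted values are never smaller than the raw ones.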

addNBSignificance(x, estimate="NB.012", correct="none", max.n=10L)

`x`	A `data.frame` of class `slidingWindowSummary`, as returned by the function `perWindow`.
`estimate`	string; which method to use to estimate the parameters of the NB background distribution; see below for details
`correct`	string; which method to use for p-value adjustment; can be any method that is implemented for `p.adjust` including “none” if no correction is desired.
`max.n`	integer; only relevant if `estimate=="NB.ML"`; in that case specifies that windows with up to this number of aligned reads should be considered for estimating the background distribution.

The two parameters of the Negative-Binomial (NB) distribution are: mean ‘lambda’ (or ‘mu’) and size ‘r’ (or ‘size’).

The function supports several methods to estimate the parameters of the NB distribution:

“NB.012”

Only the windows with 0, 1, or 2 aligned reads are used for estimating lambda and ‘r’. From the probability mass function g(k)=P(X=k) of the NB distribution, the ratios of successive probabilities are

q_1 = g(1)/g(0) = lambda r/(lambda+r)

and

q_2 = g(2)/g(1) = lambda (r+1)/(2 (lambda+r)).

The observed numbers n_0, n_1, and n_2 of windows with 0, 1, and 2 aligned reads are used to estimate

q_1 = n_1/n_0

and

q_2 = n_2/n_1

and from these estimates, one can obtain estimates for 'lambda' and 'r'.
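The last step can be sketched in base R. Writing p = lambda/(lambda+r), the two ratios become q_1 = r p and q_2 = (r+1) p/2, so p = 2 q_2 - q_1, r = q_1/p, and lambda = q_1/(1-p). The simulated counts and variable names below are illustrative assumptions; the source does not show how the package itself solves these equations.

```r
set.seed(1)
# Simulated background counts; the true parameters (size = 2, mu = 0.5)
# are known here only so that the estimates can be checked.
counts <- rnbinom(1e6, size = 2, mu = 0.5)

# Numbers of windows with exactly 0, 1, and 2 aligned reads.
n  <- tabulate(counts + 1L, nbins = 3)   # n[1] = n_0, n[2] = n_1, n[3] = n_2
q1 <- n[2] / n[1]
q2 <- n[3] / n[2]

# Solve q1 = lambda*r/(lambda+r) and q2 = lambda*(r+1)/(2*(lambda+r)).
p      <- 2 * q2 - q1        # equals lambda/(lambda+r)
r      <- q1 / p
lambda <- q1 / (1 - p)

c(lambda = lambda, r = r)
```

Because only three counts enter the estimate, it can be unstable when few windows have 1 or 2 reads, or when 2 q_2 - q_1 is close to zero.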

“NB.ML”

This estimation method uses the function fitdistr from package ‘MASS’. Windows with up to max.n aligned reads are considered for this estimate.
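A rough sketch of such a maximum-likelihood fit, using simulated counts, might look as follows. How the package calls fitdistr internally is not shown in the source, so the filtering step and the call below are assumptions.

```r
library(MASS)  # provides fitdistr

set.seed(1)
# Simulated window counts (hypothetical data, true size = 2 and mu = 0.5).
counts <- rnbinom(20000, size = 2, mu = 0.5)
max.n  <- 10L

# Keep only windows with at most max.n aligned reads,
# then fit a Negative-Binomial distribution by maximum likelihood.
bg  <- counts[counts <= max.n]
fit <- suppressWarnings(fitdistr(bg, densfun = "negative binomial"))

fit$estimate   # named vector with elements 'size' and 'mu'
```

Restricting the fit to low-count windows is meant to keep enriched windows out of the background estimate. Note that it also truncates the data, which matters when max.n is small relative to the background mean.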

“Poisson”

This estimate also uses the windows with 0-2 aligned reads, but uses these numbers to estimate the parameter lambda of a Poisson distribution. The parameter ‘r’ is set to a very large number, such that the estimated NB distribution is effectively a Poisson distribution with mean and variance equal to lambda.
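The Poisson limit can be checked numerically in base R. The value 1e8 for the size parameter and the window counts are arbitrary illustrative choices. The ratio estimate for lambda at the end is only one possible approach, based on g(1)/g(0) = lambda for a Poisson distribution; the source does not specify the exact estimator.

```r
lambda <- 0.7
k      <- 0:10

# NB probabilities with a very large size versus Poisson probabilities.
nb   <- dnbinom(k, size = 1e8, mu = lambda)
pois <- dpois(k, lambda)
max(abs(nb - pois))   # the two distributions are practically identical

# One possible (assumed) estimate of lambda from low-count windows:
# for a Poisson distribution, g(1)/g(0) = lambda.
n0 <- 6000   # hypothetical number of windows with 0 reads
n1 <- 4200   # hypothetical number of windows with 1 read
lambda.hat <- n1 / n0
lambda.hat
```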

A data.frame of class slidingWindowSummary, which is the supplied argument x extended by an additional column p.value that holds the p-value for each window. The attribute NBparams of the result contains the list of the estimated parameters of the Negative-Binomial background distribution.

Joern Toedling

Such an estimation of the Negative-Binomial parameters has also been described in the paper:
Ji et al. (2008) An integrated system CisGenome for analyzing ChIP-chip and ChIP-seq data. Nat Biotechnol. 26(11):1293-1300.

perWindow, p.adjust

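library(girafe)

# locate the example alignment file shipped with the package
exDir <- system.file("extdata", package="girafe")
exA   <- readAligned(dirPath=exDir, type="Bowtie",
  pattern="aravinSRNA_23_no_adapter_excerpt_mm9_unmasked.bwtmap")
exAI  <- as(exA, "AlignedGenomeIntervals")
# count aligned reads in sliding windows on chrX (window size 1e5, step 0.5e5)
exPX  <- perWindow(exAI, chr="chrX", winsize=1e5, step=0.5e5)
# add p-values from the NB background model, with Bonferroni correction
exPX  <- addNBSignificance(exPX, correct="bonferroni")
str(exPX)
# show the windows with p-value at most 0.05
exPX[exPX$p.value <= 0.05,]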