Estimate Negative Binomial Dispersion
Description
Estimate NB dispersion by modeling it as a parametric function of preliminarily estimated log mean relative frequencies.
Usage
estimate.dispersion(nb.data, x, model = "NBQ", method = "MAPL", ...)

Arguments
nb.data 
output from

x 
a design matrix specifying the mean structure of each row. 
model 
the name of the dispersion model, one of "NB2", "NBP", "NBQ" (default), "NBS" or "step". 
method 
a character string specifying the method for estimating the dispersion model, one of "ML" or "MAPL" (default). 
... 
(for future use). 
Details
We use a negative binomial (NB) distribution to model the read frequency of gene i in sample j. The NB distribution uses a dispersion parameter φ_{ij} to model the extra-Poisson variation between biological replicates. Under the NB model, the mean-variance relationship of a single read count satisfies σ_{ij}^2 = μ_{ij} + φ_{ij} μ_{ij}^2. Because RNA-Seq experiments typically have small sample sizes, estimating the NB dispersion φ_{ij} for each gene i separately is not reliable. One can pool information across genes and biological samples by modeling φ_{ij} as a function of the mean frequencies and library sizes.
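The mean-variance relationship above can be checked numerically. The following is a minimal sketch in base R, not part of the package: nb_variance is a hypothetical helper name, and the values of mu and phi are made up for illustration. It relies on the fact that rnbinom() with the mu argument has variance mu + mu^2/size, so size corresponds to 1/phi.

```r
# Variance of a single NB read count: sigma^2 = mu + phi * mu^2.
# nb_variance is a hypothetical helper, not a function of the package.
nb_variance <- function(mu, phi) {
  mu + phi * mu^2
}

mu  <- 100   # illustrative mean read count (assumption)
phi <- 0.1   # illustrative dispersion (assumption)

# Theoretical variance: 100 + 0.1 * 100^2 = 1100
v_theory <- nb_variance(mu, phi)

# Empirical check: in rnbinom()'s mean parameterization, size = 1 / phi
set.seed(1)
y <- rnbinom(1e5, size = 1 / phi, mu = mu)
v_empirical <- var(y)

print(c(theory = v_theory, empirical = v_empirical))
```

With phi = 0 the formula reduces to the Poisson case, where the variance equals the mean.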
Under the NB2 model, the dispersion is a constant across all genes and samples.
Under the NBP model, the log dispersion is modeled as a linear function of the preliminary estimates of the log mean relative frequencies (pi.pre):
log(phi) = par[1] + par[2] * log(pi.pre/pi.offset),
where pi.offset is 1e-4.
Under the NBQ model, the dispersion is modeled as a quadratic function of the preliminary estimates of the log mean relative frequencies (pi.pre):
log(phi) = par[1] + par[2] * z + par[3] * z^2,
where z = log(pi.pre/pi.offset). By default, pi.offset is the median of pi.pre[subset,].
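The two parametric forms above (NBP and NBQ) can be evaluated directly. The sketch below is an illustration only and does not reproduce the package's internal code: the function names log_phi_nbp and log_phi_nbq, the pi.pre values, and the par values are all hypothetical, and pi.offset is passed explicitly rather than taken from any default.

```r
# Log dispersion under the NBP model: linear in log(pi.pre / pi.offset).
# Hypothetical helper, not a package function.
log_phi_nbp <- function(pi.pre, par, pi.offset) {
  par[1] + par[2] * log(pi.pre / pi.offset)
}

# Log dispersion under the NBQ model: quadratic in z = log(pi.pre / pi.offset).
# Hypothetical helper, not a package function.
log_phi_nbq <- function(pi.pre, par, pi.offset) {
  z <- log(pi.pre / pi.offset)
  par[1] + par[2] * z + par[3] * z^2
}

pi.pre    <- c(1e-6, 1e-5, 1e-4, 1e-3)  # made-up relative frequencies
pi.offset <- median(pi.pre)             # offset chosen for this illustration

par.nbq <- c(-2, -0.5, 0.1)             # made-up coefficients
phi.nbq <- exp(log_phi_nbq(pi.pre, par.nbq, pi.offset))
print(phi.nbq)
```

Setting par[3] to 0 in the quadratic form recovers the linear NBP form, and at pi.pre equal to pi.offset the log dispersion is simply par[1].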
Under the NBS model, the dispersion is modeled as a smooth function (a natural cubic spline function) of the preliminary estimates of the log mean relative frequencies (pi.pre).
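To illustrate what a natural cubic spline function of the log mean relative frequencies might look like, here is a rough sketch using ns() from the splines package that ships with R. The number of basis functions (df = 3), the coefficients, the simulated pi.pre values, and the choice to model the log dispersion are all assumptions made for this example; the package's actual spline parameterization may differ.

```r
library(splines)  # ships with R; ns() builds a natural cubic spline basis

set.seed(1)
# Made-up preliminary relative frequencies, spread over several orders of magnitude
pi.pre <- exp(runif(200, log(1e-7), log(1e-3)))
z <- log(pi.pre)

# Natural cubic spline basis in z; df = 3 is an arbitrary illustrative choice
B <- ns(z, df = 3)

# Hypothetical coefficients: an intercept plus one weight per basis column
par <- c(-2, 0.5, -1, 0.3)
log.phi <- par[1] + as.vector(B %*% par[-1])
phi <- exp(log.phi)

summary(phi)
```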
Under the "step" model, the dispersion is modeled as a step (piecewise constant) function.
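A piecewise constant function can be sketched with findInterval() in base R. In this example the function name step_dispersion, the breakpoints, and the dispersion values are hypothetical, and the choice of log(pi.pre) as the argument of the step function is an assumption; the text above does not specify how the package places the steps.

```r
# Piecewise-constant dispersion: one value per interval of log(pi.pre).
# Hypothetical helper, not a package function.
step_dispersion <- function(pi.pre, breaks, values) {
  stopifnot(length(values) == length(breaks) + 1)
  # findInterval() returns 0 below the first break, so add 1 to index values
  values[findInterval(log(pi.pre), breaks) + 1]
}

breaks <- log(c(1e-6, 1e-5, 1e-4))   # made-up breakpoints on the log scale
values <- c(0.5, 0.2, 0.1, 0.05)     # made-up dispersion for each interval

phi <- step_dispersion(c(5e-7, 5e-6, 5e-5, 5e-4), breaks, values)
print(phi)
```

Each input falls in a different interval here, so the call returns the four values in order.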
Value
a list with the following components:
estimates 
dispersion estimates for each read count,
a matrix of the same dimensions as the 
likelihood 
the likelihood of the fitted model. 
model 
details of the estimated dispersion model, NOT intended for use by end users. The name and contents of this component are subject to change in future versions. 
Note
Currently, it is unclear whether a dispersion-modeling approach will outperform a more basic approach where a regression model is fitted to each gene separately without considering the dispersion-mean dependence. Clarifying the power-robustness of the dispersion-modeling approach is an ongoing research topic.
Examples
## See the example for test.coefficient.
