estimateTagwiseDisp: Estimate Empirical Bayes Tagwise Dispersion Values
In edgeR: Empirical Analysis of Digital Gene Expression Data in R

Estimates tagwise dispersion values by an empirical Bayes method based on weighted conditional maximum likelihood.

## S3 method for class 'DGEList'
estimateTagwiseDisp(y, prior.df=10, trend="movingave", span=NULL, method="grid", 
           grid.length=11, grid.range=c(-6,6), tol=1e-06, verbose=FALSE, ...)
## Default S3 method:
estimateTagwiseDisp(y, group=NULL, lib.size=NULL, dispersion, AveLogCPM=NULL, 
           prior.df=10, trend="movingave", span=NULL, method="grid", grid.length=11, 
           grid.range=c(-6,6), tol=1e-06, verbose=FALSE, ...)

`y`	matrix of counts or a `DGEList` object.
`prior.df`	prior degrees of freedom.
`trend`	method for estimating dispersion trend. Possible values are `"movingave"` (default), `"loess"` and `"none"`.
`span`	width of the smoothing window, as a proportion of the data set.
`method`	method for maximizing the posterior likelihood. Possible values are `"grid"` (default) for interpolation on grid points or `"optimize"` to call the function of the same name.
`grid.length`	for `method="grid"`, the number of points on which the interpolation is applied for each tag.
`grid.range`	for `method="grid"`, the range of the grid points around the trend on a log2 scale.
`tol`	for `method="optimize"`, the tolerance for Newton-Raphson iterations.
`verbose`	logical, if `TRUE` then diagnostic output is produced during the estimation process.
`group`	vector or factor giving the experimental group/condition for each library.
`lib.size`	numeric vector giving the total count (sequence depth) for each library.
`dispersion`	common dispersion estimate, used as an initial estimate for the tagwise estimates.
`AveLogCPM`	numeric vector giving average log2 counts per million for each tag.
`...`	other arguments that are not currently used.

This function implements the empirical Bayes strategy proposed by Robinson and Smyth (2007) for estimating the tagwise negative binomial dispersions. The experimental design is assumed to be a one-way layout with one or more experimental groups. The empirical Bayes posterior is implemented as a conditional likelihood with tag-specific weights.

The prior values for the dispersions are determined by a global trend. The individual tagwise dispersions are then squeezed towards this trend. The prior degrees of freedom determine the weight given to the prior: the larger the prior degrees of freedom, the more the tagwise dispersions are squeezed towards the global trend. If the number of libraries is large, the prior becomes less important and the tagwise dispersions are determined more by the individual tagwise data.
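The effect of `prior.df` described above can be checked on simulated data. The following is a minimal sketch, assuming edgeR is installed; the seed, the simulation sizes and the object names `weak` and `strong` are illustrative choices, not part of the package.

```r
library(edgeR)

set.seed(1)
# Simulated counts: 500 tags, 4 libraries in 2 groups, true dispersion 1/5 = 0.2
y <- matrix(rnbinom(500 * 4, mu = 20, size = 5), nrow = 500, ncol = 4)
dge <- DGEList(counts = y, group = c(1, 1, 2, 2))
dge <- estimateCommonDisp(dge)

# Small prior.df: little weight on the prior, estimates follow each tag's own data
weak <- estimateTagwiseDisp(dge, prior.df = 1)

# Large prior.df: heavy weight on the prior, estimates are squeezed towards the trend
strong <- estimateTagwiseDisp(dge, prior.df = 50)

# Compare the spread of the tagwise estimates under the two settings
spread.weak <- sd(weak$tagwise.dispersion)
spread.strong <- sd(strong$tagwise.dispersion)
c(weak = spread.weak, strong = spread.strong)
```

With the larger `prior.df`, the tagwise estimates should be more tightly clustered, since each tag contributes less of its own information relative to the prior.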

If trend="none", then the prior dispersion is just a constant, the common dispersion. Otherwise, the trend is determined by a moving average (trend="movingave") or loess smoother (trend="loess") applied to the tagwise conditional log-likelihood. trend="loess" applies a loess curve of degree 0 as implemented in loessByCol.
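The three `trend` settings can be compared side by side. This is a hedged sketch on simulated data, assuming edgeR is installed; the object names `flat`, `mov` and `lo` are made up for the illustration, and `span` is left at its default of `NULL`.

```r
library(edgeR)

set.seed(2)
# Simulated counts with a constant true dispersion of 0.2
y <- matrix(rnbinom(500 * 4, mu = 20, size = 5), nrow = 500, ncol = 4)
dge <- DGEList(counts = y, group = c(1, 1, 2, 2))
dge <- estimateCommonDisp(dge)

# Prior is the constant common dispersion
flat <- estimateTagwiseDisp(dge, trend = "none")

# Prior follows a moving-average trend (the default)
mov <- estimateTagwiseDisp(dge, trend = "movingave")

# Prior follows a loess trend
lo <- estimateTagwiseDisp(dge, trend = "loess")

# Summaries of the tagwise estimates under each trend setting
trend.summary <- rbind(
  none      = summary(flat$tagwise.dispersion),
  movingave = summary(mov$tagwise.dispersion),
  loess     = summary(lo$tagwise.dispersion)
)
trend.summary
```

Because every simulated tag here has the same mean and dispersion, the three settings are expected to give broadly similar results; differences between them matter more when dispersion varies with abundance.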

method="optimize" is not recommended for routine use as it is very slow. It is included for testing purposes.

Note that the terms ‘tag’ and ‘gene’ are synonymous here. The function is only named ‘Tagwise’ for historical reasons.

estimateTagwiseDisp.DGEList adds the following components to the input DGEList object:

`prior.df`	prior degrees of freedom.
`prior.n`	estimate of the prior weight.
`tagwise.dispersion`	numeric vector of the tagwise dispersion estimates.
`span`	width of the smoothing window, in terms of proportion of the data set.

estimateTagwiseDisp.default returns a numeric vector of the tagwise dispersion estimates.
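The two return types can be seen by calling both methods on the same counts. The sketch below assumes edgeR is installed and uses simulated data; passing `colSums(y)` as `lib.size` and reusing the common dispersion from the DGEList are illustrative choices.

```r
library(edgeR)

set.seed(3)
y <- matrix(rnbinom(250 * 4, mu = 20, size = 5), nrow = 250, ncol = 4)
group <- factor(c(1, 1, 2, 2))

# DGEList method: the result is the input object with extra components
dge <- DGEList(counts = y, group = group)
dge <- estimateCommonDisp(dge)
dge <- estimateTagwiseDisp(dge)
added <- c("prior.df", "prior.n", "tagwise.dispersion", "span")
added %in% names(dge)
head(dge$tagwise.dispersion)

# Default method: a count matrix goes in, a plain numeric vector comes out
disp <- estimateTagwiseDisp(y, group = group, lib.size = colSums(y),
                            dispersion = dge$common.dispersion)
str(disp)
```

The default method has no DGEList to draw on, so `dispersion` has to be supplied explicitly.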

Mark Robinson, Davis McCarthy, Yunshun Chen and Gordon Smyth

Robinson, MD, and Smyth, GK (2007). Moderated statistical tests for assessing differences in tag abundance. Bioinformatics 23, 2881-2887. http://bioinformatics.oxfordjournals.org/content/23/21/2881

estimateCommonDisp is usually run before estimateTagwiseDisp.

movingAverageByCol and loessByCol implement the moving average and loess smoothers, respectively.

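library(edgeR)

# Simulate 250 tags in 4 libraries; size=5 gives a true dispersion of 1/5=0.2
y <- matrix(rnbinom(250*4,mu=20,size=5),nrow=250,ncol=4)
dge <- DGEList(counts=y,group=c(1,1,2,2))
# Estimate the common dispersion first, then the tagwise dispersions
dge <- estimateCommonDisp(dge)
dge <- estimateTagwiseDisp(dge)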
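The comment that the true dispersion is 1/5 rests on how `rnbinom` parameterizes the negative binomial: with mean `mu` and `size`, the variance is `mu + mu^2/size`, so the dispersion equals `1/size`. A small base-R check, using a simple method-of-moments estimate that is only an illustration and not the estimator used by edgeR:

```r
set.seed(4)
mu <- 20
size <- 5

# Draw a large negative binomial sample with the same parameters as the example
x <- rnbinom(1e5, mu = mu, size = size)

# Dispersion implied by the parameterization
phi.true <- 1 / size

# Method-of-moments estimate: solve var = mean + phi * mean^2 for phi
phi.mom <- (var(x) - mean(x)) / mean(x)^2

c(true = phi.true, moments = phi.mom)
```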