estimateDispersions: Estimate the dispersions for a DESeqDataSet
In aghozlane/DESeq2shaman: Differential gene expression analysis based on the negative binomial distribution

This function obtains dispersion estimates for Negative Binomial distributed data.

## S4 method for signature 'DESeqDataSet'
estimateDispersions(object, fitType = c("parametric",
  "local", "mean"), maxit = 100, quiet = FALSE)

`object`	a DESeqDataSet
`fitType`	either "parametric", "local", or "mean", for the type of fitting of dispersions to the mean intensity.
	parametric - fit a dispersion-mean relation of the form dispersion = asymptDisp + extraPois / mean via a robust gamma-family GLM. The coefficients `asymptDisp` and `extraPois` are given in the attribute `coefficients` of the `dispersionFunction` of the object.
	local - use the locfit package to fit a local regression of log dispersions over log base mean (normal scale means and dispersions are input and output for `dispersionFunction`). The points are weighted by normalized mean count in the local regression.
	mean - use the mean of gene-wise dispersion estimates.
`maxit`	control parameter: maximum number of iterations to allow for convergence
`quiet`	whether to suppress the messages printed at each step (messages are printed when FALSE, the default)
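
To make the parametric option concrete, here is a minimal base-R sketch of the trend dispersion = asymptDisp + extraPois / mean, fitted with a gamma-family GLM using an identity link. The simulated data, the starting values, and the helper name `dispersion_trend` are assumptions for illustration. The package's own fit is described as robust, and that refinement is not reproduced here.

```r
# Simulated gene-wise means and dispersions (illustrative values, not package output)
set.seed(1)
n_genes <- 500
means <- exp(runif(n_genes, log(5), log(5000)))
true_coefs <- c(asymptDisp = 0.1, extraPois = 4)
trend <- true_coefs[["asymptDisp"]] + true_coefs[["extraPois"]] / means
disps <- trend * rgamma(n_genes, shape = 10, rate = 10)  # multiplicative noise with mean 1

# Gamma GLM with identity link: E[dispersion] = asymptDisp + extraPois * (1 / mean)
fit <- glm(disps ~ I(1 / means),
           family = Gamma(link = "identity"),
           start = c(0.1, 1))
coefs <- setNames(coef(fit), c("asymptDisp", "extraPois"))

# Hypothetical helper: evaluate the fitted trend at a given normalized mean
dispersion_trend <- function(mean) {
  coefs[["asymptDisp"]] + coefs[["extraPois"]] / mean
}

print(coefs)
dispersion_trend(c(10, 100, 1000))
```

On a fitted object, the text above indicates that the corresponding coefficients live in the `coefficients` attribute of `dispersionFunction`, so a call along the lines of attr(dispersionFunction(dds), "coefficients") should retrieve them.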

Typically the function is called with the idiom:

dds <- estimateDispersions(dds)

The fitting proceeds as follows:

1. For each gene, an estimate of the dispersion is found that maximizes the Cox-Reid adjusted profile likelihood. (The methods of Cox-Reid adjusted profile likelihood maximization for estimating dispersion in RNA-Seq data were developed by McCarthy et al. (2012) and first implemented in the edgeR package in 2010.)
2. A trend line capturing the dispersion-mean relationship is fit to the maximum likelihood estimates.
3. A normal prior is determined for the log dispersion estimates, centered on the predicted value from the trended fit, with variance equal to the difference between the observed variance of the log dispersion estimates and the expected sampling variance.
4. Finally, maximum a posteriori dispersion estimates are returned.

These final dispersion estimates are used in subsequent tests and can be accessed from an object using dispersions. The fitted dispersion-mean relationship is also used in varianceStabilizingTransformation. All of the intermediate values (gene-wise dispersion estimates, fitted dispersion estimates from the trended fit, etc.) are stored in mcols(dds), with information about these columns in mcols(mcols(dds)).
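
The gene-wise and maximum a posteriori steps can be sketched for a single gene in base R. This is a simplified illustration, not the package's implementation: it assumes an intercept-only design with equal size factors (so the fitted mean is the sample mean), and the counts, the trend value `alpha_trend`, and the prior standard deviation `prior_sd` are made-up inputs.

```r
# Hypothetical counts for one gene across six samples (equal size factors assumed)
y <- c(52, 310, 95, 18, 140, 77)
mu <- mean(y)  # fitted mean under an intercept-only design

# Cox-Reid adjusted profile log-likelihood as a function of log(dispersion)
cr_loglik <- function(log_alpha) {
  alpha <- exp(log_alpha)
  ll <- sum(dnbinom(y, mu = mu, size = 1 / alpha, log = TRUE))
  w <- rep(1 / (1 / mu + alpha), length(y))  # GLM working weights
  ll - 0.5 * log(sum(w))                     # X'WX reduces to sum(w) when X is a column of ones
}

search_range <- c(log(1e-8), log(10))

# Gene-wise estimate: maximize the adjusted likelihood
log_alpha_gene <- optimize(cr_loglik, search_range, maximum = TRUE)$maximum

# Normal prior on log(dispersion), centered on the trend value
alpha_trend <- 0.1   # assumed value of the fitted trend at this gene's mean
prior_sd <- 0.5      # assumed prior standard deviation on the log scale
log_posterior <- function(log_alpha) {
  cr_loglik(log_alpha) + dnorm(log_alpha, log(alpha_trend), prior_sd, log = TRUE)
}

# Maximum a posteriori estimate: maximize likelihood plus log prior
log_alpha_map <- optimize(log_posterior, search_range, maximum = TRUE)$maximum

c(gene_wise = exp(log_alpha_gene), trend = alpha_trend, map = exp(log_alpha_map))
```

In this toy example the maximum a posteriori value falls between the gene-wise estimate and the trend value, and a smaller `prior_sd` pulls it closer to the trend.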

The log-normal prior on the dispersion parameter was proposed by Wu et al. (2012) and is also implemented in the DSS package.
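
As a rough illustration of how the variance of such a prior could be obtained, the sketch below subtracts an expected sampling variance from the observed variance of log dispersion residuals around the trend. Everything here is an assumption for illustration: the data are simulated, the sampling variance is approximated by trigamma(df / 2) for residual degrees of freedom `df` (the variance of a log scaled chi-squared variable), and the floor `min_var` is a hypothetical safeguard against a negative difference.

```r
set.seed(42)
n_genes <- 2000
n_samples <- 8
n_coefs <- 2                          # hypothetical experiment size
df <- n_samples - n_coefs             # residual degrees of freedom

log_trend <- rep(log(0.1), n_genes)   # assumed fitted trend on the log scale
true_prior_sd <- 0.7                  # spread of true log dispersions around the trend
log_true <- rnorm(n_genes, mean = log_trend, sd = true_prior_sd)

# Gene-wise estimates: truth plus estimation noise (scaled chi-squared on the natural scale)
log_gene <- log_true + log(rchisq(n_genes, df) / df)

resid <- log_gene - log_trend
observed_var <- var(resid)
expected_sampling_var <- trigamma(df / 2)
min_var <- 0.25                       # hypothetical floor
prior_var <- max(observed_var - expected_sampling_var, min_var)

c(observed = observed_var, sampling = expected_sampling_var, prior = prior_var)
```

With these settings the recovered prior variance should land near true_prior_sd^2 = 0.49, since the simulated estimation noise is removed by the subtraction.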

In DESeq2, the dispersion estimation procedure described above replaces the various dispersion estimation methods of the earlier DESeq package.

estimateDispersions checks for the case of an analysis with as many samples as the number of coefficients to fit, and will temporarily substitute a design formula ~ 1 for the purposes of dispersion estimation. This treats the samples as replicates for the purpose of dispersion estimation. As mentioned in the DESeq paper: "While one may not want to draw strong conclusions from such an analysis, it may still be useful for exploration and hypothesis generation."
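
The check described above can be sketched in base R with `model.matrix`. The sample table `coldata` and the variable names are hypothetical, and this only illustrates the idea of comparing the number of samples with the number of coefficients, not the package's internal code.

```r
# Two samples, one per condition: no replicates
coldata <- data.frame(condition = factor(c("control", "treated")))
design <- ~ condition

mm <- model.matrix(design, data = coldata)
n_samples <- nrow(mm)
n_coefs <- ncol(mm)

# With as many samples as coefficients there are no residual degrees of freedom,
# so fall back to an intercept-only design for dispersion estimation only
design_for_dispersion <- design
if (n_samples == n_coefs) {
  design_for_dispersion <- ~ 1
}

print(design_for_dispersion)
ncol(model.matrix(design_for_dispersion, data = coldata))
```

Under the intercept-only design both samples contribute to a single mean, which is what lets them act as replicates when estimating dispersion.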

The lower-level functions called by estimateDispersions are: estimateDispersionsGeneEst, estimateDispersionsFit, and estimateDispersionsMAP.

The DESeqDataSet passed as the `object` argument, with the dispersion information filled in as metadata columns. These columns are accessible via mcols, and the final dispersions are accessible via dispersions.

Anders, S., Huber, W.: Differential expression analysis for sequence count data. Genome Biology 11 (2010), R106, http://dx.doi.org/10.1186/gb-2010-11-10-r106
McCarthy, D.J., Chen, Y., Smyth, G.K.: Differential expression analysis of multifactor RNA-Seq experiments with respect to biological variation. Nucleic Acids Research 40 (2012), 4288-4297, http://dx.doi.org/10.1093/nar/gks042
Wu, H., Wang, C., Wu, Z.: A new shrinkage estimator for dispersion improves differential expression detection in RNA-seq data. Biostatistics (2012), http://dx.doi.org/10.1093/biostatistics/kxs033

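# Simulate an example data set, then estimate size factors and dispersions
dds <- makeExampleDESeqDataSet()
dds <- estimateSizeFactors(dds)
dds <- estimateDispersions(dds)
# Inspect the final (maximum a posteriori) dispersion estimates
head(dispersions(dds))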

aghozlane/DESeq2shaman documentation built on Nov. 1, 2019, 9:01 p.m.
