Description Usage Arguments Details Value Author(s) References See Also Examples
This function performs a default analysis through the steps:
estimation of size factors: estimateSizeFactors
estimation of dispersion: estimateDispersions
Negative Binomial GLM fitting and Wald statistics: nbinomWaldTest
For complete details on each step, see the manual pages of the respective
functions. After the DESeq function returns a DESeqDataSet object,
results tables (log2 fold changes and p-values) can be generated
using the results function.
Shrunken LFC can then be generated using the lfcShrink function.
All support questions should be posted to the Bioconductor
support site: http://support.bioconductor.org.
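As a rough sketch of the default pipeline described above, the three steps can also be called one at a time. This assumes DESeq2 is installed; makeExampleDESeqDataSet is used here only to simulate a small data set, and the gene and sample counts are arbitrary choices. With three replicates per group (below the default minReplicatesForReplace), no outlier replacement is triggered.

```r
library(DESeq2)

set.seed(1)
# simulated data: 500 genes, 6 samples in two conditions (illustrative sizes)
dds <- makeExampleDESeqDataSet(n = 500, m = 6)

dds <- estimateSizeFactors(dds)   # step 1: size factors
dds <- estimateDispersions(dds)   # step 2: dispersions
dds <- nbinomWaldTest(dds)        # step 3: NB GLM fit and Wald statistics

res <- results(dds)               # log2 fold changes and p-values
head(res)
```

Calling DESeq(dds) directly is the usual route; the step-by-step form is mainly useful when one step needs non-default arguments.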
DESeq(
  object,
  test = c("Wald", "LRT"),
  fitType = c("parametric", "local", "mean", "glmGamPoi"),
  sfType = c("ratio", "poscounts", "iterate"),
  betaPrior,
  full = design(object),
  reduced,
  quiet = FALSE,
  minReplicatesForReplace = 7,
  modelMatrixType,
  useT = FALSE,
  minmu = if (fitType == "glmGamPoi") 1e-06 else 0.5,
  parallel = FALSE,
  BPPARAM = bpparam()
)

object 
a DESeqDataSet object, see the constructor functions

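test 
either "Wald" or "LRT", which will then use either
Wald significance tests (defined by nbinomWaldTest),
or the likelihood ratio test on the difference in deviance between a
full and reduced model formula (defined by nbinomLRT)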
fitType 
either "parametric", "local", "mean", or "glmGamPoi"
for the type of fitting of dispersions to the mean intensity.
See estimateDispersions for description.
sfType 
either "ratio", "poscounts", or "iterate"
for the type of size factor estimation. See
estimateSizeFactors for description.

betaPrior 
whether or not to put a zero-mean normal prior on
the non-intercept coefficients.
See nbinomWaldTest for a description of the beta prior.
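full 
for test="LRT", the full model formula (defaults to design(object))
reduced 
for test="LRT", a reduced model formula to compare against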
quiet 
whether to print messages at each step 
minReplicatesForReplace 
the minimum number of replicates required
in order to use replaceOutliers on a sample.
Set to Inf in order to never replace outliers.
modelMatrixType 
either "standard" or "expanded", which describe
how the model matrix, X of the GLM formula, is formed.
"standard" is as created by model.matrix using the design formula.
useT 
logical, passed to nbinomWaldTest
minmu 
lower bound on the estimated count for fitting gene-wise dispersion
and for use with nbinomWaldTest and nbinomLRT
parallel 
if FALSE, no parallelization. If TRUE, parallel
execution using BiocParallel; see the next argument, BPPARAM.
BPPARAM 
an optional parameter object passed internally
to bplapply when parallel=TRUE
The differential expression analysis uses a generalized linear model of the form:
K_ij ~ NB(mu_ij, alpha_i)
mu_ij = s_j q_ij
log2(q_ij) = x_j. beta_i
where counts K_ij for gene i, sample j are modeled using
a Negative Binomial distribution with fitted mean mu_ij
and a gene-specific dispersion parameter alpha_i.
The fitted mean is composed of a sample-specific size factor
s_j and a parameter q_ij proportional to the
expected true concentration of fragments for sample j.
The coefficients beta_i give the log2 fold changes for gene i for each
column of the model matrix X.
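To make the model concrete, here is a minimal base R sketch that simulates counts from it. All numeric values (coefficients, size factors, dispersions) are illustrative assumptions, not DESeq2 defaults, and the code does not use or reproduce any DESeq2 internals.

```r
set.seed(1)  # reproducible simulation

n_genes <- 4
# model matrix X: intercept plus a two-level condition, three samples each
X <- cbind(intercept = 1, condition = rep(0:1, each = 3))
# beta_i: one row per gene, on the log2 scale (illustrative values)
beta <- cbind(intercept = rep(6, n_genes), condition = c(0, 1, -1, 2))
s <- c(0.8, 1.0, 1.2, 0.9, 1.1, 1.0)   # size factors s_j (illustrative)
alpha <- rep(0.1, n_genes)             # dispersions alpha_i (illustrative)

q <- 2^(beta %*% t(X))       # log2(q_ij) = x_j. beta_i  (genes x samples)
mu <- sweep(q, 2, s, `*`)    # mu_ij = s_j q_ij
# K_ij ~ NB(mu_ij, alpha_i); rnbinom's 'size' is the inverse of the dispersion
K <- matrix(rnbinom(length(mu), mu = mu, size = 1 / alpha), nrow = n_genes)
K
```

Gene 2 has a condition coefficient of 1, so its q_ij doubles between the two conditions; the observed counts additionally reflect the size factors and Negative Binomial noise.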
The sample-specific size factors can be replaced by
gene-specific normalization factors for each sample using
normalizationFactors.
For details on the fitting of the log2 fold changes and calculation of p-values,
see nbinomWaldTest if using test="Wald",
or nbinomLRT if using test="LRT".
Experiments without replicates do not allow for estimation of the dispersion of counts around the expected value for each group, which is critical for differential expression analysis. Analysis without replicates was deprecated in v1.20 and is no longer supported since v1.22.
The argument minReplicatesForReplace is used to decide which samples
are eligible for automatic replacement in the case of extreme Cook's distance.
By default, DESeq will replace outliers if the Cook's distance is
large for a sample which has 7 or more replicates (including itself).
This replacement is performed by the replaceOutliers
function. This default behavior helps to prevent filtering genes
based on Cook's distance when there are many degrees of freedom.
See results for more information about filtering using
Cook's distance, and the 'Dealing with outliers' section of the vignette.
Unlike the behavior of replaceOutliers, here original counts are
kept in the matrix returned by counts, original Cook's
distances are kept in assays(dds)[["cooks"]], and the replacement
counts used for fitting are kept in assays(dds)[["replaceCounts"]].
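The following is a hedged sketch of how the outlier replacement behavior described above might be inspected or switched off. It assumes DESeq2 is installed and uses simulated data from makeExampleDESeqDataSet with seven replicates per group, chosen only so that samples meet the default threshold. Which assays are present after the fit may depend on the DESeq2 version and on whether any counts were actually replaced, so the assay names are printed rather than assumed.

```r
library(DESeq2)

set.seed(1)
# 14 samples in two conditions, so 7 replicates per group (illustrative sizes)
dds0 <- makeExampleDESeqDataSet(n = 500, m = 14)

# default: samples with 7 or more replicates are eligible for replacement
dds <- DESeq(dds0, quiet = TRUE)
all(counts(dds) == counts(dds0))  # original counts are kept in counts()
assayNames(dds)                   # inspect which assays were stored

# setting the threshold to Inf means outliers are never replaced
dds_norepl <- DESeq(dds0, minReplicatesForReplace = Inf, quiet = TRUE)
```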
Note that if a log2 fold change prior is used (betaPrior=TRUE)
then expanded model matrices will be used in fitting. These are
described in nbinomWaldTest and in the vignette. The
contrast argument of results should be used for
generating results tables.
a DESeqDataSet object with results stored as
metadata columns. These results should be accessed by calling the results
function. By default this will return the log2 fold changes and p-values for the last
variable in the design formula. See results for how to access results
for other variables.
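As an illustration of accessing a specific comparison, the sketch below requests the same coefficient by name and by contrast. It assumes DESeq2 is installed; the data come from makeExampleDESeqDataSet, whose simulated design has a single factor named condition with levels A and B, so the coefficient name used here is specific to that simulated data set.

```r
library(DESeq2)

set.seed(1)
dds <- DESeq(makeExampleDESeqDataSet(n = 500, m = 6), quiet = TRUE)

resultsNames(dds)  # lists the fitted coefficients

# the same comparison requested two ways
res_name <- results(dds, name = "condition_B_vs_A")
res_contrast <- results(dds, contrast = c("condition", "B", "A"))
```

For designs with several variables, the contrast form takes the variable name followed by the numerator and denominator levels.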
Michael Love
Love, M.I., Huber, W., Anders, S. (2014) Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biology, 15:550. https://doi.org/10.1186/s13059-014-0550-8
For fitType="glmGamPoi":
Ahlmann-Eltze, C., Huber, W. (2020) glmGamPoi: Fitting Gamma-Poisson Generalized Linear Models on Single Cell Count Data. bioRxiv. https://doi.org/10.1101/2020.08.13.249623
results, lfcShrink, nbinomWaldTest, nbinomLRT
# see vignette for suggestions on generating
# count tables from RNA-Seq data
library(DESeq2)
cnts <- matrix(rnbinom(n=1000, mu=100, size=1/0.5), ncol=10)
cond <- factor(rep(1:2, each=5))
# object construction
dds <- DESeqDataSetFromMatrix(cnts, DataFrame(cond), ~ cond)
# standard analysis
dds <- DESeq(dds)
res <- results(dds)
# moderated log2 fold changes
resultsNames(dds)
resLFC <- lfcShrink(dds, coef=2, type="apeglm")
# an alternate analysis: likelihood ratio test
ddsLRT <- DESeq(dds, test="LRT", reduced= ~ 1)
resLRT <- results(ddsLRT)
