nbinomWaldTest: Wald test for the GLM coefficients

Description Usage Arguments Details Value See Also Examples

View source: R/core.R

Description

This function tests for significance of coefficients in a Negative Binomial GLM, using previously calculated sizeFactors (or normalizationFactors) and dispersion estimates. See DESeq for the GLM formula.

Usage

1
2
3
nbinomWaldTest(object, betaPrior = TRUE, betaPriorVar, modelMatrixType,
  maxit = 100, useOptim = TRUE, quiet = FALSE, useT = FALSE, df,
  useQR = TRUE)

Arguments

object

a DESeqDataSet

betaPrior

whether or not to put a zero-mean normal prior on the non-intercept coefficients

betaPriorVar

a vector with length equal to the number of model terms including the intercept. betaPriorVar gives the variance of the prior on the sample betas on the log2 scale. if missing (default) this is estimated from the data

modelMatrixType

either "standard" or "expanded", which describe how the model matrix, X of the formula in DESeq, is formed. "standard" is as created by model.matrix using the design formula. "expanded" includes an indicator variable for each level of factors in addition to an intercept. betaPrior must be set to TRUE in order for expanded model matrices to be fit.

maxit

the maximum number of iterations to allow for convergence of the coefficient vector

useOptim

whether to use the native optim function on rows which do not converge within maxit

quiet

whether to print messages at each step

useT

whether to use a t-distribution as a null distribution, for significance testing of the Wald statistics. If FALSE, a standard normal null distribution is used.

df

the degrees of freedom for the t-distribution

useQR

whether to use the QR decomposition on the design matrix X while fitting the GLM

Details

The fitting proceeds as follows: standard maximum likelihood estimates for GLM coefficients (synonymous with beta, log2 fold change) are calculated. A zero-mean normal prior distribution is assumed. The variance of the prior distribution for each non-intercept coefficient is calculated using the observed distribution of the maximum likelihood coefficients. The final coefficients are then maximum a posteriori estimates using this prior (Tikhonov/ridge regularization). See below for details on the prior variance and the Methods section of the DESeq2 manuscript for more detail. The use of a prior has little effect on genes with high counts and helps to moderate the large spread in coefficients for genes with low counts. For calculating Wald test p-values, the coefficients are scaled by their standard errors and then compared to a normal distribution.

The prior variance is calculated by matching the 0.05 upper quantile of the observed MLE coefficients to a zero-centered Normal distribution. Furthermore, the weighted upper quantile is calculated using the wtd.quantile function from the Hmisc package. The weights are given by 1/mu-bar + alpha_tr using the mean of normalized counts and the trended dispersion fit. The weighting ensures that large log fold changes with small sampling variance contribute the most to the estimation of the width of the prior. The prior variance for a factor level is the average over all contrasts of all factor levels.

When a log2 fold change prior is used (betaPrior=TRUE), then nbinomWaldTest will by default use expanded model matrices, as described in the modelMatrixType argument, unless this argument is used to override the default behavior or unless there the design contains 2 level factors and an interaction term. This ensures that log2 fold changes will be independent of the choice of base level. In this case, the beta prior variance for each factor is calculated as the average of the mean squared maximum likelihood estimates for each level and every possible contrast. The results function without any arguments will automatically perform a contrast of the last level of the last variable in the design formula over the first level. The contrast argument of the results function can be used to generate other comparisons.

When interaction terms are present, the prior on log fold changes will only be used for the interaction terms (non-interaction log fold changes receive a wide prior variance of 1000).

The Wald test can be replaced with the nbinomLRT for an alternative test of significance.

Value

a DESeqDataSet with results columns accessible with the results function. The coefficients and standard errors are reported on a log2 scale.

See Also

DESeq, nbinomLRT

Examples

1
2
3
4
5
dds <- makeExampleDESeqDataSet()
dds <- estimateSizeFactors(dds)
dds <- estimateDispersions(dds)
dds <- nbinomWaldTest(dds)
res <- results(dds)

aghozlane/DESeq2shaman documentation built on Nov. 1, 2019, 9:01 p.m.