glmTreat: Test for Differential Expression Relative to a Threshold

View source: R/glmTreat.R

glmTreatR Documentation

Test for Differential Expression Relative to a Threshold

Description

Conduct genewise statistical tests for a given coefficient or contrast relative to a specified fold-change threshold.

Usage

glmTreat(glmfit, coef = ncol(glmfit$design), contrast = NULL, lfc = log2(1.2),
         null = "interval")

Arguments

glmfit

a DGEGLM object, usually output from glmFit or glmQLFit.

coef

integer or character vector indicating which coefficients of the linear model are to be tested equal to zero. Values must be columns or column names of design. Defaults to the last coefficient. Ignored if contrast is specified.

contrast

numeric vector specifying the contrast of the linear model coefficients to be tested against the log2-fold-change threshold. Length must equal to the number of columns of design. If specified, then takes precedence over coef.

lfc

numeric scalar specifying the absolute value of the log2-fold change threshold above which differential expression is to be considered.

null

character string, choices are "worst.case" or "interval". If "worst.case", then the null hypothesis asssumes that the true logFC is on the boundary of the possible values, either at lfc or -lfc, whichever gives the largest p-value. This gives the most conservative results. If "interval", then the null hypotheses assumes the true logFC to belong to a bounded interval of possible values.

Details

glmTreat implements a test for differential expression relative to a minimum required fold-change threshold. Instead of testing for genes which have log-fold-changes different from zero, it tests whether the log2-fold-change is greater than lfc in absolute value. glmTreat is analogous to the TREAT approach developed by McCarthy and Smyth (2009) for microarrays.

Note that the lfc testing threshold used to define the null hypothesis is not the same as a log2-fold-change cutoff, as the observed log2-fold-change needs to substantially larger than lfc for the gene to be called as significant. In practice, modest values for lfc such as log2(1.1), log2(1.2) or log2(1.5) are usually the most useful. In practice, setting lfc=log2(1.2) or lfc=log2(1.5) will usually cause most differentially expressed genes to have estimated fold-changes of 2-fold or greater, depending on the sample size and precision of the experiment.

Note also that glmTreat constructs test statistics using the unshrunk log2-fold-changes(unshrunk.logFC) rather than the log2-fold-changes that are usually reported (logFC). If no shrinkage has been applied to the log-fold-changes, i.e., the glms were fitted with prior.count=0, then unshrunk.logFC and logFC are the same and the former is omitted from the output object.

glmTreat detects whether glmfit was produced by glmFit or glmQLFit. In the former case, it conducts a modified likelihood ratio test (LRT) against the fold-change threshold. In the latter case, it conducts a quasi-likelihood (QL) F-test against the threshold.

If lfc=0, then glmTreat is equivalent to glmLRT or glmQLFTest, depending on whether likelihood or quasi-likelihood is being used.

glmTreat with positive lfc gives larger p-values than would be obtained with lfc=0. If null="worst.case", then glmTreat conducts a test closely analogous to the treat function in the limma package. This conducts a test if which the null hypothesis puts the true logFC on the boundary of the [-lfc,lfc] interval closest to the observed logFC. If null="interval", then the null hypotheses assumes an interval of possible values for the true logFC. This approach is somewhat less conservative.

Note that, unlike other edgeR functions such as glmLRT and glmQLFTest, glmTreat can only accept a single contrast. If contrast is a matrix with multiple columns, then only the first column will be used.

Value

glmTreat produces an object of class DGELRT with the same components as for glmfit plus the following:

lfc

absolute value of the specified log2-fold-change threshold.

table

data frame with the same rows as glmfit containing the log2-fold-changes, average log2-counts per million and p-values, ready to be displayed by topTags.

comparison

character string describing the coefficient or the contrast being tested.

The data frame table contains the following columns:

logFC

shrunk log2-fold-change of expression between conditions being tested.

unshrunk.logFC

unshrunk log2-fold-change of expression between conditions being tested. Exists only when prior.count is not equal to 0 for glmfit.

logCPM

average log2-counts per million, the average taken over all libraries.

PValue

p-values.

Note

glmTreat was previously called treatDGE in edgeR versions 3.9.10 and earlier.

Author(s)

Yunshun Chen and Gordon Smyth

References

McCarthy, D. J., and Smyth, G. K. (2009). Testing significance relative to a fold-change threshold is a TREAT. Bioinformatics 25, 765-771. doi: 10.1093/bioinformatics/btp053

See Also

topTags displays results from glmTreat.

treat is the corresponding function in the limma package, designed for use with normally distributed log-expression data rather than for negative binomial counts.

Examples

ngenes <- 100
n1 <- 3
n2 <- 3
nlibs <- n1+n2
mu <- 100
phi <- 0.1
group <- c(rep(1,n1), rep(2,n2))
design <- model.matrix(~as.factor(group))

### 4-fold change for the first 5 genes
i <- 1:5
fc <- 4
mu <- matrix(mu, ngenes, nlibs)
mu[i, 1:n1] <- mu[i, 1:n1]*fc

counts <- matrix(rnbinom(ngenes*nlibs, mu=mu, size=1/phi), ngenes, nlibs)
d <- DGEList(counts=counts,lib.size=rep(1e6, nlibs), group=group)

gfit <- glmFit(d, design, dispersion=phi)
tr <- glmTreat(gfit, coef=2, lfc=1)
topTags(tr)

OliverVoogd/edgeR documentation built on July 28, 2022, 10:13 p.m.