roast.DGEList: Self-contained Gene Set Tests for Digital Gene Expression...

View source: R/geneset-DGEList.R

roast.DGEListR Documentation

Self-contained Gene Set Tests for Digital Gene Expression Data

Description

Rotation gene set testing for Negative Binomial generalized linear models.

Usage

## S3 method for class 'DGEList'
fry(y, index = NULL, design = NULL, contrast = ncol(design), geneid = NULL,
      sort = "directional", ...)

## S3 method for class 'DGEList'
roast(y, index = NULL, design = NULL, contrast = ncol(design), geneid = NULL,
      set.statistic = "mean", gene.weights = NULL, nrot = 1999, ...)

## S3 method for class 'DGEList'
mroast(y, index = NULL, design = NULL, contrast = ncol(design), geneid = NULL,
       set.statistic = "mean", gene.weights = NULL, nrot = 1999,
       adjust.method = "BH", midp = TRUE, sort = "directional", ...)

Arguments

y

DGEList object.

index

index vector specifying which rows (probes) of y are in the test set. Can be a vector of integer indices, or a logical vector of length nrow(y), or a vector of gene IDs corresponding to entries in geneid. Alternatively it can be a data.frame with the first column containing the index vector and the second column containing directional gene weights. For mroast or fry, index is a list of index vectors or a list of data.frames.

design

the design matrix. Defaults to y$design or, failing that, to model.matrix(~y$samples$group).

contrast

contrast for which the test is required. Can be an integer specifying a column of design, or the name of a column of design, or a numeric contrast vector of length equal to the number of columns of design.

geneid

gene identifiers corresponding to the rows of y. Can be either a vector of length nrow(y) or the name of the column of y$genes containing the gene identifiers. Defaults to rownames(y).

set.statistic

summary set statistic. Possibilities are "mean","floormean","mean50" or "msq".

gene.weights

numeric vector of directional (positive or negative) genewise weights. For mroast or fry, this vector must have length equal to nrow(y). For roast, can be of length nrow(y) or of length equal to the number of genes in the test set.

nrot

number of rotations used to compute the p-values.

adjust.method

method used to adjust the p-values for multiple testing. See p.adjust for possible values.

midp

logical, should mid-p-values be used in instead of ordinary p-values when adjusting for multiple testing?

sort

character, whether to sort output table by directional p-value ("directional"), non-directional p-value ("mixed"), or not at all ("none").

...

other arguments are currently ignored.

Details

These functions perform self-contained gene set tests against the null hypothesis that none of the genes in the set are differentially expressed. fry is the recommended function in the edgeR context.

The roast gene set test was proposed by Wu et al (2010) for microarray data and the roast and mroast methods documented here extend the test to digital gene expression data. The roast method uses residual space rotations instead of permutations to obtain p-values, a technique that take advantage of the full generality of linear models. The negative binomial count data is converted to approximate normal deviates by computing mid-p quantile residuals (Dunn and Smyth, 1996; Routledge, 1994) under the null hypothesis that the contrast is zero, and the normal deviates are then passed to the limma roast function. See roast for more description of the test and for a complete list of possible arguments. mroast is similar but performs roast tests for multiple of gene sets instead of just one.

The fry method documented here similarly generalizes the fry gene set test for microarray data. fry is recommended over roast or mroast for count data because, in this context, it is equivalent to mroast but with an infinite number of rotations.

Value

roast produces an object of class Roast. See roast for details.

mroast and fry produce a data.frame. See mroast for details.

Author(s)

Yunshun Chen and Gordon Smyth

References

Dunn, PK, and Smyth, GK (1996). Randomized quantile residuals. J. Comput. Graph. Statist., 5, 236-244. http://www.statsci.org/smyth/pubs/residual.html

Routledge, RD (1994). Practicing safe statistics with the mid-p. Canadian Journal of Statistics 22, 103-110.

Wu, D, Lim, E, Francois Vaillant, F, Asselin-Labat, M-L, Visvader, JE, and Smyth, GK (2010). ROAST: rotation gene set tests for complex microarray experiments. Bioinformatics 26, 2176-2182. http://bioinformatics.oxfordjournals.org/content/26/17/2176

See Also

roast, camera.DGEList

Examples

mu <- matrix(10, 100, 4)
group <- factor(c(0,0,1,1))
design <- model.matrix(~group)

# First set of 10 genes that are genuinely differentially expressed
iset1 <- 1:10
mu[iset1,3:4] <- mu[iset1,3:4]+10

# Second set of 10 genes are not DE
iset2 <- 11:20

# Generate counts and create a DGEList object
y <- matrix(rnbinom(100*4, mu=mu, size=10),100,4)
y <- DGEList(counts=y, group=group)

# Estimate dispersions
y <- estimateDisp(y, design)

roast(y, iset1, design, contrast=2)
mroast(y, iset1, design, contrast=2)
mroast(y, list(set1=iset1, set2=iset2), design, contrast=2)

OliverVoogd/edgeR documentation built on July 28, 2022, 10:13 p.m.