roast.DGEList: Self-contained Gene Set Tests for Digital Gene Expression...
In edgeR: Empirical Analysis of Digital Gene Expression Data in R

Description Usage Arguments Details Value Author(s) References See Also Examples

Rotation gene set testing for Negative Binomial generalized linear models.

## S3 method for class 'DGEList'
fry(y, index = NULL, design = NULL, contrast = ncol(design), geneid = NULL,
      sort = "directional", ...)

## S3 method for class 'DGEList'
roast(y, index = NULL, design = NULL, contrast = ncol(design), geneid = NULL,
      set.statistic = "mean", gene.weights = NULL, nrot = 1999, ...)

## S3 method for class 'DGEList'
mroast(y, index = NULL, design = NULL, contrast = ncol(design), geneid = NULL,
       set.statistic = "mean", gene.weights = NULL, nrot = 1999,
       adjust.method = "BH", midp = TRUE, sort = "directional", ...)

`y`	`DGEList` object.
`index`	index vector specifying which rows (probes) of `y` are in the test set. Can be a vector of integer indices, or a logical vector of length `nrow(y)`, or a vector of gene IDs corresponding to entries in `geneid`. Alternatively it can be a data.frame with the first column containing the index vector and the second column containing directional gene weights. For `mroast` or `fry`, `index` is a list of index vectors or a list of data.frames.
`design`	the design matrix. Defaults to `y$design` or, failing that, to `model.matrix(~y$samples$group)`.
`contrast`	contrast for which the test is required. Can be an integer specifying a column of `design`, or the name of a column of `design`, or a numeric contrast vector of length equal to the number of columns of `design`.
`geneid`	gene identifiers corresponding to the rows of `y`. Can be either a vector of length `nrow(y)` or the name of the column of `y$genes` containing the gene identifiers. Defaults to `rownames(y)`.
`set.statistic`	summary set statistic. Possibilities are `"mean"`,`"floormean"`,`"mean50"` or `"msq"`.
`gene.weights`	numeric vector of directional (positive or negative) genewise weights. For `mroast` or `fry`, this vector must have length equal to `nrow(y)`. For `roast`, can be of length `nrow(y)` or of length equal to the number of genes in the test set.
`nrot`	number of rotations used to compute the p-values.
`adjust.method`	method used to adjust the p-values for multiple testing. See `p.adjust` for possible values.
`midp`	logical, should mid-p-values be used in instead of ordinary p-values when adjusting for multiple testing?
`sort`	character, whether to sort output table by directional p-value (`"directional"`), non-directional p-value (`"mixed"`), or not at all (`"none"`).
`...`	other arguments are currently ignored.

These functions perform self-contained gene set tests against the null hypothesis that none of the genes in the set are differentially expressed. fry is the recommended function in the edgeR context.

The roast gene set test was proposed by Wu et al (2010) for microarray data and the roast and mroast methods documented here extend the test to digital gene expression data. The roast method uses residual space rotations instead of permutations to obtain p-values, a technique that take advantage of the full generality of linear models. The negative binomial count data is converted to approximate normal deviates by computing mid-p quantile residuals (Dunn and Smyth, 1996; Routledge, 1994) under the null hypothesis that the contrast is zero, and the normal deviates are then passed to the limma roast function. See roast for more description of the test and for a complete list of possible arguments. mroast is similar but performs roast tests for multiple of gene sets instead of just one.

The fry method documented here similarly generalizes the fry gene set test for microarray data. fry is recommended over roast or mroast for count data because, in this context, it is equivalent to mroast but with an infinite number of rotations.

roast produces an object of class Roast. See roast for details.

mroast and fry produce a data.frame. See mroast for details.

Yunshun Chen and Gordon Smyth

Dunn, PK, and Smyth, GK (1996). Randomized quantile residuals. J. Comput. Graph. Statist., 5, 236-244. http://www.statsci.org/smyth/pubs/residual.html

Routledge, RD (1994). Practicing safe statistics with the mid-p. Canadian Journal of Statistics 22, 103-110.

Wu, D, Lim, E, Francois Vaillant, F, Asselin-Labat, M-L, Visvader, JE, and Smyth, GK (2010). ROAST: rotation gene set tests for complex microarray experiments. Bioinformatics 26, 2176-2182. http://bioinformatics.oxfordjournals.org/content/26/17/2176

roast, camera.DGEList

mu <- matrix(10, 100, 4)
group <- factor(c(0,0,1,1))
design <- model.matrix(~group)

# First set of 10 genes that are genuinely differentially expressed
iset1 <- 1:10
mu[iset1,3:4] <- mu[iset1,3:4]+10

# Second set of 10 genes are not DE
iset2 <- 11:20

# Generate counts and create a DGEList object
y <- matrix(rnbinom(100*4, mu=mu, size=10),100,4)
y <- DGEList(counts=y, group=group)

# Estimate dispersions
y <- estimateDisp(y, design)

roast(y, iset1, design, contrast=2)
mroast(y, iset1, design, contrast=2)
mroast(y, list(set1=iset1, set2=iset2), design, contrast=2)

Loading required package: limma
         Active.Prop   P.Value
Down             0.0 0.9904952
Up               0.3 0.0100050
UpOrDown         0.3 0.0200000
Mixed            0.3 0.0360000
     NGenes PropDown PropUp Direction PValue   FDR PValue.Mixed FDR.Mixed
set1     10        0    0.3        Up  0.021 0.021        0.042     0.042
     NGenes PropDown PropUp Direction PValue   FDR PValue.Mixed FDR.Mixed
set1     10      0.0    0.3        Up  0.025 0.049        0.028     0.028
set2     10      0.2    0.1      Down  0.642 0.642        0.009     0.017