runEdgeR: Test for differential abundance: method 'treeAGG-DA-edgeR'


Description

Test differential abundance of entities using functions from the edgeR package (Robinson et al. 2010, Bioinformatics; McCarthy et al. 2012, Nucleic Acids Research) to fit models and compute a moderated test for each entity. Dispersions are estimated with estimateDisp. The statistical methods implemented in the edgeR package were originally designed for the analysis of gene expression data such as RNA-sequencing counts. Here, we apply these methods to counts that might come from microbes or cells.
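The edgeR steps named above can be sketched on a plain count matrix. This is only an illustrative sequence of standard edgeR calls under assumed toy data; the object names (counts, group, res) and the data are hypothetical, and the sketch is not the internal implementation of runEdgeR.

```r
library(edgeR)

set.seed(1)
# Hypothetical toy data: 50 entities (rows) by 12 samples (columns).
counts <- matrix(rnbinom(600, size = 1, mu = 10), nrow = 50)
rownames(counts) <- paste0("entity", 1:50)
group <- factor(rep(c("A", "B", "C"), each = 4))
design <- model.matrix(~ group)

y <- DGEList(counts = counts)
y <- calcNormFactors(y, method = "TMM")        # sample normalization factors
y <- estimateDisp(y, design)                   # dispersion estimation
fit <- glmFit(y, design, prior.count = 0.125)  # negative binomial GLM per entity
lrt <- glmLRT(fit, contrast = c(0, -1, 1))     # likelihood ratio test: C versus B

# One row per entity; p-values adjusted with the Benjamini-Hochberg method.
res <- topTags(lrt, n = Inf, adjust.method = "BH", sort.by = "PValue")$table
head(res)
```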

Usage

runEdgeR(obj, design = NULL, contrast = NULL, normalize = TRUE,
  method = "TMM", adjust.method = "BH", prior.count = 0.125,
  use.assays = NULL)

Arguments

obj

A treeSummarizedExperiment object.

design

A numeric matrix of full column rank. By default, all columns of colData are used to create the design matrix. Note: users should check that the default design matrix is exactly what they want, or create their own design matrix using model.matrix.
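As a minimal sketch of building and checking a custom design matrix with base R, assuming a hypothetical sample annotation with a single factor named group:

```r
# Hypothetical sample annotation: 3 groups with 4 samples each.
samples <- data.frame(group = factor(rep(c("A", "B", "C"), each = 4)))

# Build the design matrix explicitly instead of relying on the default.
design <- model.matrix(~ group, data = samples)

# Full column rank means the matrix rank equals the number of columns.
stopifnot(qr(design)$rank == ncol(design))

colnames(design)
```

Inspecting colnames(design) before running the analysis shows which coefficient each column represents, which matters when a contrast is specified.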

contrast

A numeric vector specifying one contrast of the linear model coefficients to be tested equal to zero. Its length must equal the number of columns of design. If NULL, the last coefficient is tested equal to zero.
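A contrast vector has one entry per column of the design matrix. The sketch below assumes a hypothetical three-group design; the vector default_contrast only illustrates what testing the last coefficient corresponds to and is not taken from the runEdgeR source.

```r
# Hypothetical design with an intercept and two group coefficients.
group <- factor(rep(c("A", "B", "C"), each = 4))
design <- model.matrix(~ group)

# Compare group C against group B: coefficient groupC minus coefficient groupB.
contrast <- c(0, -1, 1)
names(contrast) <- colnames(design)
stopifnot(length(contrast) == ncol(design))

# Testing only the last coefficient corresponds to this contrast vector.
default_contrast <- as.numeric(seq_len(ncol(design)) == ncol(design))
```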

normalize

A logical value indicating whether the samples are normalized. The default is TRUE.

method

Normalization method to be used. See calcNormFactors for more details.

adjust.method

A character string stating the method used to adjust p-values for multiple testing, passed on to p.adjust. It could be "bonferroni", "holm", "hochberg", "hommel", "BH", or "BY".
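The effect of adjust.method can be seen by calling p.adjust directly on a small vector of hypothetical p-values:

```r
# Hypothetical raw p-values for five entities.
p <- c(0.001, 0.01, 0.02, 0.04, 0.2)

# Benjamini-Hochberg controls the false discovery rate.
bh <- p.adjust(p, method = "BH")

# Bonferroni controls the family-wise error rate and is more conservative.
bonf <- p.adjust(p, method = "bonferroni")

cbind(raw = p, BH = bh, bonferroni = bonf)
```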

prior.count

Average prior count to be added to the observations to shrink the estimated log-fold-changes towards zero. See prior.count in glmFit.

use.assays

A numeric vector specifying which matrix-like elements of assays are used in the analysis.

Details

The experimental design must be specified using a design matrix. A customized design matrix can be supplied via design.

Samples are normalized automatically using the edgeR package; see calcNormFactors for details of how the normalization factors are calculated. A sample may include entities that correspond to both leaf nodes and internal nodes of the tree. Only entities corresponding to leaf nodes are used to calculate the library size of each sample, because the abundance of an entity at an internal node is the sum of the abundances of its descendant leaf nodes.
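The reason for restricting the library size to leaf nodes can be illustrated with a small base R sketch. The tree shape, node names, and counts here are hypothetical: node4 is assumed to be the parent of leaf1 and leaf2, and root the ancestor of all three leaves.

```r
# Hypothetical counts for 3 leaves in 2 samples.
leaf_counts <- matrix(c(5, 3, 2,
                        10, 0, 4),
                      nrow = 3,
                      dimnames = list(c("leaf1", "leaf2", "leaf3"),
                                      c("s1", "s2")))

# Internal nodes carry the sum of their descendant leaves.
all_counts <- rbind(leaf_counts,
                    node4 = leaf_counts["leaf1", ] + leaf_counts["leaf2", ],
                    root  = colSums(leaf_counts))
is_leaf <- c(TRUE, TRUE, TRUE, FALSE, FALSE)

# Library size from leaves only: each count is included exactly once.
lib_size <- colSums(all_counts[is_leaf, , drop = FALSE])

# Summing all rows would count every leaf again at each of its ancestors.
naive_size <- colSums(all_counts)
```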

Value

A treeSummarizedExperiment object with the following components:

assays

A list of tables

rowData

Stores the information about the rows of assays, together with the tables extracted from the DGELRT object generated by glmLRT. The latter are stored in the internal part of rowData. More details and an example can be found in the vignette "Example of data analysis".

colData

NULL

metadata
  • use.assays: the elements of assays that were used to run the differential abundance analysis.

  • design: the design matrix, as input.

  • contrast: the contrast vector, as input.

  • output_glmFit: the output from glmFit, an object of class DGEGLM.

Examples

library(S4Vectors)
library(treeAGG)  # provides tinyTree and the functions used below
set.seed(1)
y <- matrix(rnbinom(300,size=1,mu=10),nrow=10)
colnames(y) <- paste(rep(LETTERS[1:3], each = 10), rep(1:10,3), sep = "_")
rownames(y) <- tinyTree$tip.label

rowInf <- DataFrame(nodeLab = rownames(y),
                    var1 = sample(letters[1:3], 10, replace = TRUE),
                    var2 = sample(c(TRUE, FALSE), 10, replace = TRUE))
colInf <- DataFrame(gg = factor(sample(1:3, 30, replace = TRUE)),
                    group = rep(LETTERS[1:3], each = 10))
toy_lse <- leafSummarizedExperiment(tree = tinyTree, rowData = rowInf,
                                    colData = colInf,
                                    assays = list(y, (2*y), 3*y))

toy_tse <- nodeValue(data = toy_lse, fun = sum, tree = tinyTree,
                     message = TRUE)

# build the model
contrastList <- list(contrast1 = c(0, 0, 0, -1, 1),
                     contrast2 = c(0, -1, 1, 0, 0))
mod <- runEdgeR(obj = toy_tse, contrast = contrastList)
# show results obtained from the second element of the assays
# sort by PValue
topNodes(mod, sort.by = "PValue", use.assays = 2)

markrobinsonuzh/treeAGG documentation built on May 26, 2019, 9:32 a.m.