ancombc: Differential abundance (DA) analysis for microbial absolute...

View source: R/ancombc.R

Description

Determine taxa whose absolute abundances, per unit volume of the ecosystem (e.g. gut), differ significantly with changes in the covariate of interest (e.g. the group effect). The current version of the ancombc function implements Analysis of Compositions of Microbiomes with Bias Correction (ANCOM-BC) for cross-sectional data and allows adjustment for covariates.

Usage

ancombc(
  phyloseq,
  formula,
  p_adj_method = "holm",
  zero_cut = 0.9,
  lib_cut = 0,
  group = NULL,
  struc_zero = FALSE,
  neg_lb = FALSE,
  tol = 1e-05,
  max_iter = 100,
  conserve = FALSE,
  alpha = 0.05,
  global = FALSE
)

Arguments

phyloseq

a phyloseq-class object, which consists of a feature table (microbial observed abundance table), sample metadata, a taxonomy table (optional), and a phylogenetic tree (optional). The row names of the metadata must match the sample names of the feature table, and the row names of the taxonomy table must match the taxon (feature) names of the feature table. See phyloseq for more details.

formula

a character string expressing how the microbial absolute abundances for each taxon depend on the variables in the metadata.
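
As a rough illustration of what such a string looks like, the sketch below uses two hypothetical metadata columns, age and group (neither comes from this page), and expands the string with base R's model.matrix. This only shows how R interprets the right-hand side of a formula; it is not a description of what ancombc does internally.

```r
# Hypothetical sample metadata; the column names are assumptions for illustration.
meta <- data.frame(
  age = c(34, 51, 28, 45),
  group = factor(c("A", "B", "A", "B")),
  row.names = paste0("sample", 1:4)
)

# The formula argument is a plain character string holding the right-hand side only.
formula_str <- "age + group"

# Expand it into a design matrix to see which terms the string implies.
design <- model.matrix(as.formula(paste("~", formula_str)), data = meta)
colnames(design)
```

A single covariate is written the same way, as in formula = "SampleType" in the Examples.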

p_adj_method

method to adjust p-values by. Default is "holm". Options include "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none". See p.adjust for more details.
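
To give a feel for how the choice of method changes the adjusted values, here is a small sketch that calls base R's p.adjust directly on made-up p-values. The numbers are illustrative only and are not output from ancombc.

```r
# Hypothetical raw p-values for five taxa (illustrative only).
p_raw <- c(0.001, 0.01, 0.02, 0.04, 0.30)

# "holm" controls the family-wise error rate; "BH" controls the false discovery rate.
p_holm <- p.adjust(p_raw, method = "holm")
p_bh <- p.adjust(p_raw, method = "BH")

# Side-by-side comparison of raw and adjusted values.
data.frame(p_raw, p_holm, p_bh)
```

In p.adjust, "fdr" is an alias for "BH", so the two options give identical results.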

zero_cut

a numerical fraction between 0 and 1. Taxa whose proportion of zeros is greater than zero_cut are excluded from the analysis. Default is 0.90.
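
The filter described above can be sketched in base R on a toy count table. The matrix and the variable names (counts, zero_prop, kept) are made up for illustration, and the code mirrors the description rather than the package's internal implementation.

```r
# Toy count table: rows are taxa, columns are samples (made-up values).
counts <- matrix(
  c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0,   # taxon1: all zeros
    5, 0, 0, 0, 0, 0, 0, 0, 0, 2,   # taxon2: 80% zeros
    3, 8, 0, 2, 0, 7, 1, 0, 4, 6),  # taxon3: 30% zeros
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("taxon", 1:3), paste0("sample", 1:10))
)

zero_cut <- 0.90

# Proportion of zero counts per taxon.
zero_prop <- rowMeans(counts == 0)

# Keep taxa whose zero proportion does not exceed the cutoff.
kept <- counts[zero_prop <= zero_cut, , drop = FALSE]
rownames(kept)
```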

lib_cut

a numerical threshold for filtering samples based on library sizes. Samples with library sizes less than lib_cut are excluded from the analysis. Default is 0, i.e. no samples are filtered.
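
A minimal sketch of this sample filter, assuming the library size of a sample is the sum of its counts over all taxa. The toy matrix and the names lib_size and kept are hypothetical.

```r
# Toy count table: rows are taxa, columns are samples (made-up values).
counts <- matrix(
  c(500, 300, 100,
    800, 900,  50,
    200, 400,  20),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("taxon", 1:3), paste0("sample", 1:3))
)

lib_cut <- 1000

# Library size: total count per sample (an assumption about the definition).
lib_size <- colSums(counts)

# Drop samples whose library size is below the threshold.
kept <- counts[, lib_size >= lib_cut, drop = FALSE]
colnames(kept)
```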

group

the name of the group variable in the metadata. Specifying group is required for detecting structural zeros and for performing the global test.

struc_zero

whether to detect structural zeros. Default is FALSE.

neg_lb

whether to classify a taxon as a structural zero in the corresponding study group using its asymptotic lower bound. Default is FALSE.

tol

the iteration convergence tolerance for the E-M algorithm. Default is 1e-05.

max_iter

the maximum number of iterations for the E-M algorithm. Default is 100.

conserve

whether to use a conservative variance estimate of the test statistic. It is recommended if the sample size is small and/or the number of differentially abundant taxa is believed to be large. Default is FALSE.

alpha

level of significance. Default is 0.05.

global

whether to perform the global test. Default is FALSE.

Details

The definition of a structural zero can be found in ANCOM-II. Setting neg_lb = TRUE means that both criteria stated in section 3.2 of ANCOM-II are used to detect structural zeros; otherwise, the algorithm uses only equation 1 in section 3.2 to declare structural zeros. Generally, it is recommended to set neg_lb = TRUE when the sample size per group is relatively large (e.g. > 30).
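
The following sketch shows one reading of the two criteria, under stated assumptions: a taxon is flagged in a group when its observed presence proportion in that group is zero, and, with the second criterion enabled, also when a normal-approximation lower bound on that proportion is at or below zero. The 1.96 multiplier, the exact form of the bound, and all variable names are assumptions for illustration; section 3.2 of ANCOM-II is the authoritative statement.

```r
# Toy counts: rows are taxa, columns are samples; values and groups are made up.
counts <- matrix(
  c(0, 0, 0, 0, 4, 6, 3, 5,   # taxon1: absent from every group A sample
    0, 0, 0, 2, 7, 1, 9, 3,   # taxon2: present in 1 of 4 group A samples
    5, 2, 8, 1, 4, 6, 3, 5),  # taxon3: present in every sample
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("taxon", 1:3), paste0("sample", 1:8))
)
group <- rep(c("A", "B"), each = 4)

# Proportion of samples in each group in which the taxon is observed.
present <- counts > 0
p_hat <- t(apply(present, 1, function(x) tapply(x, group, mean)))

# Group sizes, aligned with the columns of p_hat.
n_grp <- as.vector(table(group)[colnames(p_hat)])

# Assumed form of the lower bound: p_hat - 1.96 * standard error.
se <- sqrt(sweep(p_hat * (1 - p_hat), 2, n_grp, "/"))
p_lo <- p_hat - 1.96 * se

# First criterion only (analogous to neg_lb = FALSE in this sketch).
struc_zero_eq1 <- p_hat == 0
# Both criteria (analogous to neg_lb = TRUE in this sketch).
struc_zero_both <- struc_zero_eq1 | (p_lo <= 0)

struc_zero_eq1
struc_zero_both
```

In this toy data, taxon2 is observed in one of four group A samples, yet the lower bound is negative, so it is flagged once the second criterion is applied. With only a few samples per group the bound is wide, which is consistent with the advice to reserve neg_lb = TRUE for larger groups.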

Value

a list with components:

Author(s)

Huang Lin

References

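Kaul A, Mandal S, Davidov O, Peddada SD (2017). Analysis of microbiome data in the presence of excess zeros. Frontiers in Microbiology, 8, 2114.

Lin H, Peddada SD (2020). Analysis of compositions of microbiomes with bias correction. Nature Communications, 11, 3514.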

Examples

#================Build a Phyloseq-Class Object from Scratch==================
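library(phyloseq)

# Fix the seed so the randomly generated toy data are reproducible
set.seed(123)

# Feature table: 10 taxa (rows) by 10 samples (columns) of random counts
otu_mat = matrix(sample(1:100, 100, replace = TRUE), nrow = 10, ncol = 10)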
rownames(otu_mat) = paste0("taxon", 1:nrow(otu_mat))
colnames(otu_mat) = paste0("sample", 1:ncol(otu_mat))


meta = data.frame(group = sample(LETTERS[1:4], size = 10, replace = TRUE),
                  row.names = paste0("sample", 1:ncol(otu_mat)),
                  stringsAsFactors = FALSE)

tax_mat = matrix(sample(letters, 70, replace = TRUE),
                 nrow = nrow(otu_mat), ncol = 7)
rownames(tax_mat) = rownames(otu_mat)
colnames(tax_mat) = c("Kingdom", "Phylum", "Class", "Order",
                      "Family", "Genus", "Species")

OTU = otu_table(otu_mat, taxa_are_rows = TRUE)
META = sample_data(meta)
TAX = tax_table(tax_mat)
physeq = phyloseq(OTU, META, TAX)

#========================Run ANCOMBC Using Real Data=========================

library(ANCOMBC)
library(phyloseq)
library(microbiome)
library(tidyverse)
data(GlobalPatterns)

# Aggregate to phylum level
phylum_data = aggregate_taxa(GlobalPatterns, "Phylum")
# The taxonomy table
tax_mat = as(tax_table(phylum_data), "matrix")

# Run ancombc function
out = ancombc(phyloseq = phylum_data, formula = "SampleType",
              p_adj_method = "holm", zero_cut = 0.90, lib_cut = 1000,
              group = "SampleType", struc_zero = TRUE, neg_lb = FALSE,
              tol = 1e-5, max_iter = 100, conserve = TRUE,
              alpha = 0.05, global = TRUE)

res = out$res
res_global = out$res_global

ANCOMBC documentation built on March 11, 2021, 2 a.m.