ancom | R Documentation |
Determine taxa whose absolute abundances, per unit volume, of
the ecosystem (e.g. gut) are significantly different with changes in the
covariate of interest (e.g. group). The current version of the ancom
function implements ANCOM for cross-sectional and repeated-measurements
data while allowing for covariate adjustment.
ancom(
  data = NULL,
  taxa_are_rows = TRUE,
  assay.type = NULL,
  assay_name = "counts",
  rank = NULL,
  tax_level = NULL,
  aggregate_data = NULL,
  meta_data = NULL,
  p_adj_method = "holm",
  prv_cut = 0.1,
  lib_cut = 0,
  main_var,
  adj_formula = NULL,
  rand_formula = NULL,
  lme_control = lme4::lmerControl(),
  struc_zero = FALSE,
  neg_lb = FALSE,
  alpha = 0.05,
  n_cl = 1,
  verbose = TRUE
)
data |
the input data. The |
taxa_are_rows |
logical. Whether taxa are positioned in the rows of the feature table. Default is TRUE. It is recommended to use low taxonomic levels, such as OTU or species level, as the estimation of sampling fractions requires a large number of taxa. |
assay.type |
alias for |
assay_name |
character. Name of the count table in the data object
(only applicable if data object is a |
rank |
alias for |
tax_level |
character. The taxonomic or non-taxonomic (rowData) level of interest. The input data
can be analyzed at any taxonomic or rowData level without prior agglomeration.
Note that |
aggregate_data |
The abundance data that has been aggregated to the desired
taxonomic level. This parameter is required only when the input data is in
|
meta_data |
a |
p_adj_method |
character. method to adjust p-values. Default is "holm".
Options include "holm", "hochberg", "hommel", "bonferroni", "BH", "BY",
"fdr", "none". See |
prv_cut |
a numerical fraction between 0 and 1. Taxa with prevalences
(the proportion of samples in which the taxon is present)
less than |
lib_cut |
a numerical threshold for filtering samples based on library
sizes. Samples with library sizes less than |
main_var |
character. The name of the main variable of interest. |
adj_formula |
character string representing the formula for
covariate adjustment. Please note that you should NOT include the
|
rand_formula |
the character string expresses how the microbial absolute
abundances for each taxon depend on the random effects in metadata. ANCOM
follows the |
lme_control |
a list of control parameters for mixed model fitting.
See |
struc_zero |
logical. whether to detect structural zeros based on
|
neg_lb |
logical. whether to classify a taxon as a structural zero using its asymptotic lower bound. Default is FALSE. |
alpha |
numeric. level of significance. Default is 0.05. |
n_cl |
numeric. The number of nodes to be forked. For details, see
|
verbose |
logical. Whether to display detailed progress messages. |
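The prv_cut and lib_cut arguments describe two filters applied to the feature table. The base R sketch below illustrates the idea on a toy matrix. It assumes that samples below lib_cut are dropped first and that taxa below prv_cut are dropped afterwards; the order, the helper name filter_counts, and the toy data are illustrative assumptions, not the package's internal code.

```r
# Toy feature table: taxa in rows, samples in columns (taxa_are_rows = TRUE).
counts <- matrix(
  c(10, 0, 5, 0,
     0, 0, 0, 3,
    20, 1, 7, 9),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("t1", "t2", "t3"), c("s1", "s2", "s3", "s4"))
)

# Hypothetical helper (not part of ANCOMBC) mimicking the two cutoffs.
filter_counts <- function(counts, prv_cut = 0.1, lib_cut = 0) {
  # Library size = total count per sample; drop samples below lib_cut.
  lib_size <- colSums(counts)
  counts <- counts[, lib_size >= lib_cut, drop = FALSE]
  # Prevalence = proportion of remaining samples in which the taxon is present.
  prevalence <- rowMeans(counts > 0)
  counts[prevalence >= prv_cut, , drop = FALSE]
}

filtered <- filter_counts(counts, prv_cut = 0.5, lib_cut = 5)
print(filtered)
```

With these toy values, sample s2 (library size 1) and taxon t2 (present in one of the three remaining samples) are removed.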
A taxon is considered to have structural zeros in some (>=1)
groups if it is completely (or nearly completely) missing in these groups.
For instance, suppose there are three groups: g1, g2, and g3.
If the counts of taxon A in g1 are 0 but nonzero in g2 and g3,
then taxon A will be considered to contain structural zeros in g1.
In this example, taxon A is declared to be differentially abundant between
g1 and g2, g1 and g3, and consequently, it is globally differentially
abundant with respect to this group variable.
Such taxa are not further analyzed using ANCOM, but the results are
summarized in the overall summary. For more details on structural
zeros, see the ANCOM-II paper.
Setting neg_lb = TRUE indicates that you are using both criteria
stated in section 3.2 of ANCOM-II to detect structural zeros; otherwise,
the algorithm will only use equation 1 in section 3.2 for declaring
structural zeros. Generally, it is recommended to set neg_lb = TRUE
when the sample size per group is relatively large (e.g. > 30).
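The two criteria can be illustrated for a single taxon with a short base R sketch. It assumes that the first criterion flags a group when the observed proportion of nonzero counts is exactly 0, and that the second (used when neg_lb = TRUE) flags a group when a normal-approximation lower bound on that proportion, p_hat - 1.96 * sqrt(p_hat * (1 - p_hat) / n), is at or below 0. The helper name detect_struc_zero, the 1.96 multiplier, and the toy data are assumptions for illustration; consult the ANCOM-II paper for the exact definitions.

```r
# Hypothetical helper (not part of ANCOMBC): flag groups in which one taxon
# looks like a structural zero.
detect_struc_zero <- function(counts, group, neg_lb = FALSE) {
  present <- counts != 0
  p_hat <- tapply(present, group, mean)    # proportion of nonzero counts per group
  n <- tapply(present, group, length)      # sample size per group
  zero <- p_hat == 0                       # criterion 1: taxon absent from the group
  if (neg_lb) {
    # Criterion 2 (assumed form): asymptotic lower bound at or below 0.
    lower <- p_hat - 1.96 * sqrt(p_hat * (1 - p_hat) / n)
    zero <- zero | (lower <= 0)
  }
  setNames(as.vector(zero), names(p_hat))
}

group <- rep(c("g1", "g2", "g3"), each = 5)
taxon_a <- c(0, 0, 0, 0, 0,   # g1: never observed
             0, 0, 0, 0, 4,   # g2: observed in 1 of 5 samples
             3, 8, 2, 6, 1)   # g3: always observed

print(detect_struc_zero(taxon_a, group, neg_lb = FALSE))
print(detect_struc_zero(taxon_a, group, neg_lb = TRUE))
```

In this toy case only g1 is flagged with neg_lb = FALSE, while g2 is flagged as well with neg_lb = TRUE. With only five samples per group the lower bound is wide, which is consistent with the advice to use neg_lb = TRUE only for larger groups.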
a list with components:
res, a data.frame containing the ANCOM result for the variable specified in main_var. Its columns are:
  W, test statistics.
  detected_0.9, detected_0.8, detected_0.7, detected_0.6, logical vectors representing whether a taxon is differentially abundant under a series of cutoffs. For example, TRUE in detected_0.7 means the number of ALR transformed models where the taxon is differentially abundant with regard to the main variable outnumbers 0.7 * (n_tax - 1). detected_0.7 is commonly used. Choose detected_0.8 or detected_0.9 for more conservative results, or choose detected_0.6 for more liberal results.
zero_ind, a logical data.frame with TRUE indicating the taxon is detected to contain structural zeros in some specific groups.
beta_data, a numeric matrix containing pairwise coefficients for the main variable of interest in ALR transformed regression models.
p_data, a numeric matrix containing pairwise p-values for the main variable of interest in ALR transformed regression models.
q_data, a numeric matrix containing adjusted p-values obtained by applying the p_adj_method to the p_data matrix.
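The relationship between p_data, q_data, W, and the detected_* columns can be sketched in base R. The sketch assumes that p-values are adjusted column by column with p.adjust, that the diagonal of p_data is 1, and that W counts, for each taxon, the adjusted p-values at or below alpha. These details and the toy p-values are assumptions for illustration; only the cutoff rule W > cutoff * (n_tax - 1) comes from the description above.

```r
taxa <- c("A", "B", "C", "D")
n_tax <- length(taxa)
alpha <- 0.05

# Toy symmetric matrix of pairwise p-values; the diagonal is set to 1 (assumed).
p_data <- matrix(1, n_tax, n_tax, dimnames = list(taxa, taxa))
p_data["A", "B"] <- p_data["B", "A"] <- 0.001
p_data["A", "C"] <- p_data["C", "A"] <- 0.002
p_data["A", "D"] <- p_data["D", "A"] <- 0.003
p_data["B", "C"] <- p_data["C", "B"] <- 0.5
p_data["B", "D"] <- p_data["D", "B"] <- 0.6
p_data["C", "D"] <- p_data["D", "C"] <- 0.7

# Adjust each column with the chosen method (here "holm", the default).
q_data <- apply(p_data, 2, p.adjust, method = "holm")

# W: number of pairwise models in which the taxon is significant.
W <- colSums(q_data <= alpha)

res <- data.frame(
  taxon = taxa,
  W = W,
  detected_0.9 = W > 0.9 * (n_tax - 1),
  detected_0.7 = W > 0.7 * (n_tax - 1),
  row.names = NULL
)
print(res)
```

Here taxon A is significant against all three other taxa, so it passes both cutoffs, while B, C, and D are each significant in only one comparison and pass neither.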
Huang Lin
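Mandal S, Van Treuren W, White RA, Eggesbø M, Knight R, Peddada SD (2015). "Analysis of composition of microbiomes: a novel method for studying microbial composition." Microbial Ecology in Health and Disease.
Kaul A, Mandal S, Davidov O, Peddada SD (2017). "Analysis of microbiome data in the presence of excess zeros." Frontiers in Microbiology.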
ancombc
ancombc2
library(ANCOMBC)
if (requireNamespace("microbiome", quietly = TRUE)) {
    data(atlas1006, package = "microbiome")
    # subset to baseline
    pseq = phyloseq::subset_samples(atlas1006, time == 0)

    # run ancom function
    set.seed(123)
    out = ancom(data = pseq, tax_level = "Family",
                p_adj_method = "holm", prv_cut = 0.10, lib_cut = 1000,
                main_var = "bmi_group", adj_formula = "age + nationality",
                rand_formula = NULL, lme_control = NULL,
                struc_zero = TRUE, neg_lb = TRUE, alpha = 0.05, n_cl = 1)

    # primary result table (W statistics and detected_* columns)
    res = out$res
} else {
    message("The 'microbiome' package is not installed. Please install it to use this example.")
}