View source: R/phyloseq_average.R
phyloseq_average | R Documentation |
This function implements OTU abundance averaging following a CoDa (Compositional Data Analysis) workflow.
phyloseq_average(
physeq,
avg_type = "aldex",
acomp_zero_impute = NULL,
aldex_samples = 128,
aldex_denom = "all",
group = NULL,
drop_group_zero = FALSE,
verbose = TRUE,
progress = NULL,
...
)
physeq |
A phyloseq-class object |
avg_type |
Averaging type ("aldex" for ALDEx2-based averaging, "acomp" for the Aitchison CoDa approach, "arithmetic" for a simple arithmetic mean) |
acomp_zero_impute |
Character ("CZM", "GBM", "SQ", "BL") or NULL; indicating whether to replace zero abundance values with an estimate of the probability that the zero is not 0 (implemented only for avg_type = "acomp"; see |
aldex_samples |
The number of Monte-Carlo Dirichlet instances to generate (see |
aldex_denom |
Character ("all", "iqlr", "lvha"), indicating which features to use as the denominator for the geometric mean calculation (see |
group |
Variable name in |
drop_group_zero |
Logical; indicating whether OTUs with zero abundance within a group of samples should be removed |
verbose |
Logical; if TRUE (default), informational messages will be shown on screen |
progress |
Name of the progress bar to use ("none" or "text"; see |
... |
Additional arguments may be passed to |
Typical OTU abundance tables in metagenomic analysis have a different sampling effort for different samples (an artifact of the sequencing procedure). The total number of reads is therefore meaningless, and distances between OTU compositions are on a relative scale (e.g., OTUs with 1 and 2 reads in one sample are as far apart as OTUs with 10 and 20 reads in another sample). Such OTU tables represent closed compositions and require special treatment within the Aitchison geometry framework.
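The relative-scale argument above can be checked with a few lines of base R. This is only an illustrative sketch: the counts and the helper names (closure, clr) are hypothetical and are not part of phyloseq_average.

```r
# Two hypothetical samples with the same composition but a tenfold
# difference in sequencing depth
shallow <- c(otu1 = 1,  otu2 = 2,  otu3 = 7)
deep    <- c(otu1 = 10, otu2 = 20, otu3 = 70)

# Closure: rescale counts so that each sample sums to 1
closure <- function(x) x / sum(x)

# Centered log-ratio (clr) transformation; requires strictly positive values
clr <- function(x) log(x) - mean(log(x))

# Euclidean distance on raw counts is dominated by sequencing depth
raw_dist <- sqrt(sum((shallow - deep)^2))

# Aitchison distance: Euclidean distance between clr-transformed samples
aitchison_dist <- sqrt(sum((clr(shallow) - clr(deep))^2))

print(raw_dist)        # large, reflects depth only
print(aitchison_dist)  # numerically zero: the compositions are identical
```

On the raw scale the two samples look very different, while in Aitchison geometry they coincide, which is why the read totals carry no information here.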
With the ALDEx2-based approach (avg_type = "aldex") it is possible to take into account per-OTU technical variation within each sample using Monte-Carlo instances drawn from the Dirichlet distribution (see Fernandes et al., 2013). As a result, the expected average of the OTU proportions is estimated.
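The Monte-Carlo idea can be sketched in base R without ALDEx2. The following is a simplified illustration, not the ALDEx2 implementation: the counts, the prior of 0.5 added to every count, and the helper rdirichlet_counts are assumptions made for this example.

```r
set.seed(1)

# Hypothetical read counts for one sample
counts <- c(otu1 = 0, otu2 = 5, otu3 = 45, otu4 = 950)

n_instances <- 128   # mirrors the aldex_samples default
prior <- 0.5         # assumed prior added to each count

# Draw Dirichlet instances by normalizing independent gamma variates
rdirichlet_counts <- function(n, alpha) {
  g <- matrix(rgamma(n * length(alpha), shape = rep(alpha, each = n)), nrow = n)
  colnames(g) <- names(alpha)
  g / rowSums(g)
}

# Each row is one plausible set of OTU proportions for the sample
instances <- rdirichlet_counts(n_instances, counts + prior)

# Averaging over instances approximates the expected proportions
expected_prop <- colMeans(instances)
print(round(expected_prop, 4))
```

Note that the zero-count OTU receives a small positive expected proportion, so no separate zero replacement is needed in this sketch.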
Zero OTU abundance may be due to an insufficient number of reads. However,
it is possible to replace the zero counts with an expected value.
Bayesian-multiplicative (BM) replacement of count zeros is implemented in
the cmultRepl function of the zCompositions package.
Several methods are supported: geometric Bayesian multiplicative (acomp_zero_impute = "GBM"),
count zero multiplicative (acomp_zero_impute = "CZM"), Bayes-Laplace BM (acomp_zero_impute = "BL"),
or square root BM (acomp_zero_impute = "SQ").
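To give a feel for what a Bayesian-multiplicative replacement does, here is a minimal base-R sketch of the Bayes-Laplace variant for a single sample. It is a simplified illustration and not the zCompositions code: it assumes a uniform prior (1/D for each of the D OTUs) with prior strength D, and the function name bl_replace and the counts are hypothetical.

```r
# Simplified Bayes-Laplace multiplicative replacement for one sample
# (assumption: prior strength s = D and prior estimate t = 1/D)
bl_replace <- function(counts) {
  n <- sum(counts)        # total reads in the sample
  D <- length(counts)     # number of OTUs
  props <- counts / n
  is_zero <- counts == 0

  # Posterior estimate for a zero count: s * t / (n + s) = 1 / (n + D)
  repl <- 1 / (n + D)

  # Shrink the non-zero proportions so that the result still sums to 1
  out <- props * (1 - repl * sum(is_zero))
  out[is_zero] <- repl
  out
}

counts  <- c(otu1 = 0, otu2 = 3, otu3 = 0, otu4 = 97)
imputed <- bl_replace(counts)
print(round(imputed, 4))
```

The multiplicative adjustment keeps the ratios between the non-zero OTUs unchanged, which is the property that makes this family of methods suitable for log-ratio analysis.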
In the case of structural zeros in the OTU abundance table (e.g., when the absence
of an OTU within a group reflects some biological pattern and is not caused by
a detection limit), the drop_group_zero argument may be set to TRUE
to avoid zero replacement.
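One plausible reading of this option can be sketched in base R: within each group, OTUs whose total abundance is zero are dropped before any zero replacement. The matrix, the group labels, and the filtering logic below are assumptions for illustration and may differ from what phyloseq_average does internally.

```r
# Hypothetical OTU table (OTUs in rows, samples in columns)
otu <- matrix(
  c(0, 0, 5, 3,
    4, 6, 0, 0,
    2, 1, 7, 9),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("otu", 1:3), paste0("s", 1:4))
)

# Hypothetical grouping of samples
group <- c(s1 = "A", s2 = "A", s3 = "B", s4 = "B")

# For each group, keep only OTUs observed at least once in that group
by_group <- lapply(split(colnames(otu), group), function(smp) {
  sub <- otu[, smp, drop = FALSE]
  sub[rowSums(sub) > 0, , drop = FALSE]
})

print(by_group)
```

Here otu1 is treated as structurally absent from group A and otu2 from group B, so neither would be given an imputed value in the corresponding group.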
A phyloseq object with OTU relative abundances averaged over samples (all together or within a group).
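The difference between a simple arithmetic average and a compositional average can be illustrated in base R. This sketch uses the closed geometric mean, the usual center of a set of compositions in Aitchison geometry; the proportions are hypothetical, and the code is not taken from the function itself.

```r
# Hypothetical relative abundances (samples in rows, OTUs in columns)
props <- rbind(
  s1 = c(0.1, 0.2, 0.7),
  s2 = c(0.3, 0.3, 0.4),
  s3 = c(0.2, 0.1, 0.7)
)
colnames(props) <- paste0("otu", 1:3)

# Arithmetic mean of the proportions
arith_mean <- colMeans(props)

# Compositional center: per-OTU geometric mean, re-closed to sum to 1
geo_mean  <- exp(colMeans(log(props)))
comp_mean <- geo_mean / sum(geo_mean)

print(round(rbind(arithmetic = arith_mean, compositional = comp_mean), 4))
```

Both averages sum to 1, but they generally differ; the compositional center requires strictly positive values, which is why zero handling matters for the CoDa approach.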
Gloor GB, Macklaim JM, Pawlowsky-Glahn V and Egozcue JJ (2017) Microbiome Datasets Are Compositional: And This Is Not Optional. Front. Microbiol. 8:2224. doi: 10.3389/fmicb.2017.02224
Martin-Fernandez JA, Barcelo-Vidal C, Pawlowsky-Glahn V (2003) Dealing With Zeros and Missing Values in Compositional Data Sets Using Nonparametric Imputation. Mathematical Geology 35:3. doi: 10.1023/A:1023866030544
Fernandes AD, Macklaim JM, Linn TG, Reid G, Gloor GB (2013) ANOVA-Like Differential Expression (ALDEx) Analysis for Mixed Population RNA-Seq. PLOS ONE 8(7): e67019. doi: 10.1371/journal.pone.0067019
aldex.clr, acomp, cmultRepl