run_test_multiple_groups: Statistical test for multiple groups
In HuaZou/MicrobiomeAnalysis: Data analysis toolkits in metagenomics

run_test_multiple_groups

R Documentation

Statistical test for multiple groups

Description

Differential expression analysis for multiple groups.

Usage

run_test_multiple_groups(
  ps,
  group,
  taxa_rank = "all",
  transform = c("identity", "log10", "log10p", "SquareRoot", "CubicRoot", "logit"),
  norm = "TSS",
  norm_para = list(),
  method = c("anova", "kruskal"),
  p_adjust = c("none", "fdr", "bonferroni", "holm", "hochberg", "hommel", "BH", "BY"),
  pvalue_cutoff = 0.05,
  effect_size_cutoff = NULL
)

Arguments

`ps`	a `phyloseq::phyloseq` object
`group`	character, the variable to set the group
`taxa_rank`	character to specify taxonomic rank to perform differential analysis on. Should be one of `phyloseq::rank_names(phyloseq)`, or "all" means to summarize the taxa by the top taxa ranks (`summarize_taxa(ps, level = rank_names(ps)[1])`), or "none" means perform differential analysis on the original taxa (`taxa_names(phyloseq)`, e.g., OTU or ASV).
`transform`	character, the methods used to transform the microbial abundance. See `transform_abundances()` for more details. The options include: "identity", return the original data without any transformation (default). "log10", the transformation is `log10(object)`, and if the data contains zeros the transformation is `log10(1 + object)`. "log10p", the transformation is `log10(1 + object)`. "SquareRoot", the transformation is `⁠Square Root⁠`. "CubicRoot", the transformation is `⁠Cubic Root⁠`. "logit", the transformation is `⁠Zero-inflated Logit Transformation⁠` (Does not work well for microbiome data).
`norm`	the methods used to normalize the microbial abundance data. See `normalize()` for more details. Options include: "none": do not normalize. "rarefy": random subsampling counts to the smallest library size in the data set. "TSS": total sum scaling, also referred to as "relative abundance", the abundances were normalized by dividing the corresponding sample library size. "TMM": trimmed mean of m-values. First, a sample is chosen as reference. The scaling factor is then derived using a weighted trimmed mean over the differences of the log-transformed gene-count fold-change between the sample and the reference. "RLE", relative log expression, RLE uses a pseudo-reference calculated using the geometric mean of the gene-specific abundances over all samples. The scaling factors are then calculated as the median of the gene counts ratios between the samples and the reference. "CSS": cumulative sum scaling, calculates scaling factors as the cumulative sum of gene abundances up to a data-derived threshold. "CLR": centered log-ratio normalization. "CPM": pre-sample normalization of the sum of the values to 1e+06.
`norm_para`	arguments passed to specific normalization methods
`method`	test method, must be one of "anova" or "kruskal"
`p_adjust`	method for multiple test correction, default `none`, for more details see stats::p.adjust.
`pvalue_cutoff`	numeric, p value cutoff, default 0.05.
`effect_size_cutoff`	numeric, cutoff of effect size default `NULL` which means no effect size filter. The eta squared is used to measure the effect size for anova/kruskal test.

Value

a microbiomeMarker object.

Examples

data(enterotypes_arumugam)
ps <- phyloseq::subset_samples(
    enterotypes_arumugam,
    Enterotype %in% c("Enterotype 3", "Enterotype 2", "Enterotype 1")
)
mm_anova <- run_test_multiple_groups(
    ps,
    group = "Enterotype",
    method = "anova"
)

HuaZou/MicrobiomeAnalysis documentation built on May 13, 2024, 11:10 a.m.