DA_ALDEx2: DA_ALDEx2

View source: R/DA_ALDEx2.R

DA_ALDEx2R Documentation

DA_ALDEx2

Description

Fast run for the ALDEx2's differential abundance detection method. Support for Welch's t, Wilcoxon, Kruskal-Wallace, Kruskal-Wallace glm ANOVA-like, and glm tests.

Usage

DA_ALDEx2(
  object,
  assay_name = "counts",
  pseudo_count = FALSE,
  design = NULL,
  mc.samples = 128,
  test = c("t", "wilcox", "kw", "kw_glm", "glm"),
  paired.test = FALSE,
  denom = "all",
  contrast = NULL,
  verbose = TRUE
)

Arguments

object

a phyloseq or TreeSummarizedExperiment object.

assay_name

the name of the assay to extract from the TreeSummarizedExperiment object (default assayName = "counts"). Not used if the input object is a phyloseq.

pseudo_count

add 1 to all counts if TRUE (default pseudo_count = FALSE).

design

a character with the name of a variable to group samples and compare them or a formula to compute a model.matrix (when test = "glm").

mc.samples

an integer. The number of Monte Carlo samples to use when estimating the underlying distributions. Since we are estimating central tendencies, 128 is usually sufficient.

test

a character string. Indicates which tests to perform. "t" runs Welch's t test while "wilcox" runs Wilcoxon test. "kw" runs Kruskal-Wallace test while "kw_glm" runs glm ANOVA-like test. "glm" runs a generalized linear model.

paired.test

A boolean. Toggles whether to do paired-sample tests. Applies to effect = TRUE and test = "t".

denom

An any variable (all, iqlr, zero, lvha, median, user) indicating features to use as the denominator for the Geometric Mean calculation The default "all" uses the geometric mean abundance of all features. Using "median" returns the median abundance of all features. Using "iqlr" uses the features that are between the first and third quartile of the variance of the clr values across all samples. Using "zero" uses the non-zero features in each grop as the denominator. This approach is an extreme case where there are many nonzero features in one condition but many zeros in another. Using "lvha" uses features that have low variance (bottom quartile) and high relative abundance (top quartile in every sample). It is also possible to supply a vector of row indices to use as the denominator. Here, the experimentalist is determining a-priori which rows are thought to be invariant. In the case of RNA-seq, this could include ribosomal protein genes and and other house-keeping genes. This should be used with caution because the offsets may be different in the original data and in the data used by the function because features that are 0 in all samples are removed by aldex.clr.

contrast

character vector with exactly three elements: the name of a variable used in "design", the name of the level of interest, and the name of the reference level. If "kw" or "kw_glm" as test, contrast vector is not used.

verbose

an optional logical value. If TRUE, information about the steps of the algorithm is printed. Default verbose = TRUE.

Value

A list object containing the matrix of p-values 'pValMat', the matrix of summary statistics for each tag 'statInfo', and a suggested 'name' of the final object considering the parameters passed to the function.

See Also

aldex for the Dirichlet-Multinomial model estimation. Several and more complex tests are present in the ALDEx2 framework.

Examples

set.seed(1)
# Create a very simple phyloseq object
counts <- matrix(rnbinom(n = 300, size = 3, prob = 0.5), nrow = 50, ncol = 6)
metadata <- data.frame("Sample" = c("S1", "S2", "S3", "S4", "S5", "S6"),
                       "group" = as.factor(c("A", "A", "A", "B", "B", "B")))
ps <- phyloseq::phyloseq(phyloseq::otu_table(counts, taxa_are_rows = TRUE),
                         phyloseq::sample_data(metadata))
# Differential abundance with t test and denom defined by the user
DA_t <- DA_ALDEx2(ps, design = "group", test = "t", denom = c(1,2,3), 
    paired.test = FALSE, contrast = c("group", "B", "A"))
# Differential abundance with wilcox test and denom = "iqlr"
DA_w <- DA_ALDEx2(ps, design = "group", test = "wilcox", denom = "iqlr", 
    paired.test = FALSE, contrast = c("group", "B", "A"))
# Differential abundance with kw test and denom = "zero"
# mc.samples = 2 to speed up (128 suggested)
DA_kw <- DA_ALDEx2(ps, design = "group", test = "kw", denom = "zero", 
    mc.samples = 2)
# Differential abundance with kw_glm test and denom = "median"
DA_kw_glm <- DA_ALDEx2(ps, design = "group", test = "kw", denom = "median", 
    mc.samples = 2)
# Differential abundance with glm test and denom = "all"
DA_glm <- DA_ALDEx2(ps, design = ~ group, test = "glm", denom = "all", 
    mc.samples = 2, contrast = c("group", "B", "A"))

mcalgaro93/benchdamic documentation built on Nov. 28, 2024, 2:16 p.m.