run_ord: Ordination for microbiota data
In HuaZou/MicrobiomeAnalysis: Data analysis toolkits in metagenomics

run_ord

R Documentation

Ordination for microbiota data

Description

The primary goal of ordination was originally considered “exploratory” (Gauch 1982a, b). With the introduction of canonical correspondence analysis (CCA), ordination has gone beyond mere “exploratory” analysis (ter Braak 1985) and has become a tool for hypothesis testing as well.

Usage

run_ord(
    object,
    level = NULL,
    variable,
    transform = c("identity", "log10", "log10p",
                  "SquareRoot", "CubicRoot", "logit"),
    norm = c("none", "rarefy", "TSS", "TMM",
             "RLE", "CSS", "CLR", "CPM"),
    method = c("PCA", "PCoA", "tSNE", "UMAP", "NMDS",
               "CA", "RDA", "CCA", "CAP"),
    distance = c("bray", "unifrac", "wunifrac",
                 "GUniFrac", "dpcoa", "jsd"),
    para = list(Perplexity = NULL,
                Y_vars = NULL,
                Z_vars = NULL,
                scale = TRUE,
                center = TRUE),
    ...)

Arguments

`object`	(Required). a `phyloseq::phyloseq` or `SummarizedExperiment::SummarizedExperiment` object.
`level`	(Optional). character. Summarization level (from `rank_names(pseq)`, default: NULL).
`variable`	(Required). character. grouping variable for test.
`transform`	character, the method used to transform the microbial abundance. See `transform_abundances()` for more details. Options include: "identity": return the original data without any transformation (default). "log10": the transformation is `log10(object)`; if the data contain zeros, `log10(1 + object)` is used instead. "log10p": the transformation is `log10(1 + object)`. "SquareRoot": square root transformation. "CubicRoot": cubic root transformation. "logit": zero-inflated logit transformation (does not work well for microbiome data).
`norm`	the method used to normalize the microbial abundance data. See `normalize()` for more details. Options include: "none": do not normalize. "rarefy": random subsampling of counts to the smallest library size in the data set. "TSS": total sum scaling. "TMM": trimmed mean of M-values. First, a sample is chosen as reference. The scaling factor is then derived using a weighted trimmed mean over the differences of the log-transformed gene-count fold-change between the sample and the reference. "RLE": relative log expression. RLE uses a pseudo-reference calculated as the geometric mean of the gene-specific abundances over all samples. The scaling factors are then calculated as the median of the gene-count ratios between the samples and the reference. "CSS": cumulative sum scaling, which calculates scaling factors as the cumulative sum of gene abundances up to a data-derived threshold. "CLR": centered log-ratio. "CPM": counts per million.
`method`	(Optional). character. Ordination method (default: "PCoA"). Options include: "PCA": Principal Component Analysis. "PCoA": Principal Coordinate Analysis. "tSNE": t-distributed Stochastic Neighbor Embedding. "UMAP": Uniform Manifold Approximation and Projection. "NMDS": Non-metric Multidimensional Scaling. "CA": Correspondence Analysis. "RDA": Redundancy Analysis. "CCA": Canonical Correspondence Analysis. "CAP": Constrained Analysis of Principal Coordinates.
`distance`	(Optional). character. Provide one of the currently supported options. See `vegan::vegdist` for a detailed list of the supported options and links to accompanying documentation (default: "bray"). Options include: "bray": Bray-Curtis distance. "unifrac": unweighted UniFrac distance. "wunifrac": weighted UniFrac distance. "GUniFrac": the variance-adjusted weighted UniFrac distance (default: alpha = 0.5). "dpcoa": sample-wise distance used in Double Principal Coordinate Analysis. "jsd": Jensen-Shannon divergence. Alternatively, you can provide a character string that defines a custom distance method, if it has the form described in `designdist`.
`para`	(Optional). list. Additional parameters for the chosen method. "Perplexity": numeric; perplexity parameter (must satisfy 3 * perplexity < nrow(X) - 1). "Y_vars": constraining matrix, typically of environmental variables. "Z_vars": conditioning matrix, the effect of which is removed ("partialled out") before the next step. "scale": scale features to unit variance (like correlations). "center": center features to zero mean.
`...`	(Optional). additional parameters.
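
To make the `norm`, `distance`, and `method` arguments more concrete, the following base-R sketch walks through what a "TSS" + "bray" + "PCoA" combination computes conceptually. It is not the package's implementation: the toy matrix `counts` and the helpers `tss_normalize()` and `bray_curtis()` are hypothetical names introduced only for illustration, and `stats::cmdscale()` stands in for whatever PCoA routine `run_ord()` uses internally.

```r
# Toy count matrix: 4 samples (rows) x 3 taxa (columns); values are made up.
counts <- matrix(
  c(10, 0,  5,
    20, 5,  0,
     0, 8, 12,
     3, 3,  3),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("S", 1:4), paste0("Taxon", 1:3))
)

# Total sum scaling (TSS): divide each sample by its library size.
# (hypothetical helper, not part of the package)
tss_normalize <- function(x) x / rowSums(x)

# Bray-Curtis dissimilarity between all pairs of samples (rows).
# (hypothetical helper, not part of the package)
bray_curtis <- function(x) {
  n <- nrow(x)
  d <- matrix(0, n, n, dimnames = list(rownames(x), rownames(x)))
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      d[i, j] <- sum(abs(x[i, ] - x[j, ])) / sum(x[i, ] + x[j, ])
    }
  }
  as.dist(d)
}

rel_abund <- tss_normalize(counts)
bc_dist   <- bray_curtis(rel_abund)

# PCoA (classical multidimensional scaling) on the dissimilarity matrix.
pcoa <- cmdscale(bc_dist, k = 2, eig = TRUE)
pcoa$points  # sample coordinates on the first two principal coordinates
```

In practice a call such as `run_ord(object, variable, norm = "TSS", distance = "bray", method = "PCoA")` would replace all of these manual steps; the sketch only shows the order in which normalization, distance calculation, and ordination conceptually happen.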

Details

The primary aim of ordination is to represent multiple samples (subjects) along a reduced number of orthogonal (i.e., independent) axes, where the total number of axes is less than or equal to the number of samples.
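
The two properties stated above (orthogonal axes, and no more axes than samples) can be checked on simulated data with a minimal base-R sketch. It uses `stats::prcomp()` as a stand-in for PCA and is not taken from the package; the matrix `x` and the variable names are assumptions made for illustration.

```r
set.seed(1)

# Toy data: 5 samples (rows) x 8 features (columns); values are simulated.
x <- matrix(rnorm(5 * 8), nrow = 5)

# PCA with centering and scaling of the features.
pca <- prcomp(x, center = TRUE, scale. = TRUE)

# The number of axes returned is at most the number of samples.
n_axes <- ncol(pca$x)

# Sample scores on different axes are orthogonal: the cross-product
# matrix of the scores is diagonal (off-diagonals are numerically zero).
cross <- crossprod(pca$x)
max_offdiag <- max(abs(cross[upper.tri(cross)]))

n_axes <= nrow(x)   # TRUE
max_offdiag < 1e-8  # TRUE
```

Because the data are centered, the last axis carries essentially no variance here, so the number of informative axes is at most one less than the number of samples.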

Value

A list of the ordination's results.

Author(s)

Created by Hua Zou (8/9/2023 Shenzhen China)

References

Xia, Y., Sun, J., & Chen, D. G. (2018). Statistical analysis of microbiome data with R (Vol. 847). Singapore: Springer.

Examples


## Not run: 

library(MicrobiomeAnalysis)

# phyloseq object
data("Zeybel_2022_gut")
ps_zeybel <- summarize_taxa(Zeybel_2022_gut, level = "Genus")
ord_result <- run_ord(
  object = ps_zeybel,
  variable = "LiverFatClass",
  method = "PCoA")

# SummarizedExperiment object
data("Zeybel_2022_protein")
Zeybel_2022_protein_imp <- impute_abundance(
  Zeybel_2022_protein,
  group = "LiverFatClass",
  ZerosAsNA = TRUE,
  RemoveNA = TRUE,
  cutoff = 20,
  method = "knn")
ord_result <- run_ord(
  object = Zeybel_2022_protein_imp,
  variable = "LiverFatClass",
  method = "PCA")


## End(Not run)

HuaZou/MicrobiomeAnalysis documentation built on May 13, 2024, 11:10 a.m.