This is a short introduction to other R packages in the field of metabarcoding analysis.

State of the Field in R

The metabarcoding ecosystem in the R language is mature, well-constructed, and relies on a very active community in both the Bioconductor and CRAN projects. Bioconductor even maintains dedicated task views for Metagenomics and Microbiome.

The R package dada2 [@callahan2016] provides a highly cited and recommended clustering method [@pauvert2019]. dada2 also provides tools to complete the metabarcoding analysis pipeline, including chimera detection and taxonomic assignment. phyloseq [@mcmurdie2013] (https://bioconductor.org/packages/release/bioc/html/phyloseq.html) facilitates metagenomics analysis by providing a way to store data (the phyloseq class) as well as graphical and statistical functions.

The phyloseq package introduces an S4 class (class phyloseq, with objects often named physeq in code), which contains (i) an OTU-by-sample abundance table, (ii) a taxonomic table, (iii) a sample metadata table, and two optional slots for (iv) a phylogenetic tree and (v) reference sequences.
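As an illustration of this structure, the following minimal sketch builds a toy phyloseq object from the three mandatory components. It assumes the phyloseq package is installed. The counts, taxon names, sample names and the `site` variable are invented for the example, and the two optional slots (phylogenetic tree and reference sequences) are left empty.

```r
# Assumes the phyloseq package (Bioconductor) is installed.
library(phyloseq)

# (i) OTU abundance table: 3 taxa (rows) x 2 samples (columns), toy counts
otu_mat <- matrix(
  c(10, 0, 5, 3, 7, 1),
  nrow = 3,
  dimnames = list(c("OTU1", "OTU2", "OTU3"), c("S1", "S2"))
)

# (ii) taxonomic table: one row per taxon, one column per taxonomic rank
tax_mat <- matrix(
  c("Fungi", "Fungi", "Fungi", "Ascomycota", "Basidiomycota", "Ascomycota"),
  nrow = 3,
  dimnames = list(c("OTU1", "OTU2", "OTU3"), c("Kingdom", "Phylum"))
)

# (iii) sample metadata: one row per sample (hypothetical variable `site`)
sam_df <- data.frame(site = c("A", "B"), row.names = c("S1", "S2"))

# Assemble the components; taxa and sample names must match across tables
toy <- phyloseq(
  otu_table(otu_mat, taxa_are_rows = TRUE),
  tax_table(tax_mat),
  sample_data(sam_df)
)

ntaxa(toy)      # number of taxa
nsamples(toy)   # number of samples
rank_names(toy) # taxonomic ranks stored in the taxonomic table
```

Each component can be read back with the accessor of the same name, for example `otu_table(toy)` or `sample_data(toy)`.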

Some packages already extend the phyloseq package. For example, the microbiome package collection [@ernst2023] provides scripts and functions for manipulating microbiome datasets. The speedyseq package [@mclaren2020] provides faster versions of phyloseq's plotting and taxonomic merging functions, some of which (merge_samples2() and merge_taxa_vec()) are integrated in MiscMetabar (thanks to Mike R. McLaren). The phylosmith package [@smith2023] also provides functions to extend and simplify the use of the phyloseq package.

Other packages (mia, part of the microbiome package collection, and MicrobiotaProcess [@xu2023]) build on a different data structure, drawn from the SummarizedExperiment family of the Bioconductor ecosystem.

MiscMetabar enriches this R ecosystem by providing functions to (i) describe your dataset visually, (ii) transform your data, (iii) explore biological diversity (alpha, beta, and taxonomic diversity), and (iv) simplify reproducibility. MiscMetabar is designed to complement, not compete with, the other R packages mentioned above. For example, the mia package is recommended for studies focusing on phylogenetic trees, and phylosmith allows easy visualization of co-occurrence networks. Converting a phyloseq object with the MicrobiotaProcess::as.MPSE function makes most of the utilities of the MicrobiotaProcess package available alongside the functions of MiscMetabar.
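The conversion mentioned above can be sketched as follows. This is only an illustration, assuming that both phyloseq and MicrobiotaProcess are installed; it uses the GlobalPatterns example dataset shipped with phyloseq rather than data from MiscMetabar.

```r
# Assumes phyloseq and MicrobiotaProcess (both on Bioconductor) are installed.
library(phyloseq)

# Example phyloseq object shipped with the phyloseq package
data("GlobalPatterns", package = "phyloseq")

# Convert the phyloseq object to the MPSE class used by MicrobiotaProcess
mpse <- MicrobiotaProcess::as.MPSE(GlobalPatterns)

# The result is an MPSE object, part of the SummarizedExperiment family
class(mpse)
```

The original phyloseq object is left untouched, so the same dataset can still be passed to MiscMetabar functions.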

I do not try to reinvent the wheel and prefer to rely on existing packages and classes rather than building a new framework. MiscMetabar is based on the phyloseq class from phyloseq, the most cited package in metagenomics [@wen2023]. For a description and comparison of the integrated packages competing with phyloseq (e.g. microeco by @liu2020, EasyAmplicon by @liu2023 and MicrobiomeAnalystR by @lu2023), see @wen2023. Note that some limitations of the phyloseq package are circumvented thanks to phylosmith [@smith2023], microViz [@Barnett2021] and MiscMetabar.

Some packages provide an interactive interface, which is useful for rapid exploration and for biologists who are new to coding. The animalcules [@zhao2021] and microViz [@Barnett2021] packages provide interactive Shiny interfaces, whereas MicrobiomeAnalystR [@lu2023] is a web-based platform.

Session information

sessionInfo()

References



adrientaudiere/MiscMetabar documentation built on July 6, 2024, 7:02 p.m.