knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
The MicrobiomeBenchmarkData package provides access to a collection of
datasets with biological ground truth for benchmarking differential
abundance methods. The datasets are deposited on Zenodo:
https://doi.org/10.5281/zenodo.6911026
## Install BiocManager if not installed
if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
## Release version (not yet in Bioc, so it doesn't work yet)
BiocManager::install("MicrobiomeBenchmarkData")
## Development version
BiocManager::install("waldronlab/MicrobiomeBenchmarkData")
library(MicrobiomeBenchmarkData)
library(purrr)
All sample metadata is merged into a single data frame and provided as a data object:
data('sampleMetadata', package = 'MicrobiomeBenchmarkData')
## Get columns present in all samples
sample_metadata <- sampleMetadata |>
    discard(~any(is.na(.x))) |>
    head()
knitr::kable(sample_metadata)
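The column filter above keeps only the metadata columns that contain no missing values. As a minimal sketch of the same idea in base R, the example below uses a small made-up data frame; `toy_metadata`, its column names, and its values are hypothetical stand-ins and are not taken from `sampleMetadata`.

```r
# Toy stand-in for sampleMetadata (hypothetical columns and values).
toy_metadata <- data.frame(
    dataset = c("study_A", "study_A", "study_B"),
    sample_id = c("s1", "s2", "s3"),
    body_site = c("oral", "oral", NA),
    stringsAsFactors = FALSE
)

# Flag the columns that have no missing values in any row.
complete_cols <- vapply(toy_metadata, function(col) !anyNA(col), logical(1))

# Keep only those columns; drop = FALSE preserves the data.frame class.
shared_metadata <- toy_metadata[, complete_cols, drop = FALSE]
colnames(shared_metadata)
```

Here `body_site` is dropped because one sample lacks a value, which mirrors how a column recorded in only some studies would be excluded from the shared view.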
Currently, there are `r nrow(MicrobiomeBenchmarkData::getBenchmarkData())`
datasets available through the MicrobiomeBenchmarkData package. These datasets
are accessed through the getBenchmarkData function.
If no arguments are provided, the list of available datasets is printed on screen and a data.frame is returned with the description of the datasets:
dats <- getBenchmarkData()
dats
To import a dataset, the getBenchmarkData function must be called with the
name of the dataset as the first argument (x) and the dryrun argument set to
FALSE. The output is a list containing the imported dataset as a
TreeSummarizedExperiment object.
tse <- getBenchmarkData('HMP_2012_16S_gingival_V35_subset', dryrun = FALSE)[[1]]
tse
Several datasets can be imported simultaneously by giving the names of the different datasets in a character vector:
list_tse <- getBenchmarkData(dats$Dataset[2:4], dryrun = FALSE)
str(list_tse, max.level = 1)
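Selecting names by position, as in dats$Dataset[2:4], depends on the row order of the returned data.frame. As a hedged alternative, names can be selected by a pattern instead. The sketch below uses a hypothetical stand-in for `dats`: only the `Dataset` column is assumed, and the second and third dataset names are made up for illustration.

```r
# Hypothetical stand-in for the data.frame returned by getBenchmarkData();
# only the Dataset column is assumed, and two of the names are invented.
toy_dats <- data.frame(
    Dataset = c(
        "HMP_2012_16S_gingival_V35_subset",
        "Example_2020_WMS_stool",
        "Example_2021_16S_skin"
    ),
    stringsAsFactors = FALSE
)

# Keep the names that contain "16S" as a literal substring.
selected <- toy_dats$Dataset[grepl("16S", toy_dats$Dataset, fixed = TRUE)]
selected
```

The resulting character vector could then be passed as the first argument of getBenchmarkData together with dryrun = FALSE, in the same way as the positional selection.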
To import all of the datasets, provide the dryrun = FALSE argument alone.
mbd <- getBenchmarkData(dryrun = FALSE)
str(mbd, max.level = 1)
The biological annotation of each taxon is provided as a column in the
rowData slot of the TreeSummarizedExperiment.
## In this case, the column is named taxon_annotation
tse <- mbd$HMP_2012_16S_gingival_V35_subset
rowData(tse)
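Before benchmarking, it can help to check how many taxa fall into each annotation group. The sketch below assumes the annotation sits in a column named taxon_annotation, as stated above, and uses a plain data.frame as a hypothetical stand-in for rowData(tse); the taxon names and annotation values are made up.

```r
# Hypothetical stand-in for rowData(tse); taxon names and values are invented.
toy_row_data <- data.frame(
    taxon_annotation = c("aerobic", "anaerobic", "anaerobic", NA),
    row.names = c("taxon_1", "taxon_2", "taxon_3", "taxon_4"),
    stringsAsFactors = FALSE
)

# Count taxa per annotation, keeping unannotated taxa visible as NA.
annotation_counts <- table(toy_row_data$taxon_annotation, useNA = "ifany")
annotation_counts
```

Keeping the NA count visible shows how many taxa have no ground-truth label and would therefore not contribute to the evaluation of a method.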
The datasets are cached so they're only downloaded once. The cache and all of
the files contained in it can be removed with the removeCache function.
removeCache()
sessionInfo()