curatedMetagenomicData: Access Curated Metagenomic Data

View source: R/curatedMetagenomicData.R

curatedMetagenomicData R Documentation

Access Curated Metagenomic Data


To access curated metagenomic data, call curatedMetagenomicData() after "shopping" the sampleMetadata data.frame for resources of interest. The dryrun argument lets users refine a query before any resources are returned. When dryrun = TRUE, matched resources are printed and then returned invisibly as a character vector. When dryrun = FALSE, a list of resources containing SummarizedExperiment and/or TreeSummarizedExperiment objects, each with corresponding sample metadata, is returned. Multiple resources can be returned at once; if more than one date corresponds to a resource, the most recent one is selected automatically. Finally, if a relative_abundance resource is requested and counts = TRUE, relative abundance proportions are multiplied by read depth and rounded to the nearest integer.
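The matching and "most recent date" behaviour described above can be illustrated with a small base R sketch. This is not the package's implementation: the titles vector is hypothetical, and the "<date>.<study_name>.<dataType>" title layout is an assumption made for illustration only. Note that an unescaped "." in the pattern matches any character.

```r
# Hypothetical resource titles; the "<date>.<study_name>.<dataType>" layout
# is an assumption for this sketch, not taken from the package itself.
titles <- c(
  "2021-03-31.AsnicarF_2017.relative_abundance",
  "2021-10-14.AsnicarF_2017.relative_abundance",
  "2021-03-31.AsnicarF_2021.relative_abundance",
  "2021-03-31.AsnicarF_2017.gene_families"
)

# Regular expression used in the same way as the first argument
pattern <- "AsnicarF_20.+.relative_abundance"

# Keep only the titles that match the pattern
matched <- grep(pattern, titles, value = TRUE)

# Split each matched title into its date prefix and the remaining resource name
dates <- as.Date(sub("\\..*$", "", matched))
resource <- sub("^[^.]+\\.", "", matched)

# For each resource name, keep the index of the title with the latest date
latest <- vapply(
  split(seq_along(matched), resource),
  function(i) i[which.max(as.numeric(dates[i]))],
  integer(1)
)
selected <- matched[latest]

print(selected)
```

With the real function, the comparable step would be a call with dryrun = TRUE, which prints the matched resources so the pattern can be adjusted before anything is returned as objects.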


Usage

curatedMetagenomicData(
  pattern,
  dryrun = TRUE,
  counts = FALSE,
  rownames = "long"
)



Arguments

pattern
regular expression pattern to look for in the titles of resources available in curatedMetagenomicData; "" will return all resources

dryrun
if TRUE (the default), a character vector of resource names is returned invisibly; if FALSE, a list of resources is returned

counts
if FALSE (the default), relative abundance proportions are returned; if TRUE, relative abundance proportions are multiplied by read depth and rounded to the nearest integer prior to being returned

rownames
the type of rownames to use for relative_abundance resources, one of: "long" (the default), "short" (species name), or "NCBI" (NCBI Taxonomy ID)
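The arithmetic behind counts = TRUE can be sketched in base R as follows. The matrix, the taxon and sample names, and the read depths are invented for illustration, and the sketch assumes abundances are stored as proportions that sum to 1 per sample; if the values were stored as percentages, they would need to be divided by 100 first. It is not the package's own code.

```r
# Toy relative abundance matrix: rows are taxa, columns are samples.
# All values and names here are hypothetical.
rel_ab <- matrix(
  c(0.5, 0.3, 0.2,
    0.1, 0.6, 0.3),
  nrow = 3,
  dimnames = list(
    c("taxon_a", "taxon_b", "taxon_c"),
    c("sample_1", "sample_2")
  )
)

# Hypothetical read depth (number of reads) for each sample
read_depth <- c(sample_1 = 1000, sample_2 = 2500)

# Multiply each column by its sample's read depth, then round to integers
approx_counts <- round(sweep(rel_ab, 2, read_depth, "*"))

print(approx_counts)
```

Because of the rounding step, the column sums of the result may differ slightly from the original read depths on real data.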


Details

Above, "resources" refers to resources that exist in Bioconductor's ExperimentHub service. In the context of curatedMetagenomicData, these are study-level (sparse) matrix objects used to create the SummarizedExperiment and/or TreeSummarizedExperiment objects that are ultimately returned as the list of resources. Only the gene_families dataType (see returnSamples) is stored as a sparse matrix in ExperimentHub; this is done to optimize storage and has no practical consequences for users. When searching for "resources", users will use the study_name value from the sampleMetadata data.frame.
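One way to turn study_name values into a search pattern is sketched below in base R. The data.frame is a toy stand-in for sampleMetadata with invented rows (StudyB_2020 and StudyC_2021 are hypothetical names), and the approach of joining study names with "|" is an illustration, not a method prescribed by the package.

```r
# Toy stand-in for sampleMetadata; the rows are invented for illustration.
sample_metadata_toy <- data.frame(
  study_name = c("AsnicarF_2017", "AsnicarF_2017", "StudyB_2020", "StudyC_2021"),
  body_site = c("stool", "stool", "stool", "skin"),
  stringsAsFactors = FALSE
)

# "Shop" the metadata: keep the studies that have stool samples
is_stool <- sample_metadata_toy$body_site == "stool"
studies <- unique(sample_metadata_toy$study_name[is_stool])

# Build one regular expression that matches any selected study,
# followed by a literal "." and the data type of interest
query <- paste0("(", paste(studies, collapse = "|"), ")\\.relative_abundance")

print(query)
```

A string built this way could then be passed as the first argument with dryrun = TRUE to check which resources it matches before requesting them.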


Value

if dryrun = TRUE, a character vector of resource names is returned invisibly; if dryrun = FALSE, a list of resources is returned

See Also

mergeData, returnSamples, sampleMetadata



Examples

curatedMetagenomicData("AsnicarF_2017.relative_abundance", dryrun = FALSE)

curatedMetagenomicData("AsnicarF_20.+.relative_abundance", dryrun = FALSE, counts = TRUE)

waldronlab/curatedMetagenomicData documentation built on Oct. 26, 2023, 6:32 a.m.