microbiomeDataSets
This R package is a collection of microbiome datasets that were originally published elsewhere. The data are available as TreeSummarizedExperiment or MultiAssayExperiment objects, and a list of available datasets can be retrieved via the availableDataSets() function.
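A minimal usage sketch, assuming the package has been installed from Bioconductor and that ExperimentHub is reachable for the download. The `requireNamespace()` guard and the variable names `has_pkg` and `lahti_ml` are only there for illustration, so that the snippet also runs on a machine without the package:

```r
# Check for the package first so the snippet does not fail when it is missing.
has_pkg <- requireNamespace("microbiomeDataSets", quietly = TRUE)

if (has_pkg) {
  library(microbiomeDataSets)

  # List the datasets that can be loaded through the package.
  print(availableDataSets())

  # Each dataset has its own loading function; the data are fetched from
  # ExperimentHub on first use and cached locally afterwards.
  lahti_ml <- LahtiMLData()
  print(lahti_ml)
} else {
  message("microbiomeDataSets is not installed; skipping the example.")
}
```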
The microbiomeDataSets package focuses mainly on non-human studies. The independent curatedMetagenomicData package provides access to a large collection of standardized human microbiome studies in the same format.
The aim is to provide datasets for teaching, example workflows or comparative efforts. If you have a dataset that you would like to see in this package, please let us know and/or open a PR for it.
Feel free to contribute. Have a look at how existing datasets are organized and prepare your data accordingly. It is also a good idea to get in touch early to discuss any issues.
Let's use a gitflow approach. Development version should be done against the master branch and then merged to master for the next release. (https://guides.github.com/introduction/flow/)
Resources on how data is added to Bioconductor's ExperimentHub backend and how it is accessed are available in the Bioconductor ExperimentHub documentation and in Creating ExperimentHub Package.
Basic steps:
Assemble a (Tree)SummarizedExperiment from the raw data
You can include the data creation script in inst/scripts/-data- (optional)
Save the individual data container components as rds files
Prepare the metadata file by creating a new metadata-.R in inst/scripts, then run the script to create inst/extdata//metadata-.csv
Make sure that the metadata file passes the check by running a script like: ExperimentHubData::makeExperimentHubMetadata("../microbiomeDataSets","3.13/metadata-hintikka-xo.csv")
The maintainer will upload the data through their AWS login. The folder structure must match the one referenced in the metadata file; for example: microbiomeDataSets//lahti-ml/coldata.rds
Follow the instructions (See Section 7)
Afterwards, the maintainer will push the new metadata to the Bioconductor git repo and inform hubs@bioconductor that there is new metadata. They will let us know when the upload is done.
In the meantime, create a loading function, as found e.g. in microbiomeDataSets::LahtiMLData, and push it to the Bioconductor git repo as well.
Bump the version (note that the version scheme is different)
For questions, have a look at the other datasets or check with us through online channels
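The first steps above (assembling the container, saving its components and drafting the metadata) can be sketched roughly as follows. This is an illustration under stated assumptions, not the package's actual preparation script: the dataset name `example-ds`, the version folder `3.13`, the file names `counts.rds` and `rowdata.rds`, and all metadata values are hypothetical placeholders. The column names follow the fields that ExperimentHubData::makeExperimentHubMetadata() expects.

```r
# Hypothetical dataset name and version folder, used only for illustration.
dataset <- "example-ds"
bioc_version <- "3.13"

# 1. Toy components of a (Tree)SummarizedExperiment.
counts <- matrix(
  c(10L, 0L, 3L, 5L, 2L, 8L),
  nrow = 3,
  dimnames = list(paste0("taxon", 1:3), paste0("sample", 1:2))
)
col_data <- data.frame(
  group = c("control", "treated"),
  row.names = colnames(counts)
)
row_data <- data.frame(
  Phylum = c("Firmicutes", "Bacteroidetes", "Firmicutes"),
  row.names = rownames(counts)
)

# Optional: assemble the container to check that the parts fit together.
if (requireNamespace("TreeSummarizedExperiment", quietly = TRUE)) {
  tse <- TreeSummarizedExperiment::TreeSummarizedExperiment(
    assays = list(counts = counts),
    colData = S4Vectors::DataFrame(col_data),
    rowData = S4Vectors::DataFrame(row_data)
  )
  print(tse)
}

# 2. Save each component as its own rds file. A temporary directory stands in
#    for the real upload folder here.
out_dir <- file.path(tempdir(), "microbiomeDataSets", bioc_version, dataset)
dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)
components <- list(counts = counts, coldata = col_data, rowdata = row_data)
rds_files <- paste0(names(components), ".rds")
for (i in seq_along(components)) {
  saveRDS(components[[i]], file.path(out_dir, rds_files[i]))
}

# 3. Draft the metadata table: one row per saved file. All values below are
#    placeholders; RDataPath must mirror the uploaded folder structure.
metadata <- data.frame(
  Title = paste(dataset, names(components)),
  Description = paste("Placeholder description of the", names(components), "component"),
  BiocVersion = bioc_version,
  Genome = NA_character_,
  SourceType = "CSV",
  SourceUrl = "https://example.org/raw-data",
  SourceVersion = NA_character_,
  Species = NA_character_,
  TaxonomyId = NA_integer_,
  Coordinate_1_based = NA,
  DataProvider = "Example provider",
  Maintainer = "Jane Doe <jane.doe@example.org>",
  RDataClass = c("matrix", "data.frame", "data.frame"),
  DispatchClass = "Rds",
  RDataPath = file.path("microbiomeDataSets", bioc_version, dataset, rds_files),
  Tags = NA_character_
)
meta_path <- file.path(tempdir(), paste0("metadata-", dataset, ".csv"))
write.csv(metadata, meta_path, row.names = FALSE)
```

In a real contribution the csv would be written to inst/extdata and then checked with ExperimentHubData::makeExperimentHubMetadata() as described in the steps above.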
Please note that the microbiomeDataSets project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.