title: "hmdbQuery: working with Human Metabolome Database (hmdb.ca)"
author: "Vincent J. Carey, stvjc at channing.harvard.edu"
date: "r format(Sys.time(), '%B %d, %Y')
"
vignette: >
%\VignetteEngine{knitr::rmarkdown}
%\VignetteIndexEntry{hmdbQuery: working with Human Metabolome Database (hmdb.ca)}
output:
html_document:
highlight: pygments
number_sections: yes
theme: united
toc: yes
The human metabolomics database (HMDB, http://www.hmdb.ca) includes XML documents describing 114000 metabolites. We will show how to manipulate the metadata on metabolites fairly flexibly.
```{r setu,echo=FALSE,results="hide"} suppressMessages({ suppressPackageStartupMessages({ library(hmdbQuery) library(gwascat) }) })
# Key utilities of the package
The hmdbQuery package includes a function for querying HMDB
directly over HTTP:
```{r lk1}
library(hmdbQuery)
lk1 = HmdbEntry(prefix = "http://www.hmdb.ca/metabolites/",
id = "HMDB0000001")
The result is parsed and encapsulated in an S4 object ```{r lk2} lk1
The size of the complete import of information about a single metabolite
suggests that it would not be too convenient to have comprehensive
information about all HMDB constituents in memory. The most effective
approach to managing the metadata will depend upon use cases to be
developed over the long run.
Note however that this package does provide snapshots
of certain direct associations derived from all available
information as of Sept. 23 2017. Information
about direct associations reported in the database
is present in tables `hmdb_disease`,
`hmdb_gene`, `hmdb_protein`, `hmdb_omim`. For
example
```{r lkta}
data(hmdb_disease)
hmdb_disease
Some HMDB metabolites have been mapped to diseases.
```{r lkdis3} d1 = diseases(lk1) # data.frame pmids = unlist(d1["references", 5][[1]][2,]) library(annotate) pm = pubmed(pmids[1]) ab = buildPubMedAbst(xmlRoot(pm)[[1]]) ab
## Biospecimen and tissue location metadata
Note that pre HMDB v 4.0, biospecimens were called biofluids.
There are arbitrarily many biospecimen and tissue associations
provided for each HMDB entry. We have direct accessors,
and by default we capture all metadata, available through
the `store` method.
```{r lkdee}
biospecimens(lk1)
tissues(lk1)
st = store(lk1)
head(names(st))
length(names(st))
st$protein_assoc["name",]
st$protein_assoc["gene_name",]
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.