library(knitr)

knitr::opts_chunk$set(
  collapse = TRUE
)

The purpose of this package is to help identification in metabolomics by:

  1. annotating an XCMS (or other) peaklist with the compounds from the compound databases
  2. displaying an interactive, browsable peak table with the annotated compounds in nested tables

This package is in an early alpha stage. It needs to integrate better with CompoundDb and require less "manual" data wrangling. It will also improve once the database files can be made available directly.



Get compound tables from databases

We use the new CompoundDb package.

library(dplyr)
library(PeakABro)
library(CompoundDb)
library(shinyBS) # for some reason explicit loading is required.
# Read the HMDB SDF file.
# You need to download this from the HMDB website.
hmdb_tab <- compound_tbl_sdf("inst/extdata/HMDB.sdf")


hmdb_meta <- make_metadata(source = "HMDB", 
                           url = "http://www.hmdb.ca",
                           source_version = "4.0", 
                           source_date = "2017-09-10",
                           organism = "Hsapiens"
                           )


hmdb_db <- createCompDb(hmdb_tab, metadata = hmdb_meta, path = tempdir())

Now we can create our own package with this data and install it. You only need to do this once.

createCompDbPackage(
    hmdb_db, 
    version = "0.0.1", 
    author = "Jan Stanstrup", 
    path = tempdir(),
    maintainer = "Jan Stanstrup <stanstrup@gmail.com>"
)


library(devtools)
install_local(file.path(tempdir(), "CompDb.Hsapiens.HMDB.4.0"))

Now we can load this package and get the data.

library(CompDb.Hsapiens.HMDB.4.0)

HMDB_tbl <- compounds(CompDb.Hsapiens.HMDB.4.0, return.type = "tibble")

HMDB_tbl %>% 
    slice(1:3) %>% 
    kable

This takes rather long because the databases are quite large. Therefore I try to supply pre-parsed data. So far only LipidBlast is available. It will be moved to a separate package shortly (there also currently seems to be a bug in LipidBlast that causes wrong info).

lipidblast_tbl <- readRDS(system.file("extdata", "lipidblast_tbl.rds", package="PeakABro"))
lipidblast_tbl %>% slice(1:3) %>% kable



Expand adducts

Let's use only HMDB for now.

cmp_tbl_exp_pos <- expand_adducts(HMDB_tbl, mode = "pos", adducts = c("[M+H]+", "[M+Na]+", "[2M+H]+", "[M+K]+", "[M+H-H2O]+"))

cmp_tbl_exp_pos %>% slice(1:3) %>% kable
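
To make the idea of adduct expansion concrete, here is a minimal base R sketch of the arithmetic for the five adducts requested above. It is not PeakABro's implementation: `adduct_rules` and `adduct_mz()` are hypothetical names introduced only for illustration, and the mass shifts are standard monoisotopic values for singly charged ions with the electron mass already subtracted.

```r
# Illustrative only: this is NOT how expand_adducts() is implemented.
# Monoisotopic mass shifts (Da) for singly charged positive adducts.
adduct_rules <- data.frame(
  adduct = c("[M+H]+", "[M+Na]+", "[2M+H]+", "[M+K]+", "[M+H-H2O]+"),
  nmol   = c(1, 1, 2, 1, 1),   # number of molecules M in the ion
  delta  = c(1.007276, 22.989221, 1.007276, 38.963158, -17.003288),
  stringsAsFactors = FALSE
)

# Hypothetical helper: m/z of every adduct for one neutral monoisotopic mass.
adduct_mz <- function(mass, rules = adduct_rules) {
  data.frame(
    adduct = rules$adduct,
    mz     = rules$nmol * mass + rules$delta,  # charge is 1 for all rules here
    stringsAsFactors = FALSE
  )
}

# Glucose (C6H12O6), monoisotopic mass 180.06339 Da
glucose <- adduct_mz(180.06339)
print(glucose)
```

Each compound therefore contributes one row per adduct, which is why the expanded table is several times longer than the original compound table.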



Annotate peaklist

We ultimately want to use the compound table in an interactive browser, so let's remove some redundant info and keep only one mode.

cmp_tbl_exp_pos <- cmp_tbl_exp_pos %>% 
                   filter(mode=="pos") %>% # if you generated also neg above
                   select(-charge, -nmol, -mode)

Next we load a sample peaklist. I have removed the data columns in this sample.

library(readr)
peaklist <- read_tsv(system.file("extdata", "peaklist_pos.tsv", package="PeakABro"))

peaklist %>% slice(1:3) %>% kable

Now we can annotate the table. The idea here is that each row will have a nested table with annotations from the compound table.

library(purrr)
peaklist_anno <- peaklist %>% mutate(anno = map(mz, cmp_mz_filter, cmp_tbl_exp_pos, ppm=30))

Now the peak table looks like this:

peaklist_anno %>% select(-mzmin, -mzmax, -rtmin, -rtmax, -npeaks)

And one of the nested tables looks like this:

peaklist_anno$anno[[1]] %>% slice(1:3) %>% kable
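
The `ppm = 30` argument used in the annotation step sets the mass tolerance in parts per million. As a rough sketch of what such a filter does, the base R function below keeps the compound rows whose m/z lies within the tolerance of a peak and records the error. `ppm_filter()` and the toy compounds are hypothetical and only illustrate the principle; the actual `cmp_mz_filter()` may compute or report the error differently.

```r
# Hypothetical helper, NOT the package's cmp_mz_filter().
# Keeps compound rows whose m/z is within `ppm` of the peak m/z.
ppm_filter <- function(peak_mz, cmp_tbl, ppm = 30) {
  # Signed mass error in parts per million, relative to the compound m/z
  ppm_error <- (peak_mz - cmp_tbl$mz) / cmp_tbl$mz * 1e6
  keep <- abs(ppm_error) <= ppm
  hits <- cmp_tbl[keep, , drop = FALSE]
  hits$ppm <- ppm_error[keep]
  # Best matches (smallest absolute error) first
  hits[order(abs(hits$ppm)), , drop = FALSE]
}

# Toy compound table with made-up names
toy_cmp <- data.frame(
  name = c("A", "B", "C"),
  mz   = c(181.0707, 181.0750, 203.0526),
  stringsAsFactors = FALSE
)

hits <- ppm_filter(181.0712, toy_cmp, ppm = 30)
print(hits)
```

At m/z 181.07 a 30 ppm window is about ±0.0054 Da, so compounds A and B match while C does not.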



Interactive Browser

Prepare the table for interactive browser

Before we are ready to explore the peaklist interactively there are a few things we need to do and some optional things to fix:

library(tidyr)

peaklist_anno_nest <- peaklist_anno %>%
                        mutate(rt=round(rt/60,2), mz = round(mz,4)) %>% # peaklist rt in minutes and round 
                        select(mz, rt, isotopes, adduct, anno, pcgroup) %>% # get only relevant info
                        mutate(anno = map(anno,~ mutate(..1, mz = round(mz, 4), mass = round(mass, 4), ppm = round(ppm,0)))) %>% # round values in annotation tables
                        nest(-pcgroup, .key = "features") %>% # this is required! We nest the tables by the pcgroup
                        mutate(avg_rt = map_dbl(features,~round(mean(..1$rt),2))) %>% # extract average retention time for each pcgroup
                        select(pcgroup, avg_rt, features) # putting the nested table last. ATM needed for the browser to work
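
The nesting step is easier to follow on a small example. The sketch below uses made-up data and assumes tidyr 1.0.0 or later, where the nested column is named directly in the call (`nest(features = -pcgroup)`) and the older `nest(-pcgroup, .key = "features")` form is deprecated. The table `toy` and its values are hypothetical.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Made-up peak table: two features in pcgroup 1, one in pcgroup 2
toy <- tibble(
  pcgroup = c(1, 1, 2),
  mz      = c(181.0707, 203.0526, 255.2319),
  rt      = c(1.20, 1.24, 5.50)
)

toy_nest <- toy %>%
  nest(features = -pcgroup) %>%                                  # one row per pcgroup
  mutate(avg_rt = map_dbl(features, ~ round(mean(.x$rt), 2))) %>% # average rt per group
  select(pcgroup, avg_rt, features)                              # nested column last

print(toy_nest)
```

The result has one row per pcgroup, with the individual features tucked away in the `features` list column.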

We are almost there, but first we need to prepare the table for interaction (this adds the + button and the View button).

peaklist_anno_nest_ready <- peaklist_browser_prep(peaklist_anno_nest, collapse_col = "features", modal_col = "anno")



Interactively browse peaklist

Now we can start the browser! You need to use a full web browser such as Chrome, not the RStudio browser.

peaklist_browser(peaklist_anno_nest_ready, collapse_col = "features", modal_col = "anno")

[Figure: Peaklist Browser]



Sources and licenses



Journal References



stanstrup/PeakBro documentation built on May 29, 2019, 9:49 a.m.