BiocStyle::markdown()
Package: r Biocpkg("MsBackendHmdb")
Authors: r packageDescription("MsBackendHmdb")[["Author"]]
Last modified: r file.info("MsBackendHmdb.Rmd")$mtime
Compiled: r date()
library(Spectra) library(BiocStyle)
The Spectra
package provides a central infrastructure for the handling of Mass
Spectrometry (MS) data. The package supports interchangeable use of different
backends to import MS data from a variety of sources (such as mzML files). The
MsBackendHmdb
package enables, with the MsBackendHmdbXml
object, import of
MS/MS spectrum data from xml files from The Human Metabolome Database
HMDB. This vignette illustrates the usage of the
MsBackendHmdb
package to enable HMDB data usage in Spectra
.
Spectral data from HMDB can be downloaded in xml format, one xml file per spectrum. In our short example we load 4 such xml files which are provided with this package. Below we first load all required packages and define the file names of the MS/MS spectra xml files.
library(Spectra) library(MsBackendHmdb) fls <- dir(system.file("xml", package = "MsBackendHmdb"), full.names = TRUE, pattern = "xml$")
MS data can be accessed and analyzed through Spectra
objects. Below we create
a Spectra
with the data from the above xml files. To this end we provide the
file names and specify to use a MsBackendHmdbXml()
backend as source to
enable data import and MsBackendDataFrame()
as backend to store/handle the
data.
sps <- Spectra(fls, source = MsBackendHmdbXml(), backend = MsBackendDataFrame())
With that we have now full access to all imported spectra variables that we list below.
spectraVariables(sps)
Besides default spectra variables, such as msLevel
, rtime
, precursorMz
(most of which are however not defined in the xml files from HMDB) we have also
additional spectra variables such as the spectrum_id
(the ID of the spectrum
in the HMDB database), compound_id
(the metabolite identifier), splash
or
instrument_type
. Below we list the instrument type for the 4 spectra.
sps$instrument_type
The last spectrum was predicted and the instrument type is thus set to NA
.
In addition we can also access the m/z and intensity values of each spectrum.
mz(sps) intensity(sps)
It is also possible to import all MS/MS spectra from HMDB into a Spectra
object. For this we have to firstly download all xml files from the
downloads page of HMDB. HMDB allows to download all files in a single archive,
unzipping that will result in a folder with a very large number of small files,
one file per spectrum. Note also that this folder will contain also NMR and
other types of spectra. The variable path
below is supposed to point to the
folder where all xml files can be found. We first list all xml files in that
folder using the pattern "ms_ms_spectrum"
to get only file names for MS/MS
spectra.
path <- "~/data/hmdb_all_spectra/" fls <- dir(path, pattern = "ms_ms_spectrum", full.names = TRUE)
With that we can now import the data and create a Spectra
object representing
the collection of all HMDB MS2 spectra. Setting nonStop = TRUE
prevents the
call to stop whenever it encounters problematic xml files (like xml files
without peaks). Note that the import of about 400,000 MS/MS spectra can take a
long time (in the range of one to several hours).
sps_hmdb <- Spectra(fls, source = MsBackendHmdbXml(), nonStop = TRUE, backend = MsBackendDataFrame())
sessionInfo()
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.