CompDb objects provide access to general (metabolite) compound
annotations along with metadata information such as the annotation's
source, date and release version. The data is stored internally in a
database (usually an SQLite database).
hasMsMsSpectra: TRUE if MS/MS spectrum data is
available in the database and FALSE otherwise.
CompDb(x, flags = SQLITE_RO)

hasMsMsSpectra(x)

src_compdb(x)

tables(x)

copyCompDb(x, y)

## S4 method for signature 'CompDb'
dbconn(x)

## S4 method for signature 'CompDb'
Spectra(object, filter, ...)

## S4 method for signature 'CompDb'
supportedFilters(object)

## S4 method for signature 'CompDb'
metadata(x, ...)

## S4 method for signature 'CompDb'
spectraVariables(object, ...)

## S4 method for signature 'CompDb'
compoundVariables(object, includeId = FALSE, ...)

## S4 method for signature 'CompDb'
compounds(
  object,
  columns = compoundVariables(object),
  filter,
  return.type = c("data.frame", "tibble"),
  ...
)

## S4 method for signature 'CompDb,Spectra'
insertSpectra(object, spectra, columns = spectraVariables(spectra), ...)

## S4 method for signature 'CompDb'
deleteSpectra(object, ids = integer(0), ...)

## S4 method for signature 'CompDb'
mass2mz(x, adduct = c("[M+H]+"), name = "formula")

## S4 method for signature 'CompDb'
insertCompound(object, compounds = data.frame(), addColumns = FALSE)

## S4 method for signature 'CompDb'
deleteCompound(object, ids = character(), recursive = FALSE, ...)
For all other methods: a `CompDb` object.
flags passed to the SQLite database connection.
For all methods: a `CompDb` object.
additional arguments. Currently not used.
CompDb objects should be created using the constructor function
CompDb, passing the name of the (SQLite) database file that contains
the compound annotation data.
See description of the respective function.
Annotations/compound information can be retrieved from a
CompDb database with the compounds and Spectra functions:
compounds extracts compound data from the
CompDb object. In contrast to
src_compdb it returns the actual data as a
data.frame (if return.type = "data.frame") or a
tibble (if return.type = "tibble"). A
compounds call will always return all
elements from the ms_compound table (unless a
filter is used).
Spectra extracts spectra from the database and returns them as a
Spectra() object from the
Spectra package. Additional annotations
requested with the
columns parameter are added as additional spectra variables.
CompDb: connect to a compound database.
compoundVariables: returns all available columns/database fields for
compounds.
copyCompDb: copies the content of a CompDb to another database.
x is expected to be either a
CompDb or a database connection
from which the data should be copied and
y a connection to the database
to which it should be copied.
dbconn: returns the connection (of type
DBIConnection) to the database.
metadata: returns general metadata of the compound database.
spectraVariables: returns all spectra variables (i.e. columns) available
in the CompDb.
src_compdb provides access to the
CompDb's database via
the functionality from the dplyr/dbplyr packages.
supportedFilters: provides an overview of the filters that can be
applied on a
CompDb object to extract only specific data from the database.
tables: returns a named
list (names being table names) with
the fields/columns from each table in the database.
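As an illustration of dbconn and tables, the following minimal sketch lists the database tables and counts the rows of the ms_compound table directly through the connection. It assumes the CompoundDb and DBI packages are installed and uses the test database shipped with CompoundDb; the variable names (con, tbls, n_cmp) are only illustrative.

```r
library(CompoundDb)

## Open the (read-only) test database distributed with CompoundDb
cdb <- CompDb(system.file("sql/CompDb.MassBank.sql", package = "CompoundDb"))

## Get the underlying DBI connection
con <- dbconn(cdb)

## List the tables through DBI; tables(cdb) reports the same tables
## together with their columns
tbls <- DBI::dbListTables(con)
tbls

## Query the compound table directly with SQL
n_cmp <- DBI::dbGetQuery(con, "select count(*) as n from ms_compound")$n
n_cmp
```

Direct SQL access like this is mainly useful for inspection; for regular data retrieval the compounds and Spectra functions are the intended interface.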
mass2mz: calculates a table of the m/z values for each compound based on
the provided set of adduct(s). Adduct definitions can be provided with
parameter adduct. See
MetaboCoreUtils::mass2mz() for more details.
Parameter name defines the database table column that should be used as
rownames of the returned
matrix. With the default
name = "formula", m/z
values are calculated for each unique formula in the CompDb.
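The underlying calculation can be tried without a database by calling MetaboCoreUtils::mass2mz() on numeric masses. The sketch below assumes the MetaboCoreUtils package is installed and uses the exact masses of caffeine and glucose from the examples of this page; the variable names are illustrative.

```r
library(MetaboCoreUtils)

## Exact (monoisotopic) masses of caffeine and glucose
masses <- c(caffeine = 194.080375584, glucose = 180.063388116)

## One row per mass, one column per adduct
mz <- mass2mz(masses, adduct = c("[M+H]+", "[M+Na]+"))
mz
```

The CompDb method performs the same conversion, but takes the masses from the database and labels the rows with the column selected by name.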
Note that inserting and deleting data requires read-write access to the
database. Databases returned by
CompDb are by default read-only. To get write access,
CompDb should be called with parameter
flags = RSQLite::SQLITE_RW.
insertCompound: adds additional compound(s) to a CompDb. The
compound(s) to be added can be specified with parameter compounds, which
is expected to be a
data.frame with columns such as "compound_id", "name", "formula" and
"exactmass" (see createCompDb() for the expected compound data). Column
"exactmass" is expected to contain numeric values, all other
columns character. Missing values are allowed for all columns except
"compound_id". An optional column
"synonyms" can be used to provide
alternative names for the compound. This column can contain a single
character per row, or a
list with multiple
character (names) per
row/compound (see examples below for details). By setting parameter
addColumns = TRUE any additional columns in
compounds will be added to
the database table. The default is
addColumns = FALSE. The function returns the
CompDb with the compounds added.
See also createCompDb() for more information and details on expected
compound data and the examples below for general usage.
deleteCompound: removes specified compounds from the CompDb database.
The IDs of the compounds that should be deleted need to be provided with
parameter ids. To include compound IDs in the output of a compounds call,
"compound_id" should be added to the
columns parameter. By
default an error is thrown if MS2 spectra are present in the database for
any of the specified compounds. To force deletion of the compounds
along with all associated MS2 spectra use
recursive = TRUE. See examples
below for details. The function returns the updated CompDb database.
insertSpectra: adds further spectra to the database.
The method always adds all the spectra specified through the spectra
parameter and does not check if they are already in the database. Note that
the input spectra must have the variable
compound_id and only spectra whose
compound_id values are also present in the database
can be added. Parameter
columns defines which spectra variables from the input
spectra should be inserted into the database. By default, all spectra
variables are added, but it is strongly suggested to select only the
(meaningful) spectra variables that should be stored in the database.
Note that a spectra variable
"compound_id" is mandatory.
If needed, the function adds additional columns to the corresponding
database table. The function returns the updated CompDb object.
deleteSpectra: deletes specified spectra from the database. The IDs of
the spectra to be deleted need to be provided with parameter ids.
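The read-write requirement and the recursive parameter of deleteCompound can be sketched as follows. The example works on a temporary copy of the test database shipped with CompoundDb so that the installed file is never modified; it assumes the CompoundDb and RSQLite packages are installed, and the names tmp, cdb_rw and id are illustrative.

```r
library(CompoundDb)

## Copy the test database to a temporary file and work on the copy
src <- system.file("sql/CompDb.MassBank.sql", package = "CompoundDb")
tmp <- tempfile(fileext = ".sqlite")
file.copy(src, tmp)

## Open the copy with read-write access (the default is read-only)
cdb_rw <- CompDb(tmp, flags = RSQLite::SQLITE_RW)

## Pick the ID of one compound in the database
ids_before <- compounds(cdb_rw, columns = c("compound_id", "name"))$compound_id
id <- as.character(ids_before[1])

## recursive = TRUE also removes any MS2 spectra linked to the compound;
## without it, deleteCompound fails if such spectra exist
cdb_rw <- deleteCompound(cdb_rw, ids = id, recursive = TRUE)

ids_after <- compounds(cdb_rw, columns = c("compound_id", "name"))$compound_id
```

Working on a copy keeps the example repeatable, because every run starts from the unmodified database file.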
Data access methods such as compounds and
Spectra allow filtering the
results using specific filter classes and expressions. Filtering uses the
concepts from Bioconductor's
AnnotationFilter package. All information
for a certain compound with the ID
"HMDB0000001" can for example be
retrieved by passing the filter expression
filter = ~ compound_id == "HMDB0000001" to the compounds function.
Use the supportedFilters function on the CompDb object to get a list of all supported filters. See also examples below or the usage vignette for details.
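A filter expression can be combined with the columns parameter to restrict both the rows and the fields that are returned. The sketch below assumes the CompoundDb package is installed and uses the test database shipped with it, which contains the compound "Mellein" used elsewhere on this page.

```r
library(CompoundDb)

cdb <- CompDb(system.file("sql/CompDb.MassBank.sql", package = "CompoundDb"))

## Overview of the filters (and the fields they act on) for this database
supportedFilters(cdb)

## Restrict the result to compounds with a specific name and return
## only selected columns
res <- compounds(cdb, filter = ~ name == "Mellein",
                 columns = c("compound_id", "name", "formula"))
res
```

The same filter expression can be passed to Spectra to retrieve only the MS/MS spectra of the matching compounds.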
createCompDb() for the function to create a SQLite compound database.
CompoundIdFilter() for filters that can be used on the CompDb database.
## We load a small compound test database based on MassBank which is
## distributed with this package.
cdb <- CompDb(system.file("sql/CompDb.MassBank.sql", package = "CompoundDb"))
cdb

## Get general metadata information from the database, such as originating
## source and version:
metadata(cdb)

## List all available compound annotations/fields
compoundVariables(cdb)

## Extract a data.frame with these annotations for all compounds
compounds(cdb)

## Note that the `compounds` function will by default always return a
## data frame of **unique** entries for the specified columns. Including
## also the `"compound_id"` to the requested columns will ensure that all
## data is returned from the tables.
compounds(cdb, columns = c("compound_id", compoundVariables(cdb)))

## Add also the synonyms (aliases) for the compounds. This will cause the
## tables compound and synonym to be joined. The elements of the compound_id
## and name are now no longer unique
res <- compounds(cdb, columns = c("name", "synonym"))
head(res)

## List all database tables and their columns
tables(cdb)

## Any of these columns can be used in the `compounds` call to retrieve
## the specific annotations. The corresponding database tables will then be
## joined together
compounds(cdb, columns = c("formula", "publication"))

## Calculating m/z values for the exact masses of unique chemical formulas
## in the database:
mass2mz(cdb, adduct = c("[M+H]+", "[M+Na]+"))

## By using `name = "compound_id"` the calculation will be performed for
## each unique compound ID instead (resulting in potentially redundant
## results)
mass2mz(cdb, adduct = c("[M+H]+", "[M+Na]+"), name = "compound_id")

## Create a Spectra object with all MS/MS spectra from the database.
library(Spectra)
sps <- Spectra(cdb)
sps

## Extract spectra for a specific compound.
sps <- Spectra(cdb, filter = ~ name == "Mellein")
sps

## List all available annotations for MS/MS spectra
spectraVariables(sps)

## Get access to the m/z values of these spectra
mz(sps)

## Plot the first spectrum
plotSpectra(sps[1])

#########
## Filtering the database
##
## Get all compounds with an exact mass between 310 and 320
res <- compounds(cdb, filter = ~ exactmass > 310 & exactmass < 320)
res

## Get all compounds that have an H14 in their formula.
res <- compounds(cdb, filter = FormulaFilter("H14", "contains"))
res

#########
## Using CompDb with the *tidyverse*
##
## Using return.type = "tibble" the result will be returned as a "tibble"
compounds(cdb, return.type = "tibble")

## Use the CompDb in a dplyr setup
library(dplyr)
src_cmp <- src_compdb(cdb)
src_cmp

## Get a tbl for the ms_compound table
cmp_tbl <- tbl(src_cmp, "ms_compound")

## Extract the id, name and inchi
cmp_tbl %>% select(compound_id, name, inchi) %>% collect()

########
## Creating an empty CompDb and sequentially adding content
##
## Create an empty CompDb and store the database in a temporary file
cdb <- emptyCompDb(tempfile())
cdb

## Define a data.frame with some compounds to add
cmp <- data.frame(
    compound_id = c(1, 2),
    name = c("Caffeine", "Glucose"),
    formula = c("C8H10N4O2", "C6H12O6"),
    exactmass = c(194.080375584, 180.063388116))

## We can also add multiple synonyms for each compound
cmp$synonyms <- list(c("Cafeina", "Koffein"), "D Glucose")
cmp

## These compounds can be added to the empty database with insertCompound
cdb <- insertCompound(cdb, compounds = cmp)
compounds(cdb)

## insertCompound would also allow to add additional columns/annotations to
## the database. Below we define a new compound adding an additional column
## hmdb_id
cmp <- data.frame(
    compound_id = 3,
    name = "Alpha-Lactose",
    formula = "C12H22O11",
    exactmass = 342.116211546,
    hmdb_id = "HMDB0000186")

## To add additional columns we need to set addColumns = TRUE
cdb <- insertCompound(cdb, compounds = cmp, addColumns = TRUE)
cdb
compounds(cdb)

######
## Deleting selected compounds from a database
##
## Compounds can be deleted with the deleteCompound function providing the
## IDs of the compounds that should be deleted. IDs of compounds in the
## database can be retrieved by adding "compound_id" to the columns parameter
## of the compounds function:
compounds(cdb, columns = c("compound_id", "name"))

## Compounds can be deleted with the deleteCompound function. Below we delete
## the compounds with the IDs "1" and "3" from the database
cdb <- deleteCompound(cdb, ids = c("1", "3"))
compounds(cdb)

## If MS2 spectra were associated with any of these two compounds, an error
## would be thrown. Setting the parameter `recursive = TRUE` in the
## `deleteCompound` call would delete the compounds along with their MS2
## spectra.