View source: R/createCompDbPackage.R
`createCompDb` creates a SQLite-based `CompDb` object/database from a compound resource provided as a `data.frame` or `tbl`. Alternatively, the name(s) of the file(s) from which the annotation should be extracted can be provided. Supported are SDF files (such as those from the Human Metabolome Database HMDB) that can be read using `compound_tbl_sdf()`, or LipidBlast files (see `compound_tbl_lipidblast()`).
An additional `data.frame` providing metadata information including the data source, date, version and organism is mandatory.
Optionally, MS/MS (MS2) spectra for compounds can also be stored in the database. Currently only MS/MS spectra from HMDB are supported. These can be downloaded in XML format from HMDB (http://www.hmdb.ca), loaded with the `msms_spectra_hmdb()` or `msms_spectra_mona()` function and passed to the function with the `msms_spectra` argument. See `msms_spectra_hmdb()` or `msms_spectra_mona()` for information on the expected columns and format.
Required columns for the `data.frame` providing the compound information (parameter `x`) are:
`compound_id`: the ID of the compound.
`compound_name`: the compound's name.
`inchi`: the InChI of the compound.
`inchikey`: the InChI key.
`formula`: the chemical formula.
`mass`: the compound's mass.
"synonyms": additional synonyms/aliases for the compound. Should be either a single character or a list of values for each compound.
"smiles": the compound's SMILES.
See e.g. `compound_tbl_sdf()` or `compound_tbl_lipidblast()` for functions creating such compound tables.
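
As a rough illustration of the table layout described above, the following sketch builds a one-row compound `data.frame` by hand and checks that the listed columns are present. It uses base R only. The compound ID `"CMP0001"` and the helper `has_required_columns()` are hypothetical names introduced for this example and are not part of CompoundDb.

```r
## Hypothetical one-row compound table; "CMP0001" is a made-up placeholder ID.
cmps_manual <- data.frame(
    compound_id = "CMP0001",
    compound_name = "Water",
    inchi = "InChI=1S/H2O/h1H2",
    inchikey = "XLYOFNOQVPJJNP-UHFFFAOYSA-N",
    formula = "H2O",
    mass = 18.010565,
    smiles = "O",
    stringsAsFactors = FALSE
)
## synonyms can hold several values per compound, so store it as a list column
cmps_manual$synonyms <- list(c("Oxidane", "Dihydrogen oxide"))

## Column names as listed in the text above
required_cols <- c("compound_id", "compound_name", "inchi", "inchikey",
                   "formula", "mass", "synonyms", "smiles")

## Hypothetical helper (assumption, not a CompoundDb function):
## TRUE if all listed columns are present, FALSE otherwise.
has_required_columns <- function(x, required = required_cols) {
    missing_cols <- setdiff(required, colnames(x))
    if (length(missing_cols) > 0)
        message("Missing columns: ", paste(missing_cols, collapse = ", "))
    length(missing_cols) == 0
}

has_required_columns(cmps_manual)
```

A check like this before calling `createCompDb` can make a missing or misspelled column easier to spot; whether the function itself reports such problems in the same way is not covered here.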
The metadata `data.frame` is expected to have two columns named "name" and "value" providing the following minimal information as key-value pairs (see `make_metadata` for a utility function to create such a `data.frame`):
"source": the source from which the data was retrieved (e.g. "HMDB").
"url": the URL from which the original data was retrieved.
"source_version": the version of the original data source (e.g. "v4").
"source_date": the date when the original data source was generated.
"organism": the organism. Should be in the form "Hsapiens" or "Mmusculus".
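
The key-value layout described above can be checked with a few lines of base R. This is a minimal sketch: `is_valid_metadata()` is a hypothetical helper written for this example, not a CompoundDb function, and it tests only for the two column names and the five keys.

```r
## Keys that the metadata table is expected to provide (from the text above)
required_keys <- c("source", "url", "source_version", "source_date",
                   "organism")

## Hypothetical helper (assumption, not part of CompoundDb): TRUE if
## `metadata` is a data.frame with columns "name" and "value" that
## contains all required keys in its "name" column.
is_valid_metadata <- function(metadata, keys = required_keys) {
    is.data.frame(metadata) &&
        all(c("name", "value") %in% colnames(metadata)) &&
        all(keys %in% metadata$name)
}

metad_ok <- data.frame(
    name = required_keys,
    value = c("HMDB", "http://www.hmdb.ca", "v4", "2017-08-27", "Hsapiens"),
    stringsAsFactors = FALSE
)

is_valid_metadata(metad_ok)        # all five keys present
is_valid_metadata(metad_ok[-5, ])  # "organism" row dropped
```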
`createCompDbPackage` creates an R data package with the data from a `CompDb` object.
`make_metadata` helps to generate a metadata `data.frame` in the format expected by the `createCompDb` function. The function returns a `data.frame`.
createCompDb(x, metadata, msms_spectra, path = ".")
createCompDbPackage(
  x,
  version,
  maintainer,
  author,
  path = ".",
  license = "Artistic-2.0"
)
make_metadata(source, url, source_version, source_date, organism)
x |
For `createCompDbPackage`: `character(1)` with the file name of the `CompDb` SQLite file (created by `createCompDb`). |
metadata |
For |
msms_spectra |
For |
path |
|
version |
For |
maintainer |
For |
author |
For |
license |
For |
source |
For |
url |
For |
source_version |
For |
source_date |
For |
organism |
For |
Metadata information is also used to create the file name for the database file. The name starts with "CompDb", followed by the organism, the data source and its version. A compound database file for HMDB version 4 with human metabolites will thus be named: "CompDb.Hsapiens.HMDB.v4".
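
The naming scheme described above can be mimicked in base R as follows. This is only an illustration of the rule as stated in the text: `compdb_file_name()` is a hypothetical helper, not the package's internal code, and any file extension the package may append is not covered.

```r
## Hypothetical helper (assumption, not part of CompoundDb): builds the
## database name from the "organism", "source" and "source_version" entries
## of a metadata data.frame with columns "name" and "value".
compdb_file_name <- function(metadata) {
    vals <- setNames(as.character(metadata$value), metadata$name)
    paste("CompDb", vals[["organism"]], vals[["source"]],
          vals[["source_version"]], sep = ".")
}

metad_name <- data.frame(
    name = c("source", "url", "source_version", "source_date", "organism"),
    value = c("HMDB", "http://www.hmdb.ca", "v4", "2017-08-27", "Hsapiens"),
    stringsAsFactors = FALSE
)

compdb_file_name(metad_name)
## [1] "CompDb.Hsapiens.HMDB.v4"
```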
A single `CompDb` database is created from multiple SDF files (e.g. for PubChem) if all the file names are provided with parameter `x`. Parallel processing is currently not enabled because SQLite does not support it yet natively.
For `createCompDb`: a `character(1)` with the database name (invisibly).
Johannes Rainer
`compound_tbl_sdf()` and `compound_tbl_lipidblast()` for functions to extract compound annotations from files in SDF format, or files from LipidBlast.
`import_mona_sdf()` to import both the compound and spectrum data from an SDF file from MoNa (MassBank of North America) in one call.
`msms_spectra_hmdb()` and `msms_spectra_mona()` for functions to import MS/MS spectrum data from XML files from HMDB or an SDF file from MoNa.
`CompDb()` for how to use a compound database.
## Read compounds for a HMDB subset
fl <- system.file("sdf/HMDB_sub.sdf.gz", package = "CompoundDb")
cmps <- compound_tbl_sdf(fl)
## Create a metadata data.frame for the compounds.
metad <- data.frame(name = c("source", "url", "source_version",
"source_date", "organism"), value = c("HMDB", "http://www.hmdb.ca",
"v4", "2017-08-27", "Hsapiens"))
## Alternatively use the make_metadata helper function
metad <- make_metadata(source = "HMDB", source_version = "v4",
source_date = "2017-08", organism = "Hsapiens",
url = "http://www.hmdb.ca")
## Create a SQLite database in the temporary folder
db_f <- createCompDb(cmps, metadata = metad, path = tempdir())
## The database can be loaded and accessed with a CompDb object
db <- CompDb(db_f)
db
## Create a database for HMDB that includes also MS/MS spectrum data
metad2 <- make_metadata(source = "HMDB_with_spectra", source_version = "v4",
source_date = "2017-08", organism = "Hsapiens",
url = "http://www.hmdb.ca")
## Import spectrum information from selected MS/MS xml files from HMDB
## that are provided in the package
xml_path <- system.file("xml", package = "CompoundDb")
spctra <- msms_spectra_hmdb(xml_path)
## Create a SQLite database in the temporary folder
db_f2 <- createCompDb(cmps, metadata = metad2, msms_spectra = spctra,
path = tempdir())
## The database can be loaded and accessed with a CompDb object
db2 <- CompDb(db_f2)
db2
## Does the database contain MS/MS spectrum data?
hasMsMsSpectra(db2)
## Create a database for a ChEBI subset providing the file name of the
## corresponding SDF file
metad <- make_metadata(source = "ChEBI_sub", source_version = "2",
source_date = NA, organism = "Hsapiens", url = "www.ebi.ac.uk/chebi")
db_f <- createCompDb(system.file("sdf/ChEBI_sub.sdf.gz",
package = "CompoundDb"), metadata = metad, path = tempdir())
db <- CompDb(db_f)
db
compounds(db)
## Connect to the database and query its tables using RSQLite
library(RSQLite)
con <- dbConnect(dbDriver("SQLite"), db_f)
dbGetQuery(con, "select * from metadata")
dbGetQuery(con, "select * from compound")
## Close the connection once the queries are done
dbDisconnect(con)
## To create a CompDb R-package we could simply use the
## createCompDbPackage function on the SQLite database file name.