knitr::opts_chunk$set( collapse = TRUE, comment = "#>", crop = NULL ## Related to https://stat.ethz.ch/pipermail/bioc-devel/2020-April/016656.html )
The UniChem database provides a publicly available REST API for
programmatic retrieval of mappings from standardized structural compound
identifiers to unique compound IDs across a range of large online
cheminformatic databases such as PubChem, ChEMBL, DrugBank and many more.
The service accepts POST requests to two different end-points:
/compound
and /connectivity
. Both endpoints accept query parameters via the
POST body in JSON format. The /compound
API returns exact matches for the
queried compound, while the /connectivity
API uses layers of the International
Chemical Identifier (InChI) of the query compound to return exact matches as
well as structurally related compounds such as isomers, salts, ionizations
and more.
[@UniChemBeta; @chambersUniChemUnifiedChemical2013]
The functions in AnnotationGx
have been designed to allow package users to
easily query UniChem resources without any pre-existing knowledge of
HTTP requests or the API specifications. In doing so we hope to provide an
R native interface for mapping between various cheminformatic databases,
accessible to anyone familar with using R functions!
library(AnnotationGx)
To see a table of database identifiers available via UniChem, you can call
the getUniChemSources
function.
By default, just the database shortname ("Name") and UniChem's ID for it ("SourceID") columns
are returned.
To return all columns, pass the all_columns = TRUE
argument
getUnichemSources()
When mapping using the queryUnichemCompound
function, these are the sources that can be used from,
and the databases to which the compound mappings will be returned.
The queryUnichemCompound
function allows you to query the UniChem Compound API
to retrieve mappings for a given compound identifier. The function takes two mandatory arguments.
The first is the compound
argument which is the compound identifier to be queried.
The second is the type
argument which is the type of compound identifier to search for.
Options are "uci", "inchi", "inchikey", and "sourceID".
The sourceID
argument is optional and is only required if the type
argument is "sourceID".
The function returns a list of:
data.table
containing the mapping to other Databases with the following headings:character
The compound identifiercharacter
The name of the databasecharacter
The long name of the databasecharacter
The UniChem Source IDcharacter
The URL of the sourcelist
of the following six mappings:character
The UniChem Identifiercharacter
The InChIKeycharacter
The InChIcharacter
The molecular formulacharacter
connection representation "1-6(10)13-8-5-3-2-4-7(8)9(11)12"character
hydrogen atom connections "2-5H,1H3,(H,11,12)"uci
(UniChem Identifier)Note: This type of query requires you to know the UniChem Identifier for the compound.
queryUnichemCompound(compound = "161671", type = "uci")
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.