| get_cid | R Documentation |
Retrieve compound IDs (CIDs) from PubChem.
get_cid(
query,
from = "name",
domain = c("compound", "substance", "assay"),
match = c("all", "first", "ask", "na"),
verbose = getOption("verbose"),
arg = NULL,
first = NULL,
...
)
| query | character; search term, one or more compounds. |
| from | character; type of input. See Details for more information. |
| domain | character; query domain, can be one of "compound", "substance", "assay". |
| match | character; how should multiple hits be handled? One of "all", "first", "ask", "na". |
| verbose | logical; should verbose output be printed to the console? |
| arg | character; optional arguments like "name_type=word" to match individual words. |
| first | deprecated. Use 'match' instead. |
| ... | currently unused. |
Valid values for the from argument depend on the
domain:
compound: "name", "smiles", "inchi",
"inchikey", "formula", "sdf", "cas" (an alias for
"xref/RN"), <xref>, <structure search>, <fast search>.
substance: "name", "sid",
<xref>, "sourceid/<source id>" or "sourceall".
assay: "aid", <assay target>.
<structure search> is assembled as "(substructure |
superstructure | similarity | identity) / (smiles
| inchi | sdf | cid)", e.g.
from = "substructure/smiles".
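The pattern above can be enumerated with base R. This is only an illustrative sketch: the variable names are made up here, and it does not show how get_cid validates its input internally.

```r
# Enumerate every "<structure search>" value described above (base R only).
# Variable names are illustrative and not part of webchem.
search_types <- c("substructure", "superstructure", "similarity", "identity")
input_types <- c("smiles", "inchi", "sdf", "cid")

# outer() pastes each search type with each input type, separated by "/".
structure_search <- as.vector(
  outer(search_types, input_types, paste, sep = "/")
)

length(structure_search)                      # 16 combinations
"substructure/smiles" %in% structure_search   # TRUE
```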
<xref> is assembled as "xref/(RegistryID |
RN | PubMedID | MMDBID | ProteinGI |
NucleotideGI | TaxonomyID | MIMID | GeneID |
ProbeID | PatentID)", e.g. from = "xref/RN" will query
by CAS RN.
<fast search> is either fastformula or it is assembled as
"(fastidentity | fastsimilarity_2d | fastsimilarity_3d |
fastsubstructure | fastsuperstructure)/(smiles |
smarts | inchi | sdf | cid)", e.g.
from = "fastidentity/smiles".
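To check a candidate value against the <fast search> pattern before sending a query, a small helper along the following lines could be used. The function is_fast_search is hypothetical and not part of webchem; it only mirrors the description above.

```r
# Hypothetical helper, not part of webchem: returns TRUE if `from` follows
# the "<fast search>" pattern described above.
is_fast_search <- function(from) {
  # "fastformula" is the only value without a "/<input type>" suffix.
  if (identical(from, "fastformula")) return(TRUE)
  parts <- strsplit(from, "/", fixed = TRUE)[[1]]
  length(parts) == 2 &&
    parts[1] %in% c("fastidentity", "fastsimilarity_2d", "fastsimilarity_3d",
                    "fastsubstructure", "fastsuperstructure") &&
    parts[2] %in% c("smiles", "smarts", "inchi", "sdf", "cid")
}

is_fast_search("fastidentity/smiles")  # TRUE
is_fast_search("fastformula")          # TRUE
is_fast_search("fastidentity/name")    # FALSE
```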
<source id> is any valid PubChem Data Source ID. When
from = "sourceid/<source id>", the query is the ID of the substance in
the depositor's database.
If from = "sourceall" the query is one or more valid PubChem
depositor names. Depositor names are not case sensitive.
Depositor names and Data Source IDs can be found at https://pubchem.ncbi.nlm.nih.gov/sources/.
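As a minimal sketch, the from value for a depositor lookup can be put together with base R paste0(). The Data Source ID and substance ID below are the ones used in the examples on this page; the variable names are illustrative only, and the get_cid call is commented out because it needs network access.

```r
# Illustrative values, taken from the examples on this page.
source_id <- "23706"         # PubChem Data Source ID
substance_id <- "VCC957895"  # ID of the substance in the depositor's database

# Assemble the `from` argument as "sourceid/<source id>".
from <- paste0("sourceid/", source_id)
print(from)

# Needs network access and the webchem package; might fail if the API is
# not available.
# get_cid(substance_id, from = from, domain = "substance")
```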
<assay target> is assembled as "target/(gi |
proteinname | geneid | genesymbol | accession)",
e.g. from = "target/geneid" will query by GeneID.
A tibble.
Please respect the Terms and Conditions of the National Library of Medicine, https://www.nlm.nih.gov/databases/download.html, the data usage policies of the National Center for Biotechnology Information, https://www.ncbi.nlm.nih.gov/home/about/policies/, https://pubchem.ncbi.nlm.nih.gov/docs/programmatic-access, and the data usage policies of the individual data sources, https://pubchem.ncbi.nlm.nih.gov/sources/.
Wang, Y., J. Xiao, T. O. Suzek, et al. 2009 PubChem: A Public Information System for Analyzing Bioactivities of Small Molecules. Nucleic Acids Research 37: 623–633.
Kim, Sunghwan, Paul A. Thiessen, Evan E. Bolton, et al. 2016 PubChem Substance and Compound Databases. Nucleic Acids Research 44(D1): D1202–D1213.
Kim, S., Thiessen, P. A., Bolton, E. E., & Bryant, S. H. (2015). PUG-SOAP and PUG-REST: web services for programmatic access to chemical information in PubChem. Nucleic acids research, gkv396.
Eduard Szöcs, Tamás Stirling, Eric R. Scott, Andreas Scharmüller, Ralf B. Schäfer (2020). webchem: An R Package to Retrieve Chemical Information from the Web. Journal of Statistical Software, 93(13). doi:10.18637/jss.v093.i13.
## Not run:
# might fail if API is not available
get_cid("Triclosan")
get_cid("Triclosan", arg = "name_type=word")
# from SMILES
get_cid("CCCC", from = "smiles")
# from InChI
get_cid("InChI=1S/CH5N/c1-2/h2H2,1H3", from = "inchi")
# from InChIKey
get_cid("BPGDAMSIGCZZLK-UHFFFAOYSA-N", from = "inchikey")
# from formula
get_cid("C26H52NO6P", from = "formula")
# from CAS RN
get_cid("56-40-6", from = "xref/rn")
# similarity
get_cid(5564, from = "similarity/cid")
get_cid("CCO", from = "similarity/smiles")
# from SID
get_cid("126534046", from = "sid", domain = "substance")
# sourceid
get_cid("VCC957895", from = "sourceid/23706", domain = "substance")
# sourceall
get_cid("Optopharma Ltd", from = "sourceall", domain = "substance")
# from AID (CIDs of substances tested in the assay)
get_cid(170004, from = "aid", domain = "assay")
# from GeneID (CIDs of substances tested on the gene)
get_cid(25086, from = "target/geneid", domain = "assay")
# multiple inputs
get_cid(c("Triclosan", "Aspirin"))
## End(Not run)