getcid: Get PubChem Compound Information

Description

The PubChem compound collection stores a variety of information for each molecule, including canonical SMILES, molecular properties, substance associations, and synonyms.

This function extracts a subset of the molecular property information for a single CID.

Usage

get.cid(cid, quiet=TRUE)

Arguments

cid

A single numeric CID

quiet

If FALSE, output is verbose

Details

The method currently queries PubChem via the PUG REST interface. Since the method processes a single CID at a time, the user can parallelize processing across CIDs. However, this is usually not recommended, at least in an unrestricted manner, since it can send a large number of simultaneous requests to PubChem.
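
One way to keep the request volume in check is to process CIDs sequentially with a short pause between calls. The following is a minimal sketch, not part of rpubchem: the helper name fetch_throttled and its fetch and delay arguments are hypothetical, and the 0.25 second default is an assumed value, so consult PubChem's usage policy for the actual limits.

```r
# Hypothetical helper (not part of rpubchem): call `fetch` on each CID in
# turn, sleeping `delay` seconds between requests.
fetch_throttled <- function(cids, fetch, delay = 0.25) {
  results <- lapply(seq_along(cids), function(i) {
    if (i > 1) Sys.sleep(delay)  # pause before every request except the first
    # A failed request yields NULL instead of aborting the whole run
    tryCatch(fetch(cids[i]), error = function(e) NULL)
  })
  names(results) <- as.character(cids)
  results
}

# Usage (requires network access and the rpubchem package):
# dat <- fetch_throttled(c(5282108, 5282148), rpubchem::get.cid)
```

Passing the fetch function as an argument keeps the helper independent of rpubchem, so it can be tried with a stub function before making real requests.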

In addition, since the data.frame for each CID may have a different set of physical properties, it is recommended to either extract the common set of columns or use something like bind_rows from the dplyr package to get a uniform data.frame when processing multiple CIDs.
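
Both approaches can be sketched in base R. The data.frames below are toy stand-ins for get.cid results, with made-up column names and values; the second option approximates the NA-filling behaviour of dplyr::bind_rows without requiring dplyr.

```r
# Toy stand-ins for get.cid() results; column names and values are made up
a <- data.frame(CID = 1, MolecularWeight = 180.16, MeltingPoint = 135)
b <- data.frame(CID = 2, MolecularWeight = 46.07)
dfs <- list(a, b)

# Option 1: keep only the columns present in every data.frame
common <- Reduce(intersect, lapply(dfs, names))
strict <- do.call(rbind, lapply(dfs, function(d) d[, common, drop = FALSE]))

# Option 2: keep all columns, filling the gaps with NA
all_cols <- unique(unlist(lapply(dfs, names)))
filled <- do.call(rbind, lapply(dfs, function(d) {
  for (col in setdiff(all_cols, names(d))) d[[col]] <- NA  # add missing columns
  d[, all_cols, drop = FALSE]                              # enforce column order
}))
```

Option 1 discards properties that are not reported for every CID, while Option 2 keeps them at the cost of NA entries, so the choice depends on whether downstream code can handle missing values.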

Value

A data.frame with at least 23 columns including the CID, IUPAC name, InChI and InChI key, canonical SMILES and a variety of molecular descriptors. In addition, a few physical properties are also included. The text from the Summary Information section of the compound page is included as an attribute of the data.frame with the name Summary.Information.
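
An attribute of this kind can be read with base R's attr function. The sketch below uses a mock data.frame with placeholder values rather than a real get.cid result, so only the access pattern is meant to carry over.

```r
# Mock object standing in for a get.cid() result; the values are placeholders
dat <- data.frame(CID = 1)
attr(dat, "Summary.Information") <- "placeholder summary text"

# Retrieve the attribute by its exact name; attr() returns NULL if it is absent
summary_text <- attr(dat, "Summary.Information", exact = TRUE)
```

When combining results for several CIDs, it may be safest to read this attribute from each data.frame first, since combining operations are not guaranteed to preserve custom attributes.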

Author(s)

Rajarshi Guha rajarshi.guha@gmail.com

See Also

get.assay, get.sid, get.sid.list

Examples

## Not run: 
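library(rpubchem)
# Fetch each CID separately, then combine into one data.frame;
# bind_rows fills columns missing for a given CID with NA
cids <- c(5282108, 5282148, 91754124)
dat <- lapply(cids, get.cid)
dat <- dplyr::bind_rows(dat)
str(dat)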

## End(Not run)

CDK-R/rpubchem documentation built on Nov. 6, 2019, 3:59 a.m.