The PubChem compound collection stores a variety of information for each molecule, including canonical SMILES, molecular properties, substance associations and synonyms.
This function will extract a subset of the molecular property information for a single CID.
A single numeric CID
The method currently queries PubChem via the PUG REST interface. Since the method processes a single CID at a time, the user can parallelize processing across CIDs. However, unrestricted parallel querying is usually not recommended.
In addition, since the data.frame for each CID may have a different set of physical properties, it is recommended to either extract the common set of columns or else use something like bind_rows from the dplyr package to get a uniform data.frame when processing multiple CIDs.
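The two recommendations above (paced, one-at-a-time processing and aligning columns across CIDs) can be illustrated with a minimal sketch. It makes no network calls: fake_lookup is a hypothetical stand-in for the function documented here, and fetch_sequentially and the delay parameter are illustrative names that are not part of any package. Only bind_rows from dplyr is taken from the text above.

```r
# Hypothetical stand-in for the real lookup: returns a one-row data.frame
# whose columns vary by CID, mimicking the varying physical properties.
fake_lookup <- function(cid) {
  df <- data.frame(CID = cid, MolecularWeight = 100 + cid)
  if (cid %% 2 == 0) df$MeltingPoint <- 50 + cid  # only some CIDs have this
  df
}

# Process CIDs one at a time, pausing `delay` seconds between calls so the
# remote service is not flooded with requests.
fetch_sequentially <- function(cids, lookup, delay = 0.25) {
  results <- vector("list", length(cids))
  for (i in seq_along(cids)) {
    results[[i]] <- lookup(cids[i])
    if (i < length(cids)) Sys.sleep(delay)
  }
  results
}

per_cid <- fetch_sequentially(c(1, 2, 3), fake_lookup, delay = 0)

# Option 1: keep only the columns shared by every result (base R).
common_cols <- Reduce(intersect, lapply(per_cid, names))
common_df <- do.call(
  rbind,
  lapply(per_cid, function(df) df[, common_cols, drop = FALSE])
)

# Option 2: keep every column and pad the gaps with NA (requires dplyr).
if (requireNamespace("dplyr", quietly = TRUE)) {
  full_df <- dplyr::bind_rows(per_cid)
}
```

In real use, fake_lookup would be replaced by the actual lookup function and delay set to a non-zero value. Option 1 discards any property that is missing for at least one CID, while Option 2 keeps every property and leaves NA where a CID lacks it.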
A data.frame with at least 23 columns, including the CID, IUPAC name, InChI and InChI key, canonical SMILES and a variety of molecular descriptors. A few physical properties are also included.
Rajarshi Guha email@example.com