get_properties: Retrieve Compound Properties from PubChem

get_properties    R Documentation

Description

This function sends a request to PubChem to retrieve compound properties based on the specified parameters.

Usage

get_properties(
  properties = NULL,
  identifier,
  namespace = "cid",
  searchtype = NULL,
  options = NULL,
  propertyMatch = list(.ignore.case = FALSE, type = "contain")
)

property_map(
  x,
  type = c("match", "contain", "start", "end", "all"),
  .ignore.case = TRUE,
  ...
)

Arguments

properties

A character vector specifying the properties to retrieve. If NULL (default), all available properties are retrieved. Properties can be specified by exact names, partial matches, or patterns, controlled by the propertyMatch argument. For a full list of properties, see the Property Table.

identifier

A vector of compound identifiers, either numeric or character. The type of identifier depends on the namespace parameter. Note: identifier must be provided; it cannot be NULL.

namespace

A character string specifying the namespace of the identifier.

Possible values include:

- cid: PubChem Compound Identifier (default)

- name: Chemical name

- smiles: SMILES string

- inchi: InChI string

- inchikey: InChIKey

- formula: Molecular formula

- Other namespaces as specified in the API documentation.

searchtype

An optional character string specifying the search type.

Possible values include:

- similarity

- substructure

- superstructure

- identity

- Other search types as specified in the API documentation.

If NULL (default), no search type is specified.

For more details, see the API documentation.

options

A list of additional options for the request.

Available options depend on the specific request and the API.

Examples include:

- For similarity searches: list(Threshold = 95)

- For substructure searches: list(MaxRecords = 100)

If NULL (default), no additional options are included.

For more details, see the Structure Search Operations section of the PUG REST API.
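
As a rough illustration of how such an options list relates to a request, the sketch below turns a named list into a URL query string, which is the form in which PUG REST accepts options such as Threshold and MaxRecords. The helper options_to_query() is hypothetical and is not part of PubChemR; how the package forwards options internally is not shown on this page.

```r
# Hypothetical helper (not a PubChemR function): convert a named list of
# options into a URL query string such as "?Threshold=95".
options_to_query <- function(options = NULL) {
  # No options: nothing is appended to the request URL
  if (is.null(options) || length(options) == 0) {
    return("")
  }
  values <- vapply(
    options,
    function(v) {
      # format() avoids scientific notation for large numbers (e.g. 1e+05)
      utils::URLencode(format(v, scientific = FALSE, trim = TRUE), reserved = TRUE)
    },
    character(1)
  )
  paste0("?", paste(paste0(names(options), "=", values), collapse = "&"))
}

options_to_query(list(Threshold = 95))
options_to_query(list(Threshold = 95, MaxRecords = 100))
options_to_query(NULL)
```

The NULL case mirrors the documented default: when no options are given, nothing is added to the request.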

propertyMatch

A list of arguments to control how properties are matched.

The list can include:

- type: The type of match. Possible values are match, contain, start, end, and all, as accepted by property_map(). Default is contain.

- .ignore.case: Logical value indicating if the match should ignore case. Default is FALSE.

- x: The properties to match (set internally; do not set manually).

Default is list(.ignore.case = FALSE, type = "contain").

x

A character vector of compound properties. The property_map function will search for each property provided here within the available properties. The search can be customized using the type argument. This argument is ignored if type = "all".

type

Defines how to search within the available properties. One of "match", "contain", "start", "end", or "all". The default is "match". See the Note section for details.

.ignore.case

A logical value. If TRUE, pattern matching is case-insensitive. This argument is ignored if type = "all" and, as the examples show, has no effect when type = "match". The default is TRUE.

...

Other arguments. These are currently unused and do not affect the return value.

Details

For more detailed information, please refer to the PubChem PUG REST API documentation.
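
For orientation, a PUG REST compound property request follows the pattern compound/<namespace>/<identifiers>/property/<properties>/<output format>. The sketch below assembles such a URL for the cid namespace only. The helper build_property_url() is hypothetical and only illustrates the URL layout; it is not how PubChemR is documented to build its requests, and it performs no network access.

```r
# Hypothetical helper (not a PubChemR function): assemble a PUG REST URL
# for a compound property request. Intended for the "cid" namespace, where
# several identifiers may be given as a comma-separated list.
build_property_url <- function(identifier, properties,
                               namespace = "cid", output = "JSON") {
  base <- "https://pubchem.ncbi.nlm.nih.gov/rest/pug"
  paste(
    base, "compound", namespace,
    paste(identifier, collapse = ","),   # e.g. "2244,2245"
    "property",
    paste(properties, collapse = ","),   # e.g. "MolecularFormula,MolecularWeight"
    output,
    sep = "/"
  )
}

build_property_url(
  identifier = c(2244, 2245),
  properties = c("MolecularFormula", "MolecularWeight")
)
```

Other namespaces, such as name or smiles, would additionally need URL encoding of the identifier, which this sketch leaves out.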

Value

An object of class "PubChemInstanceList" containing all the properties of the requested compounds.

Note

Property Map:

property_map() does not request properties from the PubChem database. It lists the compound properties that can be requested and offers several ways to search that list (see the type argument). Its output can be passed as the properties argument of get_properties(), which is convenient when the same set of properties is requested for a range of compounds. See the examples for usage.
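
To make the search types concrete, here is a minimal base R sketch of what each type value plausibly does. The function match_properties() and the short available vector are hypothetical and stand in for the package's internal property list; this is an illustration of the semantics described on this page, not the PubChemR implementation.

```r
# A small, illustrative subset of PubChem property names (assumption: the
# real list used by property_map() is longer).
available <- c("MolecularFormula", "MolecularWeight", "InChI", "InChIKey",
               "ExactMass", "MonoisotopicMass")

# Hypothetical helper (not a PubChemR function) mimicking the 'type' options.
match_properties <- function(x, available,
                             type = c("match", "contain", "start", "end", "all"),
                             ignore_case = TRUE) {
  type <- match.arg(type)
  # "all": return everything; 'x' and 'ignore_case' are not used
  if (type == "all") {
    return(available)
  }
  # "match": exact, case-sensitive comparison; 'ignore_case' is not used
  if (type == "match") {
    return(available[available %in% x])
  }
  # Remaining types: combine all search terms into one regular expression.
  # Property names are alphanumeric, so no regex escaping is done here.
  pattern <- switch(type,
    contain = x,
    start   = paste0("^", x),
    end     = paste0(x, "$")
  )
  available[grepl(paste(pattern, collapse = "|"), available,
                  ignore.case = ignore_case)]
}

match_properties("molecular", available, type = "start")
match_properties("mass", available, type = "end")
match_properties(c("molecular", "inchi"), available, type = "contain")
match_properties("InChi", available, type = "match")  # no exact match
```

Under these assumptions, "start" and "end" simply anchor the pattern, while "match" compares whole names and therefore finds nothing for the misspelled "InChi".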

Examples


# Molecular weight, molecular formula, and InChI of the compounds
props <- get_properties(
  properties = c("MolecularWeight", "MolecularFormula", "InChI"),
  identifier = c("aspirin", "ibuprofen", "caffeine"),
  namespace = "name"
)

# Properties for a selected compound
instance(props, "aspirin")
retrieve(props, .which = "aspirin", .slot = NULL)
retrieve(instance(props, "aspirin"), .slot = NULL)

# Combine properties of all compounds into a single data frame (or list)
retrieve(props, .combine.all = TRUE)

# Return selected properties
retrieve(props, .combine.all = TRUE,
  .slot = c("MolecularWeight", "MolecularFormula"))

# Return properties for the compounds in a range of CIDs
props <- get_properties(
  properties = c("mass", "molecular"),
  identifier = 2244:2255,
  namespace = "cid",
  propertyMatch = list(
    type = "contain"
  )
)

retrieve(props, .combine.all = TRUE, .to.data.frame = TRUE)

# Return all available properties of the requested compounds
props <- get_properties(
  properties = NULL,
  identifier = 2244:2245,
  namespace = "cid",
  propertyMatch = list(
    type = "all"
  )
)

retrieve(props, .combine.all = TRUE)



#### EXAMPLES FOR property_map() ####
# List all available properties:
property_map(type = "all")

# Exact match:
property_map("InChI", type = "match")
property_map("InChi", type = "match",
  .ignore.case = TRUE) # Returns no match: type = "match" ignores '.ignore.case'

# Match at the start/end:
property_map("molecular", type = "start", .ignore.case = TRUE)
property_map("mass", type = "end", .ignore.case = TRUE)

# Partial match with multiple search patterns:
property_map(c("molecular", "mass", "inchi"),
  type = "contain", .ignore.case = TRUE)


PubChemR documentation built on April 4, 2025, 2:18 a.m.