This vignette compares two ways of annotating CTRP-provided treatment ids: mapping them to PubChem CIDs and retrieving CTD information.
Although a PubChem CID uniquely identifies a compound, the PubChem API does not reliably map treatment names to CIDs, at least not for commonly misnamed treatments. Specifically, it does not correctly map all of the CTRP treatment names (n=545) to PubChem CIDs.

<!--
NOTE: As of March 27, 2025, the CTD2 database and its API are not available.
The CTD2 database is the central database where CTRP data is hosted, and it exposes an API.
Developer Note: The API calls described in their API documentation are useful, but there is also an endpoint,
GET /compound/{compoundId}
that is not documented. This endpoint is useful for mapping compound names, as
their data (i.e., CTRP) names them, to PubChem CIDs.
The functionality for this is implemented in the mapCompound2CTD
function. -->
The goal is to see which of the methods maps more compounds.
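For context, a name-based lookup against PubChem goes through its PUG REST interface. The sketch below only builds the request URL and makes no network call. `pubchem_name_to_cid_url` is a hypothetical helper introduced here for illustration; it is not an AnnotationGx function, and it is not necessarily how `mapCompound2CID` issues its requests.

```r
# Hypothetical helper (not part of AnnotationGx): build the PUG REST URL
# that asks PubChem for the CIDs matching a compound name.
pubchem_name_to_cid_url <- function(name) {
  base <- "https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/name"
  # Percent-encode the name so spaces and punctuation are safe in a URL path
  encoded <- utils::URLencode(name, reserved = TRUE)
  paste0(base, "/", encoded, "/cids/JSON")
}

pubchem_name_to_cid_url("aspirin")
pubchem_name_to_cid_url("acetylsalicylic acid")
```

A request like this typically returns no CID when the CTRP spelling of a treatment does not match any name PubChem knows, which is the failure mode examined in the rest of this vignette.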
``` {r}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

``` {r}
library(AnnotationGx)
data(CTRP_treatmentMetadata)
```
``` {r test_both}
treatment <- CTRP_treatmentMetadata[1, CTRP.treatmentid]
sprintf("CTRP treatment id : %s", treatment)
mapCompound2CID(treatment)
```
## Annotating using PubChem

``` {r run_CTRP_Pubchem, eval = FALSE}
# Map the first 10 CTRP treatment ids to PubChem CIDs
(compounds_to_cids <-
  CTRP_treatmentMetadata[
    1:10,
    AnnotationGx::mapCompound2CID(
      names = CTRP.treatmentid,
      first = TRUE
    )
  ]
)
# Names recorded in the "failed" attribute of the result
failed <- attributes(compounds_to_cids)$failed |> names()
```
``` {r Pubchem Failed, eval = FALSE}
# Keep only the metadata rows for treatments that failed the first pass
failed <- unique(CTRP_treatmentMetadata[CTRP.treatmentid %in% failed, ])
# Add a cleaned version of each treatment id
failed[, CTRP.treatmentid_CLEANED := cleanCharacterStrings(CTRP.treatmentid)]
# Retry the mapping with the cleaned ids
(failed_to_cids <-
  failed[
    ,
    AnnotationGx::mapCompound2CID(
      names = CTRP.treatmentid_CLEANED,
      first = TRUE
    )
  ]
)
failed_again <- attributes(failed_to_cids)$failed |> names()
```
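To make the clean-then-retry idea concrete, here is a rough base R stand-in for a name cleaner. `simple_clean` is a hypothetical function written for this illustration; the rules it applies (drop non-alphanumeric characters, then uppercase) are an assumption, and the actual behaviour of `cleanCharacterStrings` may differ.

```r
# Hypothetical stand-in for a name cleaner (NOT the AnnotationGx implementation).
# Assumed rules: remove every non-alphanumeric character, then uppercase.
simple_clean <- function(x) {
  x <- gsub("[^[:alnum:]]", "", x)
  toupper(x)
}

simple_clean(c("N-acetyl cysteine", "drug (free base)"))
```

Normalising names in this way collapses spelling variants such as hyphenation or spacing onto one string, which is why a second lookup with cleaned names may recover some of the treatments that failed the first pass.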
``` {r pubchemfailed again, eval = FALSE}
# Treatments recovered on the second pass, joined back to their metadata
failed_dt <- merge(
  failed_to_cids[!is.na(cids), ],
  failed,
  by.x = "name",
  by.y = "CTRP.treatmentid_CLEANED",
  all.x = FALSE
)
failed_dt$name <- NULL

# Treatments mapped on the first pass
successful_dt <- merge(
  CTRP_treatmentMetadata,
  compounds_to_cids[!is.na(cids), ],
  by.x = "CTRP.treatmentid",
  by.y = "name",
  all.x = FALSE
)

# Stack both sets into one table of mapped treatments
mapped_PubChem <- data.table::rbindlist(
  list(successful_dt, failed_dt),
  use.names = TRUE,
  fill = TRUE
)
```
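The two-pass merge above can be followed on a toy example. Every value below (treatment names, targets, CIDs) is invented for illustration, and the toy tables only mimic the column names used in this vignette; the sketch assumes the `data.table` package is installed.

```r
library(data.table)

# Toy inputs; all values are made up for illustration.
metadata    <- data.table(CTRP.treatmentid = c("drugA", "drug-B"),
                          target = c("T1", "T2"))
first_pass  <- data.table(name = c("drugA", "drug-B"), cids = c(111L, NA))
retry_meta  <- data.table(CTRP.treatmentid = "drug-B", target = "T2",
                          CTRP.treatmentid_CLEANED = "DRUGB")
second_pass <- data.table(name = "DRUGB", cids = 222L)

# First-pass successes: inner join of metadata with the non-NA mappings
successful_dt <- merge(metadata, first_pass[!is.na(cids), ],
                       by.x = "CTRP.treatmentid", by.y = "name", all.x = FALSE)

# Second-pass successes: join on the cleaned name, then drop the join key
failed_dt <- merge(second_pass[!is.na(cids), ], retry_meta,
                   by.x = "name", by.y = "CTRP.treatmentid_CLEANED",
                   all.x = FALSE)
failed_dt$name <- NULL

# Stack both result sets, matching columns by name
combined <- rbindlist(list(successful_dt, failed_dt),
                      use.names = TRUE, fill = TRUE)
combined
```

Because the join key is dropped from the second table before stacking, both tables end up with the same columns, and each treatment appears once under its original CTRP id.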