get_cheminfo | R Documentation |
This function lists information on all the chemicals within HTTK for which
there are sufficient data for the specified model and species.
By default the function returns only CAS (that is, info="CAS").
The type of information available includes chemical identifiers
("Compound", "CAS", "DTXSID"), in vitro
measurements ("Clint", "Clint.pvalue", "Funbound.plasma", "Rblood2plasma"),
and physico-chemical information ("Formula", "logMA", "logP", "MW",
"pKa_Accept", "pKa_Donor"). The argument "info" can be a single type of
information, "all" information, or a vector of specific types of information.
The argument "model" defaults to
"3compartmentss" and the argument "species" defaults to "human".
Since different models have different
requirements and not all chemicals have complete data, this function will
return different numbers of chemicals depending on the model specified. If
a chemical is not listed by get_cheminfo then either the in vitro or
physico-chemical data needed are currently missing (but could potentially
be added using add_chemtable).
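As a rough illustration of how the number of returned chemicals depends on the model, the sketch below counts the CAS numbers returned for several of the models named on this page. It assumes the httk package is installed and is skipped otherwise; the variable names `models` and `n.by.model` are illustrative only, and the counts will depend on the installed httk version.

```r
# Sketch only: requires the httk package; skipped if it is not installed.
models <- c("3compartmentss", "1compartment", "pbtk", "schmitt")
n.by.model <- NULL
if (requireNamespace("httk", quietly = TRUE)) {
  library(httk)
  # Count the CAS numbers returned for each model (human data by default).
  n.by.model <- sapply(models, function(m) {
    length(get_cheminfo(info = "CAS", model = m, suppress.messages = TRUE))
  })
  print(n.by.model)
}
```

Models with stricter data requirements would be expected to return fewer chemicals than "schmitt", which needs only logP and fraction unbound.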
get_cheminfo(
info = "CAS",
species = "Human",
fup.lod.default = 0.005,
model = "3compartmentss",
default.to.human = FALSE,
median.only = FALSE,
fup.ci.cutoff = TRUE,
clint.pvalue.threshold = 0.05,
physchem.exclude = TRUE,
class.exclude = TRUE,
suppress.messages = FALSE
)
info |
A single character string (or a vector of character strings) from "Compound", "CAS", "DTXSID", "logP", "pKa_Donor", "pKa_Accept", "MW", "Clint", "Clint.pValue", "Funbound.plasma", "Structure_Formula", or "Substance_Type". info="all" gives all information for the model and species. |
species |
Species desired (either "Rat", "Rabbit", "Dog", "Mouse", or default "Human"). |
fup.lod.default |
Default value used for the fraction unbound in plasma for chemicals where the measured value was below the limit of detection. Default value is 0.005. |
model |
Model used in calculation: 'pbtk' for the multiple compartment model, '1compartment' for the one compartment model, '3compartment' for the three compartment model, '3compartmentss' for the three compartment model without partition coefficients, or 'schmitt' for chemicals with logP and fraction unbound (used in predict_partitioning_schmitt). |
default.to.human |
Substitutes missing values with human values if TRUE (default FALSE). |
median.only |
Use median values only for fup and clint. Default is FALSE. |
fup.ci.cutoff |
Cutoff for the level of uncertainty in fup estimates. This value should lie in the interval (0, 1). Default is 'NULL' specifying no filtering. |
clint.pvalue.threshold |
Hepatic clearance is set to zero for chemicals where the in vitro clearance assay result has a p-value greater than the threshold. |
physchem.exclude |
Exclude chemicals on the basis of physico-chemical properties (currently only Henry's law constant) as specified by the relevant modelinfo_[MODEL] file (default TRUE). |
class.exclude |
Exclude chemical classes identified as outside of domain of applicability by the relevant modelinfo_[MODEL] file (default TRUE). |
suppress.messages |
Whether or not the output messages are suppressed (default FALSE). |
When default.to.human is set to TRUE, and the species-specific data,
Funbound.plasma and Clint, are missing from chem.physical_and_invitro.data,
human values are given instead.
In some cases the rapid equilibrium dialysis method (Waters et al., 2008) fails to yield detectable concentrations for the free fraction of chemical. In those cases we assume the compound is highly bound (that is, Fup approaches zero). For some calculations (for example, steady-state plasma concentration) there is precedent (Rotroff et al., 2010) for using half the average limit of detection, that is, 0.005 (this value is configurable via the argument fup.lod.default). For such chemicals we do not recommend using other models where quantities like partition coefficients must be predicted using Fup. We also do not recommend including the value 0.005 in training sets for Fup predictive models.
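The substitution described above can be sketched in base R. This is a minimal illustration, not httk's internal code: the helper `apply_fup_lod` is hypothetical, and it assumes that a non-detect is recorded as an Fup of exactly 0.

```r
# Hypothetical helper (not part of httk): replace non-detect Fup values,
# assumed here to be stored as 0, with the default fup.lod.default.
apply_fup_lod <- function(fup, fup.lod.default = 0.005) {
  is.nondetect <- !is.na(fup) & fup == 0
  fup[is.nondetect] <- fup.lod.default
  fup
}

# Made-up Fup values: one non-detect (0) and one missing value (NA).
fup.example <- c(0.25, 0, NA, 0.001)
fup.filled <- apply_fup_lod(fup.example)
print(fup.filled)
```

Missing values are left as NA, so chemicals with no measurement remain distinguishable from chemicals measured below the limit of detection.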
Note that in some cases the Funbound.plasma (fup) and the intrinsic clearance (clint) are provided as a series of numbers separated by commas. These values are the result of Bayesian analysis and characterize a distribution: the first value is the median of the distribution, while the second and third values are the lower and upper bounds of the 95% credible interval (that is, the 2.5 and 97.5 percentiles), respectively. For intrinsic clearance a fourth value indicating a p-value for a decrease is provided. Typically 4000 samples were used for the Bayesian analysis, such that a p-value of "0" is equivalent to "<0.00025". See Wambaugh et al. (2019) for more details. If argument median.only == TRUE then only the median is reported for parameters with Bayesian analysis distributions. If the 95% credible interval is larger than fup.ci.cutoff (defaults to NULL) then the Fup is treated as too uncertain and the value NA is given.
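To make the comma-separated format concrete, the following base R sketch splits such a string into named components and applies the p-value rule described for clint.pvalue.threshold. The helpers `parse_distribution` and `apply_clint_threshold`, the component names, and the example string are all hypothetical; httk's own handling may differ.

```r
# Hypothetical helper (not part of httk): split "median,lower,upper[,pvalue]"
# into a named numeric vector, padding missing components with NA.
parse_distribution <- function(x) {
  parts <- as.numeric(strsplit(x, ",", fixed = TRUE)[[1]])
  nms <- c("median", "lower95", "upper95", "pvalue")
  length(parts) <- length(nms)  # pads with NA when fewer values are given
  setNames(parts, nms)
}

# Hypothetical helper: return 0 when the p-value exceeds the threshold,
# otherwise return the median.
apply_clint_threshold <- function(dist, clint.pvalue.threshold = 0.05) {
  p <- dist[["pvalue"]]
  if (!is.na(p) && p > clint.pvalue.threshold) 0 else dist[["median"]]
}

clint.dist <- parse_distribution("12.5,8.1,17.3,0.02")  # made-up values
print(clint.dist)
print(apply_clint_threshold(clint.dist))

# With 4000 samples, the smallest resolvable non-zero p-value is 1/4000.
print(1 / 4000)
```

A string holding a single number parses to a median with NA for the remaining components, which mirrors entries that carry no distribution.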
vector/data.table |
Table (if info has multiple entries) or vector containing a column for each valid entry specified in the argument "info" and a row for each chemical with sufficient data for the model specified by argument "model". |
John Wambaugh, Robert Pearce, and Sarah E. Davidson
Rotroff, Daniel M., et al. "Incorporating human dosimetry and exposure into high-throughput in vitro toxicity screening." Toxicological Sciences 117.2 (2010): 348-358.
Waters, Nigel J., et al. "Validation of a rapid equilibrium dialysis approach for the measurement of plasma protein binding." Journal of pharmaceutical sciences 97.10 (2008): 4586-4595.
Wambaugh, John F., et al. "Assessing toxicokinetic uncertainty and variability in risk prioritization." Toxicological Sciences 172.2 (2019): 235-251.
# List all CAS numbers for which the 3compartmentss model can be run in humans:
get_cheminfo()
get_cheminfo(info=c('compound','funbound.plasma','logP'),model='pbtk')
# See all the data for humans:
get_cheminfo(info="all")
TPO.cas <- c("741-58-2", "333-41-5", "51707-55-2", "30560-19-1", "5598-13-0",
"35575-96-3", "142459-58-3", "1634-78-2", "161326-34-7", "133-07-3", "533-74-4",
"101-05-3", "330-54-1", "6153-64-6", "15299-99-7", "87-90-1", "42509-80-8",
"10265-92-6", "122-14-5", "12427-38-2", "83-79-4", "55-38-9", "2310-17-0",
"5234-68-4", "330-55-2", "3337-71-1", "6923-22-4", "23564-05-8", "101-02-0",
"140-56-7", "120-71-8", "120-12-7", "123-31-9", "91-53-2", "131807-57-3",
"68157-60-8", "5598-15-2", "115-32-2", "298-00-0", "60-51-5", "23031-36-9",
"137-26-8", "96-45-7", "16672-87-0", "709-98-8", "149877-41-8", "145701-21-9",
"7786-34-7", "54593-83-8", "23422-53-9", "56-38-2", "41198-08-7", "50-65-7",
"28434-00-6", "56-72-4", "62-73-7", "6317-18-6", "96182-53-5", "87-86-5",
"101-54-2", "121-69-7", "532-27-4", "91-59-8", "105-67-9", "90-04-0",
"134-20-3", "599-64-4", "148-24-3", "2416-94-6", "121-79-9", "527-60-6",
"99-97-8", "131-55-5", "105-87-3", "136-77-6", "1401-55-4", "1948-33-0",
"121-00-6", "92-84-2", "140-66-9", "99-71-8", "150-13-0", "80-46-6", "120-95-6",
"128-39-2", "2687-25-4", "732-11-6", "5392-40-5", "80-05-7", "135158-54-2",
"29232-93-7", "6734-80-1", "98-54-4", "97-53-0", "96-76-4", "118-71-8",
"2451-62-9", "150-68-5", "732-26-3", "99-59-2", "59-30-3", "3811-73-2",
"101-61-1", "4180-23-8", "101-80-4", "86-50-0", "2687-96-9", "108-46-3",
"95-54-5", "101-77-9", "95-80-7", "420-04-2", "60-54-8", "375-95-1", "120-80-9",
"149-30-4", "135-19-3", "88-58-4", "84-16-2", "6381-77-7", "1478-61-1",
"96-70-8", "128-04-1", "25956-17-6", "92-52-4", "1987-50-4", "563-12-2",
"298-02-2", "79902-63-9", "27955-94-8")
httk.TPO.rat.table <- subset(get_cheminfo(info="all",species="rat"),
CAS %in% TPO.cas)
httk.TPO.human.table <- subset(get_cheminfo(info="all",species="human"),
CAS %in% TPO.cas)
# create a data.frame with all the Fup values, we ask for model="schmitt" since
# that model only needs fup, we ask for "median.only" because we don't care
# about uncertainty intervals here:
fup.tab <- get_cheminfo(info="all",median.only=TRUE,model="schmitt")
# calculate the median, making sure to convert to numeric values:
median(as.numeric(fup.tab$Human.Funbound.plasma),na.rm=TRUE)
# calculate the mean:
mean(as.numeric(fup.tab$Human.Funbound.plasma),na.rm=TRUE)
# count how many non-NA values we have (should be the same as the number of
# rows in the table, but just in case we ask for non-NA values):
sum(!is.na(fup.tab$Human.Funbound.plasma))