knitr::opts_chunk$set( comment = "#>", collapse = TRUE ) run <- if (tidychem::is_installed_rdkit()) TRUE else FALSE knitr::opts_chunk$set(eval = run)
The tidychem
package offers a lightweight R interface for accessing RDKit via the RDKit Python API.
library("tidychem")
Chemical data format intro: SMI and SDF/MOL.
Reading. Parsing (Error handling example -- will be NULL
.). Writing.
mols <- "smi-multiple.smi" |> tidychem_example() |> read_smiles() # ECFP4 mols |> fp_morgan() # similarity # mols |> fp_morgan |> sim_tanimoto # matrix mols |> fp_morgan(explicit = TRUE)
2D descriptors and 3D descriptors.
The 3D follow a common workflow: 3D formance -> descriptor...
If already optimized with 3D coordinates, load them with parse_sdf
or read_sdf
directly, then compute the 3D descriptors with the vanilla option.
df <- "logd74.tsv" |> tidychem_example() |> read_tsv() y <- df$logD7.4 mols <- df$SMILES |> parse_smiles() mols # matrix of 2D/3D descriptors x <- mols |> desc_2d() x x[which(is.na(x))] <- 0
library("glmnet") cvfit <- cv.glmnet(x, y) plot(cvfit) fit <- glmnet(x, y) plot(fit) head(coef(fit, s = cvfit$lambda.min, exact = TRUE), n = 30)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.