SMILES generator based on a sequential Monte Carlo (SMC) sampler
smis |
is an initial vector of SMILES from which the generation of novel SMILES begins. |
v_engram |
is an ENgram object, created beforehand, that encapsulates the SMILES grammar
(see |
v_m |
is a positive scalar giving the order of the ENgram model used for generation. |
v_qsprpred |
is a QSPRpred object holding an already trained regression model, which is used to predict the properties (here, physico-chemical properties) of the compounds described by newly created SMILES. |
v_temp |
is a numeric vector, of length equal to the number of properties, giving the annealing (temperature) parameters of the sequential Monte Carlo sampler. |
v_decay |
is a positive scalar giving the decay rate of the temperature above (temp_{i+1} = temp_{i}^decay). |
v_ESSth |
is a positive scalar giving the effective sample size (ESS) threshold that triggers a re-sampling of the set of newly created SMILES (0.5 by default). This threshold limits the degeneracy in the set of newly created SMILES. A lower (higher) value allows more (less) degeneracy. |
gentype |
is the type of procedure used by the SMILES string generator. Use "ML" (the default) for a back-off procedure, or "KN" for a Kneser-Ney smoothing procedure. |
v_maxstock |
is the maximum number of newly created SMILES kept in stock (2000 by default). |
keeptrack |
is TRUE by default. It tracks the mean of the predicted properties, which makes it possible to plot and/or list the latest newly created SMILES during the generation process. This is very useful for tuning the annealing parameters and for visualizing how fast the sampler converges to the targeted region of the physico-chemical property space. |
smidatabase |
is a vector of known SMILES that the generated SMILES should not match. This is useful to avoid creating SMILES that are highly similar to existing and/or unwanted ones. |
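To get a feel for v_temp, v_decay and v_ESSth before running the sampler, the sketch below reproduces the decay rule quoted above (temp_{i+1} = temp_{i}^decay) and a textbook effective sample size check in plain R. The helper names anneal_schedule, ess and needs_resampling are hypothetical and not part of the package. The rule "re-sample when ESS falls below v_ESSth times the number of particles" is an assumption about how the threshold is applied, not something this page states.

```r
# Hypothetical helper (not part of the package): temperatures obtained by
# repeatedly applying temp_{i+1} = temp_i^decay, one column per property.
anneal_schedule <- function(temp, decay, niter) {
  out <- matrix(NA_real_, nrow = niter + 1, ncol = length(temp))
  out[1, ] <- temp
  for (i in seq_len(niter)) {
    out[i + 1, ] <- out[i, ]^decay
  }
  out
}

# Textbook effective sample size of a set of non-negative weights.
ess <- function(w) {
  w <- w / sum(w)   # normalize so that the weights sum to 1
  1 / sum(w^2)
}

# Assumed rule: re-sample when ESS drops below ESSth * (number of particles).
needs_resampling <- function(w, ESSth = 0.5) {
  ess(w) < ESSth * length(w)
}

# Two properties, both starting at temperature 3, five decay steps.
sched <- anneal_schedule(temp = c(3, 3), decay = 0.95, niter = 5)
print(sched)

# Equal weights: no degeneracy, so no re-sampling under the assumed rule.
print(needs_resampling(rep(1, 10)))
# One dominant particle: strong degeneracy, so re-sampling is triggered.
print(needs_resampling(c(1, rep(0, 9))))
```

With a decay below 1 and a starting temperature above 1, each step of this rule moves the temperature closer to 1, so a larger starting value or a decay closer to 1 gives a slower schedule.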
get_hiscores(nsmi = 100, exsim = 0.8)
get chemical structures with high QSPR score from SmcChem object (same as get_hiscores function)
get_smiles()
get SMILES strings from the SmcChem object (same as get_smiles function)
initialize(smis = NULL, v_engram = NULL, v_m = NULL, v_qsprpred = NULL,
v_temp = c(1, 1), v_decay = 0.95, v_ESSth = 0.5, gentype = "ML",
v_maxstock = 2000, keeptrack = TRUE, smidatabase = NULL)
Initialize the SMC chemical generator with initial SMILES strings smis, ENgram class object v_engram and QSPRpred class object v_qsprpred
smcexec(niter, nsteps = 5, preorder = 0, nview = 0)
modify chemical structures with niter SMC updates
viewstr(idx)
view 2D structures from SMILES string vector with index idx (same as viewstr function)
## Not run:
#sample data
data(qspr.data)
idx <- sample(nrow(qspr.data), 5000)
smis <- paste(qspr.data[idx,1])
y <- qspr.data[idx,c(2,5)]
#learning a pattern of chemical strings
data(trainedSMI)
data(engram_5k) #same as run => engram <- ENgram$new(trainedSMI, order=10)
#learning QSPR model
data(qsprpred_EG_5k)
#same as run => qsprpred <- QSPRpred$new(smis=smis, y=as.matrix(y), v_fpnames="graph")
#set target range
targ.min <- c(200,1.5)
targ.max <- c(350,2.5)
qsprpred_EG_5k$set_target(targ.min,targ.max)
#getting chemical strings from the Inverse-QSPR model
#v_temp takes one annealing value per property (two properties here)
smchem <- SmcChem$new(smis = rep("c1ccccc1O", 25), v_qsprpred=qsprpred_EG_5k,
v_engram=engram_5k, v_temp=c(3,3))
smchem$smcexec(niter=5, preorder=0, nview=4)
#if OpenBabel (>= 2.3.1) is installed, you can use reordering for better mixing as
#smchem$smcexec(niter=100, preorder=0.2, nview=4)
#see http://openbabel.org
#check
gensmis <- smchem$get_hiscores(nsmi=5, exsim=0.9)
pred <- qsprpred_EG_5k$qspr_predx(gensmis[,1])
## End(Not run)
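After generation, a natural follow-up is to check which candidates fall inside the target range set with targ.min and targ.max. The sketch below does this on a plain numeric matrix with one row per property and one column per SMILES. The helper in_target and the numbers in pred_mean are hypothetical illustration values; the layout of the object returned by qspr_predx is not described on this page, so extracting such a matrix from it is left as an assumption.

```r
# Hypothetical helper (not part of the package): TRUE for each column whose
# values all lie inside [targ.min, targ.max], property by property.
in_target <- function(pred, targ.min, targ.max) {
  stopifnot(nrow(pred) == length(targ.min),
            length(targ.min) == length(targ.max))
  # The bounds are recycled down each column, i.e. matched to the rows.
  inside <- pred >= targ.min & pred <= targ.max
  colSums(inside) == nrow(pred)
}

targ.min <- c(200, 1.5)
targ.max <- c(350, 2.5)

# Made-up predicted means for three candidates (rows = two properties).
pred_mean <- cbind(c(250, 2.0), c(400, 2.0), c(300, 3.0))

hits <- in_target(pred_mean, targ.min, targ.max)
print(hits)
```

Only the first made-up candidate satisfies both bounds; the second misses the range of the first property and the third misses the range of the second.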