This function builds an MLR-QSAR (multiple linear regression QSAR) model from a suitable set of compounds.
ezqsar_f(SDFfile, activityfile, propertyfield = "title",
         propertyfield_newset = "title", propertyfield_newset2 = "title",
         Nofdesc = 6, correlation = 1, partition = 0.8,
         des_sel_meth = "forward", testset = 0, newdataset = 0,
         newdataset2 = 0, activity = 0, Cutoff = 3)
SDFfile
An SDF file containing the structures of all the molecules (2D or 3D).
activityfile
A CSV file containing the activity data for the molecules. Its first column must have the header "name" and contain the molecule names exactly as they appear in the corresponding field of the SDFfile. The molecules must be in the same order in activityfile and SDFfile. Its second column must have the header "IC50" and contain the reported activities of the molecules.
propertyfield
The name of the field in the SDFfile that contains the molecule names, matching the first column of the activityfile (default is "title").
propertyfield_newset
The name of the field in the newdataset SDF file that contains the molecule names (default is "title").
propertyfield_newset2
The name of the field in the newdataset2 SDF file that contains the molecule names (default is "title").
Nofdesc
The maximum number of descriptors allowed in the developed MLR model (3-6 is recommended, default is 6).
correlation
The correlation level used to omit highly correlated descriptors (default is 1).
partition
The fraction of the data assigned to the training set (default is 0.8).
des_sel_meth
The method used for variable (descriptor) selection before MLR: "exhaustive", "backward", "forward" (default) or "seqrep".
testset
If equal to zero (default), the test set is selected according to the IC50 values and the partition parameter. Otherwise, a vector containing the row numbers of the test set can be provided here (e.g. c(5, 8, 12, 21)).
newdataset
An optional SDF file containing a new set of molecules (2D or 3D) whose activity is to be predicted by the developed QSAR model.
newdataset2
A second optional SDF file.
activity
If equal to zero (default), the reported activities are assumed to be expressed as -log IC50. Otherwise, they are treated as original IC50 values.
Cutoff
The cutoff value for reporting a descriptor value of a molecule as an outlier in the AD_outlier_train, AD_outlier_test, AD_outlier_newset and AD_outlier_newset2 tables (default is 3).
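The activityfile layout described above can be illustrated with a short base R sketch. The compound names and IC50 values are hypothetical, and the conversion assumes the raw values are given in mol/L. This is only one way to prepare such a file, not part of the package itself.

```r
# Hypothetical compound names and IC50 values, for illustration only.
activity <- data.frame(
  name = c("mol_1", "mol_2", "mol_3", "mol_4"),
  IC50 = c(2.5e-7, 1.0e-6, 4.0e-8, 7.9e-6),
  stringsAsFactors = FALSE
)

# Optional conversion to -log10(IC50), assuming the values are in mol/L.
activity_log <- transform(activity, IC50 = -log10(IC50))

# Write the file with the "name" and "IC50" headers described above.
csv_path <- file.path(tempdir(), "IC50.csv")
write.csv(activity_log, csv_path, row.names = FALSE)

# Read it back to confirm the column layout.
check <- read.csv(csv_path, stringsAsFactors = FALSE)
print(names(check))
```

Going by the description of the activity argument, a file holding log-transformed values like this one would correspond to the default activity = 0, while a file holding the raw IC50 values would need a non-zero activity.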
The function takes a structure file and an activity file and returns a cross-validated QSAR model together with the prediction results for the external test set. A table of the calculated descriptors and a plot of observed vs. predicted activity for the training and test sets are written to the working directory.
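To make the idea of a cross-validated MLR model more concrete, here is a minimal base R sketch that compares the fitted R2 with a leave-one-out Q2 on synthetic data. The descriptor names d1 and d2 and the data are made up, and the use of leave-one-out is an assumption: the text above does not say which cross-validation scheme the function applies.

```r
set.seed(1)

# Synthetic stand-in for a descriptor table: 20 compounds, 2 descriptors.
n <- 20
desc <- data.frame(d1 = rnorm(n), d2 = rnorm(n))
desc$y <- 1.5 * desc$d1 - 0.8 * desc$d2 + rnorm(n, sd = 0.3)

# Fitted R2 of the MLR model on all compounds.
fit <- lm(y ~ d1 + d2, data = desc)
r2 <- summary(fit)$r.squared

# Leave-one-out predictions: refit without compound i, then predict it.
loo_pred <- vapply(seq_len(n), function(i) {
  fit_i <- lm(y ~ d1 + d2, data = desc[-i, ])
  unname(predict(fit_i, newdata = desc[i, , drop = FALSE]))
}, numeric(1))

# Q2 = 1 - PRESS / total sum of squares around the mean activity.
press <- sum((desc$y - loo_pred)^2)
q2 <- 1 - press / sum((desc$y - mean(desc$y))^2)

print(c(R2 = r2, Q2 = q2))
```

For an ordinary least-squares fit, Q2 computed this way cannot exceed R2, so a large gap between the two is a common warning sign of overfitting.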
An object with several attributes, such as Q2, R2, R2_pred, test and newset.
file1 <- system.file("extdata", "molecules-3d.sdf", package = "ezqsar")
file2 <- system.file("extdata", "IC50.csv", package = "ezqsar")
file3 <- system.file("extdata", "newset-3d.sdf", package = "ezqsar")
model <- ezqsar_f(SDFfile = file1, activityfile = file2, newdataset = file3, testset = c(4, 6, 12, 22))
attributes(model)
print(model$Q2)
print(model$R2)
print(model$test)
print(model$R2_pred)
print(model$Tanimoto_test_sum)
print(model$AD_outlier_test)
print(model$newset)
print(model$Tanimoto_newset_sum)
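The example passes explicit row numbers through testset. As a hedged sketch, such a vector could be drawn reproducibly in base R as shown below. The compound count of 30 is hypothetical, and this random draw is not the same as the default selection, which is described as being based on the IC50 values and the partition parameter.

```r
set.seed(42)

# Hypothetical data set size; replace with the number of rows in your activity file.
n_compounds <- 30
partition <- 0.8

# Number of compounds left for the test set after the training fraction is taken.
n_test <- n_compounds - round(partition * n_compounds)

# Random, sorted row numbers without repeats.
test_rows <- sort(sample(seq_len(n_compounds), n_test))
print(test_rows)
```

The resulting vector could then be supplied as testset = test_rows in the ezqsar_f call.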