ezqsar_f: ezqsar_f

Description

This function builds a multiple linear regression (MLR) QSAR model from a suitable set of compounds.

Usage

ezqsar_f(SDFfile, activityfile, propertyfield = "title",
  propertyfield_newset = "title", propertyfield_newset2 = "title",
  Nofdesc = 6, correlation = 1, partition = 0.8,
  des_sel_meth = "forward", testset = 0, newdataset = 0,
  newdataset2 = 0, activity = 0, Cutoff = 3)

Arguments

SDFfile

An SDF file containing the structures of all the molecules (2D or 3D).

activityfile

A CSV file containing the activity data for the molecules. The first column must have the header "name" and list the molecule names exactly as they appear in the chosen field of the SDFfile (see propertyfield). The compounds must appear in the same order in activityfile and SDFfile. The second column must have the header "IC50" and contain the reported activities of the molecules.
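As an illustration of the layout described above, the following base R sketch writes and re-reads a small activity file. The compound names, activity values and file name are made up for the example and are not taken from the package data.

```r
# Hypothetical compound names and activities (here already as -log IC50),
# for illustration only.
activity <- data.frame(
  name = c("mol_1", "mol_2", "mol_3"),
  IC50 = c(6.2, 5.8, 7.1),
  stringsAsFactors = FALSE
)

# Write to a temporary CSV; the row order must match the order of the
# molecules in the SDF file.
activity_csv <- file.path(tempdir(), "IC50_example.csv")
write.csv(activity, activity_csv, row.names = FALSE)

# Read the file back to confirm the two column headers.
check <- read.csv(activity_csv, stringsAsFactors = FALSE)
print(colnames(check))
```

The resulting path could then be supplied as the activityfile argument, alongside an SDF file whose name field holds the same names in the same order.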

propertyfield

The name of the field in the SDFfile that contains the molecule names, matching the first column of the activityfile (default is "title").

propertyfield_newset

The name of the field in the newdataset SDF file that contains the molecule names (default is "title").

propertyfield_newset2

The name of the field in the newdataset2 SDF file that contains the molecule names (default is "title").

Nofdesc

The maximum number of descriptors allowed in the developed MLR model (3-6 is recommended; default is 6).

correlation

The correlation threshold used to omit highly correlated descriptors (default is 1).

partition

The fraction of the compounds assigned to the training set (default is 0.8).

des_sel_meth

The method used for variable (descriptor) selection before MLR: "exhaustive", "backward", "forward" (default) or "seqrep".

testset

If zero (default), the test set is selected according to the IC50 values and the partition parameter. Otherwise, provide a vector containing the row numbers of the test set compounds (e.g. c(5, 8, 12, 21)).
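For readers who prefer to supply their own row numbers, here is a minimal base R sketch of one way to pick test rows spread across the activity range. This is only an illustrative scheme, not the selection procedure used inside ezqsar_f, which this page does not describe; the activity values are hypothetical.

```r
# Hypothetical activities for 10 compounds (illustration only).
activity <- c(5.1, 6.3, 4.8, 7.2, 6.0, 5.5, 6.9, 7.8, 5.9, 6.6)
partition <- 0.8

# Number of compounds left for the test set.
n_test <- round(length(activity) * (1 - partition))

# Rank compounds by activity, then take evenly spaced positions in that
# ranking, skipping the two extremes so they stay in the training set.
ord <- order(activity)
pos <- round(seq(1, length(ord), length.out = n_test + 2))
pos <- pos[-c(1, length(pos))]

# Row numbers in the original order, suitable for a testset-style vector.
test_rows <- sort(ord[pos])
print(test_rows)
```

A vector built this way could be passed as testset = test_rows, in the same form as the c(5, 8, 12, 21) example.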

newdataset

An optional SDF file (2D or 3D) containing a new set of molecules whose activity is to be predicted by the developed QSAR model.

newdataset2

A second optional SDF file containing a further set of new molecules.

activity

If zero (default), the reported activities are assumed to be expressed as -log IC50; otherwise they are treated as original IC50 values.
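To make the two scales concrete, the sketch below converts raw IC50 values to -log IC50 in base R. It assumes base-10 logarithms and IC50 values in molar units; neither the units nor the exact conversion used inside ezqsar_f is stated on this page, and the numbers are hypothetical.

```r
# Hypothetical raw IC50 values, assumed to be in molar units.
ic50_molar <- c(1e-6, 5e-7, 2e-8)

# -log10 transform: more potent compounds (smaller IC50) get larger values.
pic50 <- -log10(ic50_molar)
print(round(pic50, 2))
```

Values already transformed like this correspond to the default activity = 0 case.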

Cutoff

The cutoff value for reporting a descriptor value of a molecule as an outlier in the AD_outlier_train, AD_outlier_test, AD_outlier_newset and AD_outlier_newset2 tables (default is 3).

Details

The function takes a structure file and an activity file and returns a cross-validated QSAR model together with prediction results for an external test set. A table of calculated descriptors and a plot of observed versus predicted activity for the training and test sets are generated in the working directory.

Value

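An object with several attributes, which can be listed with attributes(). The components accessed in the Examples section are Q2, R2, test, R2_pred, Tanimoto_test_sum, AD_outlier_test, newset and Tanimoto_newset_sum.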

Examples

# Example files shipped with the package
file1 <- system.file("extdata", "molecules-3d.sdf", package = "ezqsar")
file2 <- system.file("extdata", "IC50.csv", package = "ezqsar")
file3 <- system.file("extdata", "newset-3d.sdf", package = "ezqsar")

# Build the model with rows 4, 6, 12 and 22 as the test set
model <- ezqsar_f(SDFfile = file1, activityfile = file2,
                  newdataset = file3, testset = c(4, 6, 12, 22))

# List the components of the returned object
attributes(model)

# Model statistics and test set results
print(model$Q2)
print(model$R2)
print(model$test)
print(model$R2_pred)
print(model$Tanimoto_test_sum)
print(model$AD_outlier_test)

# Results for the new set of molecules
print(model$newset)
print(model$Tanimoto_newset_sum)

shamsaraj/ezqsar documentation built on Dec. 3, 2019, 4:03 p.m.