plot.chemmodlab | R Documentation |
plot.chemmodlab takes a chemmodlab object output by the ModelTrain function and creates a series of accumulation curve plots for assessing model and descriptor set performance.
## S3 method for class 'chemmodlab'
plot(
  x,
  max.select = NA,
  splits = 1:x$nsplits,
  meths = x$models,
  series = "both",
  ...
)
x |
an object of class chemmodlab |
max.select |
the maximum number of tests to plot for the
accumulation curve. If |
splits |
a numeric vector containing the indices of the splits to use to construct
accumulation curves. Default is to use all splits. |
meths |
a character vector with statistical methods implemented in
|
series |
a character vector. Which series of plots to construct. Can be one of
|
... |
other parameters to be passed through to plotting functions. |
For a binary response, the accumulation curve plots the number of assay hits identified as a function of the number of tests conducted, where testing order is determined by the predicted probability of a response being positive obtained from k-fold cross validation. Given a particular compound collection, larger accumulations are preferable.
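The binary-response curve described above can be sketched in a few lines of base R. This is only an illustration under stated assumptions, not the package's internal code: `accumulation_curve` is a hypothetical helper name, `prob` stands in for the cross-validated predicted probabilities, and ties are simply left in the order that `order()` returns them.

```r
# Hypothetical helper, not part of chemmodlab: number of hits found
# after each test when compounds are tested in decreasing order of
# predicted probability.
accumulation_curve <- function(y, prob, max.select = length(y)) {
  stopifnot(length(y) == length(prob))
  # Testing order: highest predicted probability first
  ord <- order(prob, decreasing = TRUE)
  # Keep only the first max.select tests
  n <- min(max.select, length(y))
  # Running count of hits (y is coded 0/1)
  cumsum(y[ord])[seq_len(n)]
}

# Made-up data: 6 compounds, 3 of them active
y    <- c(0, 1, 0, 1, 1, 0)
prob <- c(0.2, 0.9, 0.4, 0.8, 0.3, 0.1)
hits <- accumulation_curve(y, prob)
hits_first3 <- accumulation_curve(y, prob, max.select = 3)
```

Plotting `hits` against `seq_along(hits)` gives one accumulation curve; a method that ranks the actives earlier produces a curve that rises faster.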
The accumulation curve has also been extended to continuous responses. Assuming large positive values of a continuous response y are preferable, chemmodlab accumulates y so that \sum_{i=1}^{n} y_i is the sum of the y over the first n tests. This extension includes the binary-response accumulation curve as a special case.
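As a rough illustration of the continuous extension, the same cumulative sum can be applied to a numeric response. The values below are made up, and `pred` is an assumed stand-in for cross-validated predictions; the last lines show that a 0/1 response reduces to the hit count.

```r
# Illustrative values only; pred stands in for cross-validated predictions.
y    <- c(2.5, 0.1, 4.0, 1.2)
pred <- c(3.0, 0.5, 2.0, 1.0)

# Test in decreasing order of predicted response
ord <- order(pred, decreasing = TRUE)

# acc[n] is the sum of y over the first n tests
acc <- cumsum(y[ord])

# Special case: with a 0/1 response the same sum counts hits
y_bin   <- c(1, 0, 1, 0)
acc_bin <- cumsum(y_bin[ord])
```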
By default, we display accumulation curves up to 300 tests, not for the entire collection, to focus on the goal of finding actives as early as possible.
There are two main series of plots generated:
There is one plot per CV split and descriptor set combination. The accumulation curves for the modeling methods are compared.
There is one plot per CV split and model fit. The accumulation curves for the descriptor sets are compared.
Jacqueline Hughes-Oliver, Jeremy Ash, Atina Brooks
Modified from code originally written by William J. Welch 2001-2002
chemmodlab, ModelTrain
## Not run:
# A data set with binary response and multiple descriptor sets
data(aid364)
cml <- ModelTrain(aid364, ids = TRUE, xcol.lengths = c(24, 147),
                  des.names = c("BurdenNumbers", "Pharmacophores"))
plot(cml)
## End(Not run)
# A continuous response
cml <- ModelTrain(USArrests, nsplits = 2, nfolds = 2,
                  models = c("KNN", "Lasso", "Tree"))
plot(cml)