plot.CoreModel | R Documentation |
The method plot
visualizes the models returned by CoreModel()
function or summaries obtained by applying these models to data.
Different plots can be produced depending on the type of the model.
## S3 method for class 'CoreModel'
plot(x, trainSet,
     rfGraphType=c("attrEval", "outliers", "scaling", "prototypes", "attrEvalCluster"),
     clustering=NULL, ...)
x |
The model structure as returned by CoreModel. |
trainSet |
The data frame containing training data which produced the model |
rfGraphType |
The type of the graph to produce for random forest models. See details. |
clustering |
The clustering of the training instances used in some model types. See details. |
... |
Other options controlling graphical output passed to additional graphical functions. |
The output of the function CoreModel is visualized. Depending on the model type, different visualizations
are produced. Currently, classification trees, regression trees, and random forests are supported
(models "tree", "regTree", "rf", and "rfNear").
For classification and regression trees (models "tree" and "regTree"), the visualization produces a graph
representing the structure of the tree. This relies on the graphical capabilities of the rpart.plot package:
the internal structures of the CoreModel are converted to an rpart.object and then visualized by calling
rpart.plot with default parameters. Any additional parameters are passed on to this function. For further
control, use the getRpartModel function and call rpart.plot or plot.rpart with different parameters.
Note that rpart.plot can display only a single value in a leaf, which is not appropriate for model trees
that use, e.g., linear regression in the leaves. For these cases the function display is a better alternative.
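The two routes described above can be sketched as follows. This is a minimal illustration and not the package's internal code: it assumes the CORElearn and rpart.plot packages are installed, and the layout arguments `fallen.leaves` and `tweak` are chosen only to show that non-default rpart.plot parameters can be set after an explicit conversion.

```r
library(CORElearn)

# Classification tree: convert explicitly, then plot with custom parameters.
md <- CoreModel(Species ~ ., iris, model = "tree")
rpm <- getRpartModel(md, iris)          # conversion to an rpart object

# Illustrative (assumed) choice of layout parameters for rpart.plot.
rpart.plot::rpart.plot(rpm, fallen.leaves = FALSE, tweak = 1.1)

# Regression tree: print the tree as text instead of plotting it,
# which is the alternative suggested for trees with models in the leaves.
mdr <- CoreModel(uptake ~ ., CO2, model = "regTree")
display(mdr)

destroyModels(md)                       # clean up
destroyModels(mdr)
```

The converted object `rpm` remains usable after the CoreModel is destroyed, so it can also be passed to plot.rpart and text.rpart.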
For random forest models (models "rf" and "rfNear"), different types of visualizations can be produced depending on the
rfGraphType parameter:
"attrEval"
the attributes are evaluated with the random forest model and the importance scores are then
visualized. For details see rfAttrEval.
"attrEvalCluster"
similarly to "attrEval", the attributes are evaluated with the random forest
model and the importance scores are then visualized, but the importance scores are generated
for each cluster separately. The parameter clustering provides clustering information on
the trainSet. If the clustering parameter is set to NULL, the class values are used as
clustering information and a visualization of attribute importance for each class separately is
generated. For details see rfAttrEvalClustering.
"outliers"
the random forest proximity measure of training instances in trainSet
is visualized, and outliers can be detected for each class separately.
For details see rfProximity and rfOutliers.
"prototypes"
typical instances are found based on predicted class probabilities
and their values are visualized (see classPrototypes).
"scaling"
returns a scaling plot of training instances in a two-dimensional space using
random forest based proximity as the distance (see rfProximity
and the scaling function cmdscale).
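The quantities behind these graph types can also be computed directly with the functions referenced above. The sketch below is an assumption about how one might reproduce them by hand, not the code that plot() runs internally; in particular, passing `outProximity = FALSE` to obtain a distance matrix and feeding it to cmdscale with `k = 2` is one plausible reading of the "scaling" description.

```r
library(CORElearn)

dataset <- iris
mdRF <- CoreModel(Species ~ ., dataset, model = "rf",
                  rfNoTrees = 30, maxThreads = 1)

# Scores underlying rfGraphType = "attrEval": one importance value per attribute.
imp <- rfAttrEval(mdRF, dataset)
print(imp)

# Scores underlying rfGraphType = "outliers": one outlier value per instance.
out <- rfOutliers(mdRF, dataset)

# Hand-made counterpart of rfGraphType = "scaling":
# random forest distance matrix followed by classical multidimensional scaling.
d <- rfProximity(mdRF, outProximity = FALSE)
xy <- cmdscale(d, k = 2)
plot(xy, col = as.integer(dataset$Species),
     xlab = "Dimension 1", ylab = "Dimension 2")

destroyModels(mdRF)                     # clean up
```

Working with the raw values this way is useful when the scores need to be tabulated or combined with other plots rather than only displayed.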
The method returns no value.
John Adeyanju Alao (initial implementation) and Marko Robnik-Sikonja (integration, improvements)
Leo Breiman: Random Forests. Machine Learning Journal, 45:5-32, 2001
CoreModel, rfProximity, pam, rfClustering, rfAttrEvalClustering, rfOutliers, classPrototypes, cmdscale
# decision tree
dataset <- iris
md <- CoreModel(Species ~ ., dataset, model="tree")
plot(md, dataset) # additional parameters are passed directly to rpart.plot

# Additional visualizations can be obtained by explicit conversion to rpart.object
#rpm <- getRpartModel(md,dataset)
# and then setting graphical parameters in plot.rpart and text.rpart
#require(rpart)
# E.g., set the branch shape to 0.5 (between V-shaped and square-shouldered)
# and the length of branches to at least 5,
# and try to make the dendrogram more compact
#plot(rpm, branch=0.5, minbranch=5, compress=TRUE)
# (pretty=0) full names of attributes, numbers to 3 decimals
#text(rpm, pretty=0, digits=3)

destroyModels(md) # clean up

# regression tree
dataset <- CO2
mdr <- CoreModel(uptake ~ ., dataset, model="regTree")
plot(mdr, dataset)
destroyModels(mdr) # clean up

# random forests
dataset <- iris
mdRF <- CoreModel(Species ~ ., dataset, model="rf", rfNoTrees=30, maxThreads=1)
plot(mdRF, dataset, rfGraphType="attrEval")
plot(mdRF, dataset, rfGraphType="outliers")
plot(mdRF, dataset, rfGraphType="scaling")
plot(mdRF, dataset, rfGraphType="prototypes")
plot(mdRF, dataset, rfGraphType="attrEvalCluster", clustering=NULL)
destroyModels(mdRF) # clean up