This function obtains a Critical Difference (CD) diagram for the post-hoc Nemenyi test along the lines defined by Demsar (2006). These diagrams visualize the statistical significance of the observed paired differences between a set of workflows on a set of predictive tasks. They allow us to compare all workflows against each other on this set of tasks and to check the results of all these paired comparisons.
A list resulting from a call to
The metric for which the CD diagram will be obtained (defaults to the first metric of the comparison).
Critical Difference (CD) diagrams are succinct visualizations of the results of a Nemenyi post-hoc test, which is designed to check the statistical significance of the differences in average rank of a set of workflows on a set of predictive tasks.
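The average ranks that the test compares can be illustrated with a minimal sketch. The score matrix, the workflow names and the task names below are hypothetical and made up for illustration; this is not the code used internally by the package.

```r
## Hypothetical error rates: rows are tasks, columns are workflows
## (values are made up purely for illustration)
scores <- matrix(c(0.10, 0.12, 0.20,
                   0.30, 0.25, 0.35,
                   0.05, 0.07, 0.06,
                   0.15, 0.18, 0.22),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(paste0("task", 1:4),
                                 c("wfA", "wfB", "wfC")))

## Rank the workflows within each task
## (1 = lowest error; tied workflows share the mean of their ranks)
ranks <- t(apply(scores, 1, rank))

## Average rank of each workflow across all tasks
avgRanks <- colMeans(ranks)
avgRanks
```

For a metric where larger values are better (for example accuracy), the scores would need to be negated before ranking so that rank 1 still denotes the best workflow.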
In the resulting graph each workflow is represented by a colored
line. The position on the X axis where each line ends represents the
average rank of the respective workflow across all tasks. The null
hypothesis is that the average ranks of each pair of workflows do not
differ with statistical significance (at some confidence level defined
in the call to pairedComparisons that creates the object used to
obtain these graphs). Horizontal lines connect the lines of the
workflows for which we cannot exclude the hypothesis that their average
ranks are equal. Any pair of workflows whose lines are not connected by a
horizontal line can be seen as having average ranks that are different
with statistical significance. On top of the graph a horizontal line is
shown with the required difference between the average ranks (known as
the critical difference) for a pair of workflows to be considered
significantly different.
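As a rough illustration of how the critical difference can be derived, the sketch below follows the formula given by Demsar (2006), CD = q_alpha * sqrt(k(k+1)/(6N)), where k is the number of workflows and N the number of tasks. The helper `criticalDifference` and the average ranks are hypothetical names and values introduced here for illustration; the package may compute the value differently.

```r
## Critical difference for the Nemenyi test, following Demsar (2006):
##   CD = q_alpha * sqrt(k * (k + 1) / (6 * N))
## k = number of workflows, N = number of tasks
## (criticalDifference is a hypothetical helper, not a package function)
criticalDifference <- function(k, N, alpha = 0.05) {
  ## q_alpha: Studentized range quantile with infinite degrees of
  ## freedom, divided by sqrt(2)
  qAlpha <- qtukey(1 - alpha, nmeans = k, df = Inf) / sqrt(2)
  qAlpha * sqrt(k * (k + 1) / (6 * N))
}

## Hypothetical average ranks of 3 workflows over 10 tasks
avgRanks <- c(wfA = 1.2, wfB = 2.1, wfC = 2.7)
cd <- criticalDifference(k = length(avgRanks), N = 10)

## Pairwise absolute differences in average rank;
## TRUE marks pairs whose difference exceeds the critical difference
rankDiffs <- abs(outer(avgRanks, avgRanks, "-"))
significant <- rankDiffs > cd
significant
```

In a CD diagram, the pairs marked FALSE off the diagonal would be the ones joined by a horizontal line.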
Nothing, the graph is drawn on the current device.
Luis Torgo email@example.com
Demsar, J. (2006) Statistical Comparisons of Classifiers over Multiple Data Sets. Journal of Machine Learning Research, 7, 1-30.
Torgo, L. (2014) An Infra-Structure for Performance Estimation and Experimental Comparison of Predictive Models in R. arXiv:1412.0436 [cs.MS] http://arxiv.org/abs/1412.0436
## Not run:
## Estimating error rate and accuracy for 8 SVM variants
## on three data sets, using one repetition of 10-fold CV
library(performanceEstimation)
library(e1071)
data(iris)
data(Satellite,package="mlbench")
data(LetterRecognition,package="mlbench")

## running the estimation experiment
res <- performanceEstimation(
  c(PredTask(Species ~ .,iris),
    PredTask(classes ~ .,Satellite,"sat"),
    PredTask(lettr ~ .,LetterRecognition,"letter")),
  workflowVariants(learner="svm",
                   learner.pars=list(cost=1:4,gamma=c(0.1,0.01))),
  EstimationTask(metrics=c("err","acc"),method=CV()))

## checking the top performers
topPerformers(res)

## now let us assume that we will choose "svm.v2" as our baseline
## carry out the paired comparisons
pres <- pairedComparisons(res,"svm.v2")

## obtaining a CD diagram comparing all workflows against
## each other
CDdiagram.Nemenyi(pres,metric="err")

## End(Not run)