topPerformers: Obtain the best scores from a performance estimation...


Description

This function can be used to obtain the names of the workflows that obtained the best scores (the top performers) on an experimental comparison. This information will be shown for each of the evaluation metrics involved in the comparison and also for all predictive tasks that were used.

Usage

topPerformers(compRes,
              maxs=rep(FALSE,dim(compRes[[1]][[1]]@iterationsScores)[2]),
              stat="avg",digs=3)

Arguments

compRes

A ComparisonResults object with the results of your experimental comparison.

maxs

A vector of booleans with as many elements as there are metrics estimated in the experimental comparison. A TRUE value means the respective metric is to be maximized, while FALSE means it is to be minimized. Defaults to all FALSE values, i.e. all metrics are to be minimized.

stat

The statistic to be used to select the top performers. The options are the statistics produced by the function summary applied to objects of class ComparisonResults, i.e. "avg", "std", "med", "iqr", "min", "max" or "invalid" (defaults to "avg").

digs

The number of digits (defaults to 3) used in the scores column of the results.
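For instance, to select the top performers by their median score, printed with four digits (a minimal sketch; results is assumed to be a ComparisonResults object such as the one built in the Examples section):

## pick the winners by the median score across iterations, shown with 4 digits
topPerformers(results, stat="med", digs=4)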

Details

This is a utility function to check which were the top performers in a comparative experiment, for each data set and each evaluation metric. The notion of best performance depends on the type of evaluation metric, hence the need for the second argument: some evaluation statistics are to be maximized (e.g. accuracy), while others are to be minimized (e.g. mean squared error). If your experiment mixes both types, use the maxs parameter to tell the function which metrics are to be maximized and which minimized, as in the sketch below.
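For example, suppose an experiment estimated the metrics "mse" (to be minimized) and "cor" (to be maximized), in that order. A minimal sketch, assuming res holds the resulting ComparisonResults object:

## 1st metric ("mse") is to be minimized, 2nd ("cor") to be maximized
topPerformers(res, maxs=c(FALSE, TRUE))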

Value

The function returns a list with named components, one per predictive task used in the experimental comparison. Each component is a data.frame whose rows correspond to the estimated evaluation metrics. For each metric you get the name of the top performing workflow (1st column of the data frame) and its score on that metric (2nd column).
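As an illustration, the snippet below (assuming the results object built in the Examples section) navigates the returned structure:

tp <- topPerformers(results)
names(tp)     # the names of the predictive tasks
tp[[1]]       # data frame of top performers for the first task
tp[[1]][, 1]  # winning workflow names, one row per metric
tp[[1]][, 2]  # the corresponding scores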

Author(s)

Luis Torgo ltorgo@dcc.fc.up.pt

References

Torgo, L. (2014) An Infra-Structure for Performance Estimation and Experimental Comparison of Predictive Models in R. arXiv:1412.0436 [cs.MS] http://arxiv.org/abs/1412.0436

See Also

performanceEstimation, topPerformer, rankWorkflows, metricsSummary

Examples

## Not run: 
## Estimating several evaluation metrics on different variants of an
## SVM, on two data sets, using 2 repetitions of 5-fold CV

library(performanceEstimation)
library(e1071)

data(swiss)
data(mtcars)

## run the experimental comparison
results <- performanceEstimation(
               c(PredTask(Infant.Mortality ~ ., swiss),
                 PredTask(mpg ~ ., mtcars)),
               c(workflowVariants(learner="svm",
                                  learner.pars=list(cost=c(1,5),
                                                    gamma=c(0.1,0.01)))),
               EstimationTask(metrics=c("mse","mae"),
                              method=CV(nReps=2,nFolds=5)))
## get the top performers for each task and evaluation metric
topPerformers(results)

## End(Not run)
