classifyPairs: Classification model for pairs of algorithms
In llama: Leveraging Learning to Automatically Manage Algorithms

Description Usage Arguments Details Value Author(s) References See Also Examples

Build a classification model for each pair of algorithms that predicts which one is better based on the features of the problem. Predictions are aggregated to determine the best overall algorithm.

1
2
3

classifyPairs(classifier = NULL, data = NULL,
    pre = function(x, y=NULL) { list(features=x) }, combine = NULL,
    save.models = NA, use.weights = TRUE)

`classifier`	the mlr classifier to use. See examples.
`data`	the data to use with training and test sets. The structure returned by one of the partitioning functions.
`pre`	a function to preprocess the data. Currently only `normalize`. Optional. Does nothing by default.
`combine`	The classifier function to predict the overall best algorithm given the predictions for pairs of algorithms. Optional. By default, the overall best algorithm is determined by majority vote.
`save.models`	Whether to serialize and save the models trained during evaluation of the model. If not `NA`, will be used as a prefix for the file name.
`use.weights`	Whether to use instance weights if supported. Default `TRUE`.

classifyPairs takes the training and test sets in data and processes it using pre (if supplied). classifier is called to induce a classifier for each pair of algorithms to predict which one is better. If combine is not supplied, the best overall algorithm is determined by majority vote. If it is supplied, it is assumed to be a classifier with the same properties as the other one. This classifier is passed the original features and the predictions for each pair of algorithms.

Which algorithm is better of a pair is determined by comparing their performance scores. Whether a lower performance number is better or not is determined by what was specified when the LLAMA data frame was created.

The evaluation across the training and test sets will be parallelized automatically if a suitable backend for parallel computation is loaded. The parallelMap level is "llama.fold".

If the given classifier supports case weights and use.weights is TRUE, the performance difference between the best and the worst algorithm is passed as a weight for each instance.

If all predictions of an underlying machine learning model are NA, it will count as 0 towards the score.

Training this model can take a very long time. Given n algorithms, choose(n, 2) models are trained and evaluated. This is significantly slower than the other approaches that train a single model or one for each algorithm.

If save.models is not NA, the models trained during evaluation are serialized into files. Each file contains a list with members model (the mlr model), train.data (the mlr task with the training data), and test.data (the data frame with the test data used to make predictions). The file name starts with save.models, followed by the ID of the machine learning model, followed by "combined" if the model combines predictions of other models, followed by the number of the fold. Each model for each fold is saved in a different file.

`predictions`	a data frame with the predictions for each instance and test set. The columns of the data frame are the instance ID columns (as determined by `input`), the algorithm, the score of the algorithm, and the iteration (e.g. the number of the fold for cross-validation). More than one prediction may be made for each instance and iteration. The score corresponds to the number of times the respective algorithm was predicted to be better. If stacking is used, only the best algorithm for each algorithm-instance pair is predicted with a score of 1.
`predictor`	a function that encapsulates the classifier learned on the entire data set. Can be called with data for the same features with the same feature names as the training data to obtain predictions in the same format as the `predictions` member.
`models`	the models for each pair of algorithms trained on the entire data set. This is meant for debugging/inspection purposes and does not include any models used to combine predictions of individual models.

Lars Kotthoff

Xu, L., Hutter, F., Hoos, H. H., Leyton-Brown, K. (2011) Hydra-MIP: Automated Algorithm Configuration and Selection for Mixed Integer Programming. RCRA Workshop on Experimental Evaluation of Algorithms for Solving Problems with Combinatorial Explosion, 16–30.

classify, cluster, regression, regressionPairs

if(Sys.getenv("RUN_EXPENSIVE") == "true") {
data(satsolvers)
folds = cvFolds(satsolvers)

res = classifyPairs(classifier=makeLearner("classif.J48"), data=folds)
# the total number of successes
sum(successes(folds, res))
# predictions on the entire data set
res$predictor(satsolvers$data[satsolvers$features])

# use probabilities instead of labels
res = classifyPairs(classifier=makeLearner("classif.randomForest",
                                predict.type = "prob"), data=folds)

# combine predictions using J48 induced classifier instead of majority vote
res = classifyPairs(classifier=makeLearner("classif.J48"),
                    data=folds,
                    combine=makeLearner("classif.J48"))
}