performance: Performance estimation
In fdm2id: Data Mining and R Programming for Beginners


Description

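Estimate the performance of classification or regression methods using a bootstrap, cross-validation, leave-one-out, holdout or training-set protocol. Results can be reported as evaluation scores (e.g. accuracy), confusion matrices, ROC curves, and so on.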

Usage

performance(
  methods,
  train.x,
  train.y,
  test.x = NULL,
  test.y = NULL,
  train.size = round(0.7 * nrow(train.x)),
  type = c("evaluation", "confusion", "roc", "cost", "scatter", "avsp"),
  protocol = c("bootstrap", "crossvalidation", "loocv", "holdout", "train"),
  eval = ifelse(is.factor(train.y), "accuracy", "r2"),
  nruns = 10,
  nfolds = 10,
  new = TRUE,
  lty = 1,
  seed = NULL,
  methodparameters = NULL,
  names = NULL,
  ...
)

Arguments

`methods`	The classification or regression methods to be evaluated, given as a function, a list of functions, or a vector of method names.
`train.x`	The dataset (description/predictors), a `matrix` or `data.frame`.
`train.y`	The target (class labels or numeric values), a `factor` or `vector`.
`test.x`	The test dataset (description/predictors), a `matrix` or `data.frame`.
`test.y`	The (test) target (class labels or numeric values), a `factor` or `vector`.
`train.size`	The size of the training set (holdout estimation).
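`type`	The type of evaluation: one of `"evaluation"`, `"confusion"` (confusion matrix), `"roc"` (ROC curve), `"cost"` (cost curve), `"scatter"` or `"avsp"` (actual vs. predicted).
`protocol`	The evaluation protocol: one of `"bootstrap"`, `"crossvalidation"`, `"loocv"`, `"holdout"` or `"train"`.
`eval`	The evaluation criteria, given by name (e.g. `"accuracy"` or `"kappa"`). Defaults to `"accuracy"` when `train.y` is a factor and to `"r2"` otherwise.
`nruns`	The number of bootstrap runs.
`nfolds`	The number of folds (cross-validation estimation).
`new`	A logical value indicating whether a new plot should be created or not (cost curves or ROC curves).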
`lty`	The line type (and color) specified as an integer (cost curves or ROC curves).
`seed`	A seed for random number generation (useful for testing different methods with the same bootstrap samples).
`methodparameters`	Method parameters (if `NULL`, tuning is done by cross-validation).
`names`	Method names.
`...`	Other specific parameters for the learning method.
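
To make the roles of `train.size` and `seed` more concrete, here is a minimal base-R sketch of a holdout estimate. It is not fdm2id's internal code: the nearest-centroid classifier and the helper name `predict_centroid` are hypothetical stand-ins chosen only so that the example runs without extra packages.

```r
# Base-R sketch of a holdout estimate; an illustration, not fdm2id's implementation.
data(iris)
train.x <- iris[, -5]
train.y <- iris[, 5]

# Same default as in the Usage section: 70% of the rows are used for training.
train.size <- round(0.7 * nrow(train.x))

# Fixing the seed makes the random split reproducible across methods.
set.seed(0)
train.idx <- sample(nrow(train.x), train.size)
test.idx <- setdiff(seq_len(nrow(train.x)), train.idx)

# Hypothetical stand-in classifier: one centroid (mean vector) per class.
centroids <- sapply(train.x[train.idx, ], function(col) {
  tapply(col, train.y[train.idx], mean)
})

predict_centroid <- function(x) {
  x <- as.matrix(x)
  # Squared Euclidean distance from every row to every class centroid.
  d <- sapply(rownames(centroids), function(k) {
    rowSums(sweep(x, 2, centroids[k, ])^2)
  })
  # Assign each row to the class of its closest centroid.
  factor(rownames(centroids)[apply(d, 1, which.min)], levels = levels(train.y))
}

# Score the held-out rows only.
pred <- predict_centroid(train.x[test.idx, ])
accuracy <- mean(pred == train.y[test.idx])
print(accuracy)
```

Reusing the same seed for several methods means they are all scored on the same held-out rows, which is the use case the `seed` argument describes.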

Value

The evaluation of the predictions (numeric value).
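
As an illustration of what such a numeric value can look like, the sketch below computes the two default criteria named in the Usage section, accuracy and R², in base R. The helper names `accuracy` and `r2` are hypothetical, and the R² formula used here (one minus the ratio of residual to total sum of squares) is an assumption; fdm2id may define its criteria differently.

```r
# Hypothetical helpers; fdm2id's own definitions may differ.
accuracy <- function(actual, predicted) mean(actual == predicted)

r2 <- function(actual, predicted) {
  ss.res <- sum((actual - predicted)^2)       # residual sum of squares
  ss.tot <- sum((actual - mean(actual))^2)    # total sum of squares
  1 - ss.res / ss.tot
}

# Classification: 3 of the 4 predicted labels match the actual ones.
actual.class <- factor(c("a", "b", "a", "b"))
predicted.class <- factor(c("a", "b", "b", "b"), levels = levels(actual.class))
acc <- accuracy(actual.class, predicted.class)
print(acc)  # 0.75

# Regression: fitted values of a least-squares model on the trees data,
# with Volume (column 3) as the target.
data(trees)
fit <- lm(Volume ~ Girth + Height, data = trees)
r2.value <- r2(trees$Volume, fitted(fit))
print(r2.value)
```

Because the regression score here is computed on the data used for fitting, it corresponds to a training-set estimate rather than a bootstrap or cross-validation estimate.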

Examples

## Not run: 
require ("datasets")
data (iris)
# One method, one evaluation criterion, bootstrap estimation
performance (NB, iris [, -5], iris [, 5], seed = 0)
# One method, two evaluation criteria, train set estimation
performance (NB, iris [, -5], iris [, 5], eval = c ("accuracy", "kappa"),
             protocol = "train", seed = 0)
# Three methods, ROC curves, LOOCV estimation
performance (c (NB, LDA, LR), linsep [, -3], linsep [, 3], type = "roc",
             protocol = "loocv", seed = 0)
# List of methods in a variable, confusion matrix, holdout estimation
classif = c (NB, LDA, LR)
performance (classif, iris [, -5], iris [, 5], type = "confusion",
             protocol = "holdout", seed = 0, names = c ("NB", "LDA", "LR"))
# List of strings (method names), scatterplot evaluation, crossvalidation estimation
classif = c ("NB", "LDA", "LR")
performance (classif, iris [, -5], iris [, 5], type = "scatter",
             protocol = "crossvalidation", seed = 0)
# Actual vs. predicted
data (trees)
performance (LINREG, trees [, -3], trees [, 3], type = "avsp")

## End(Not run)

fdm2id documentation built on July 9, 2023, 6:05 p.m.