optimalCutoff

Description

Compute the optimal probability cutoff score, based on a user-defined objective.

Usage

optimalCutoff(actuals, predictedScores, optimiseFor = "misclasserror",
  returnDiagnostics = FALSE)

Arguments

actuals

The actual binary flags for the response variable. A numeric vector containing only the values 1 and 0, where 1 represents the 'Good' or 'Events' while 0 represents 'Bad' or 'Non-Events'.

predictedScores

The predicted probability scores for each observation. If your classification model gives 1/0 predictions, convert them to a numeric vector of 1's and 0's.

optimiseFor

The criterion for which the probability cutoff score is optimised. Can take any one of the following values: "Ones", "Zeros", "Both" or "misclasserror" (default). If "Ones" is used, the cutoff is chosen to maximise detection of ones; "Zeros" correspondingly targets detection of zeros. If "Both" is specified, the probability cutoff that gives the maximum Youden's Index is chosen. If "misclasserror" is specified, the probability cutoff that gives the minimum misclassification error is chosen.

returnDiagnostics

If TRUE, returns additional diagnostics such as 'sensitivityTable', 'misclassificationError', 'TPR', 'FPR' and 'specificity' for the chosen cutoff.

Details

Compute the optimal probability cutoff score for a given set of actuals and predicted probability scores, based on a user-defined objective, which is specified by optimiseFor = "Ones", "Zeros", "Both" or "misclasserror" (default).
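
To make the "Both" and "misclasserror" objectives concrete, here is a minimal base R sketch that sweeps a grid of candidate cutoffs over toy data and picks the best one under each criterion. It is not the package's implementation: the helper names youden_at and misclass_at, the toy data, the cutoff grid, and the rule that scores at or above the cutoff are classed as 1 are all assumptions made for illustration.

```r
# Toy data, made up for illustration only.
actuals <- c(1, 1, 1, 0, 1, 1, 0, 0)
scores  <- c(0.92, 0.81, 0.73, 0.64, 0.58, 0.38, 0.22, 0.12)

# Hypothetical helper: Youden's Index (TPR + specificity - 1) at one cutoff.
# Assumption: a score at or above the cutoff is classed as 1.
youden_at <- function(cutoff, actuals, scores) {
  predicted <- as.numeric(scores >= cutoff)
  tpr  <- sum(predicted == 1 & actuals == 1) / sum(actuals == 1)
  spec <- sum(predicted == 0 & actuals == 0) / sum(actuals == 0)
  tpr + spec - 1
}

# Hypothetical helper: share of observations classified incorrectly.
misclass_at <- function(cutoff, actuals, scores) {
  mean(as.numeric(scores >= cutoff) != actuals)
}

# Candidate cutoffs (an arbitrary grid chosen for this sketch).
cutoffs  <- seq(0.05, 0.95, by = 0.05)
youden   <- sapply(cutoffs, youden_at, actuals = actuals, scores = scores)
misclass <- sapply(cutoffs, misclass_at, actuals = actuals, scores = scores)

# which.max / which.min return the first position among ties.
best_youden   <- cutoffs[which.max(youden)]   # analogue of optimiseFor = "Both"
best_misclass <- cutoffs[which.min(misclass)] # analogue of optimiseFor = "misclasserror"
```

On this toy data both criteria select the same cutoff, 0.25, but in general they need not agree, since Youden's Index weights the two classes equally while misclassification error weights every observation equally.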

Value

The optimal probability score cutoff that maximises a given criterion. If 'returnDiagnostics' is TRUE, then the following items are returned in a list:

  • optimalCutoff The optimal probability score cutoff that maximises a given criterion.

  • sensitivityTable The data frame that shows the TPR, FPR, Youden's Index and Specificity for various values of probability cutoff scores.

  • misclassificationError The percentage misclassification error for the given actuals and probability scores.

  • TPR The 'True Positive Rate' (a.k.a. 'sensitivity') for the chosen probability cutoff score.

  • FPR The 'False Positive Rate' (a.k.a. '1 - specificity') for the chosen probability cutoff score.

  • Specificity The specificity of the given actuals and probability scores, i.e. the number of observations without the event that are also predicted not to have the event, divided by the number of observations without the event.
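
As a rough illustration of how the diagnostics listed above relate to each other, the following base R sketch computes them by hand at a single fixed cutoff. The toy data, the cutoff of 0.5, the variable names, and the rule that scores at or above the cutoff are classed as 1 are assumptions for this example; the package may compute or scale these quantities differently.

```r
# Toy data, made up for illustration only.
actuals   <- c(1, 1, 1, 0, 1, 1, 0, 0)
scores    <- c(0.92, 0.81, 0.73, 0.64, 0.58, 0.38, 0.22, 0.12)
cutoff    <- 0.5                            # arbitrary cutoff for this sketch
predicted <- as.numeric(scores >= cutoff)   # assumption: >= cutoff means 1

# Confusion-matrix counts.
tp <- sum(predicted == 1 & actuals == 1)  # events predicted as events
fp <- sum(predicted == 1 & actuals == 0)  # non-events predicted as events
tn <- sum(predicted == 0 & actuals == 0)  # non-events predicted as non-events
fn <- sum(predicted == 0 & actuals == 1)  # events predicted as non-events

tpr         <- tp / (tp + fn)                # sensitivity
fpr         <- fp / (fp + tn)                # equals 1 - specificity
specificity <- tn / (tn + fp)
misclass    <- (fp + fn) / length(actuals)   # as a proportion, not a percentage
youden      <- tpr + specificity - 1
```

Because FPR and specificity share the same denominator (the number of non-events), they always sum to 1.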

Author(s)

Selva Prabhakaran selva86@gmail.com

Examples

library(InformationValue)  # package that provides optimalCutoff and ActualsAndScores
data('ActualsAndScores')
optimalCutoff(actuals = ActualsAndScores$Actuals,
  predictedScores = ActualsAndScores$PredictedScores,
  optimiseFor = "Both", returnDiagnostics = TRUE)