superSumFun: Custom two-class summary function

View source: R/train_summary.R

superSumFun R Documentation

Custom two-class summary function

Description

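Summary function that computes performance metrics for two-class models when running train. It follows the (data, lev, model) signature that caret expects of the summaryFunction argument of trainControl.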

Usage

superSumFun(data, lev = NULL, model = NULL)

Arguments

data

a data frame with columns obs and pred for the observed and predicted outcomes, and columns with predicted probabilities for each outcome class. See the classProbs argument to trainControl.

lev

a character vector of factor levels for the response. The first element is passed to confusionMatrix as its positive argument.

model

a character string for the model name (as taken from the method argument of train).

pred

A vector of numeric data (could be a factor)

obs

A vector of numeric data (could be a factor)

Details

The following metrics are returned as a named numeric vector:

  • Accuracy: (TP+TN)/N

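  • AccuracyNull: No-information rate, i.e., the proportion of the most frequent observed class (equal to the prevalence of the "positive" class only when that class is the majority)

  • AccuracyPValue: p-value of a one-sided test that Accuracy is greater than AccuracyNull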

  • Balanced Accuracy: (Sensitivity+Specificity)/2

  • Precision: TP/(TP+FP) ('How many instances labeled positive are correctly classified?')

  • Recall: TP/(TP+FN) ('How many of truly positive instances are labeled correctly?')

  • Sensitivity: TP/(TP+FN) = Recall (true-positive rate)

  • Specificity: TN/(TN+FP) (Recall of the negative class, true-negative rate)

  • Kappa: Cohen's Kappa (agreement between predicted and observed classes, corrected for chance agreement)

  • logLoss: negative log-likelihood of the binomial distribution

  • AUC: Area under the Receiver Operating Characteristic (ROC) curve

  • PR-AUC: Area under the Precision-Recall curve

  • F0.5: F-measure (see Note) with β = 0.5 (twice as much weight on Precision as on Recall)

  • F1: F-measure (see Note) with β = 1 (Precision and Recall weighted equally)

  • F2: F-measure (see Note) with β = 2 (twice as much weight on Recall as on Precision)

Note: The F-measure is computed as (1+β²) × (Precision × Recall) / (β² × Precision + Recall)
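The formulas above can be sketched in base R. This is only an illustration under stated assumptions, not the code superSumFun runs internally: the toy data, the helper f_beta, and the variable names are hypothetical. logLoss is assumed here to be the mean negative log-likelihood, and AUC is computed as the share of positive-negative pairs ranked correctly.

```r
# Hypothetical toy data: observed and predicted classes, plus the
# predicted probability of the positive class "yes".
lv    <- c("yes", "no")
obs   <- factor(c("yes", "yes", "yes", "no", "no", "no", "no", "yes"), levels = lv)
pred  <- factor(c("yes", "yes", "no", "no", "no", "yes", "yes", "yes"), levels = lv)
p_yes <- c(.9, .8, .4, .2, .1, .6, .55, .7)

positive <- levels(obs)[1]  # the first level is treated as the positive class

# Confusion-matrix counts
TP <- sum(pred == positive & obs == positive)
FP <- sum(pred == positive & obs != positive)
FN <- sum(pred != positive & obs == positive)
TN <- sum(pred != positive & obs != positive)
N  <- TP + FP + FN + TN

precision   <- TP / (TP + FP)
recall      <- TP / (TP + FN)  # = Sensitivity
specificity <- TN / (TN + FP)
accuracy    <- (TP + TN) / N

# Cohen's Kappa: observed agreement corrected for chance agreement
p_chance <- ((TP + FP) * (TP + FN) + (TN + FN) * (TN + FP)) / N^2
kappa    <- (accuracy - p_chance) / (1 - p_chance)

# F-measure as given in the Note (hypothetical helper)
f_beta <- function(precision, recall, beta = 1) {
  (1 + beta^2) * (precision * recall) / (beta^2 * precision + recall)
}

# Assumption: logLoss taken as the mean negative log-likelihood
log_loss <- -mean(ifelse(obs == positive, log(p_yes), log(1 - p_yes)))

# AUC as the share of (positive, negative) pairs ranked correctly; ties count half
p_pos <- p_yes[obs == positive]
p_neg <- p_yes[obs != positive]
auc   <- mean(outer(p_pos, p_neg, ">") + 0.5 * outer(p_pos, p_neg, "=="))

metrics <- c(
  Accuracy = accuracy,
  "Balanced Accuracy" = (recall + specificity) / 2,
  Precision = precision,
  Recall = recall,
  Specificity = specificity,
  Kappa = kappa,
  logLoss = log_loss,
  AUC = auc,
  F0.5 = f_beta(precision, recall, beta = 0.5),
  F1 = f_beta(precision, recall, beta = 1),
  F2 = f_beta(precision, recall, beta = 2)
)
print(round(metrics, 3))
```

With Precision below Recall, as in this toy data, F0.5 is the smallest and F2 the largest of the three F-measures, which shows how β shifts the weight between the two.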

Examples

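## Not run: 
library(dplyr)
set.seed(1)  # make the random example reproducible
dat <- data.frame(
  pred = sample(1:2, 10, replace = TRUE),
  obs = sample(1:2, 10, replace = TRUE)
) %>% 
  mutate(
    # probability of class 1: above .5 where class 1 was predicted, below .5 otherwise
    `1` = ifelse(
      pred == 1,
      sample(seq(.51, .99, length.out = 100), 10),
      sample(seq(.01, .49, length.out = 100), 10)
    ),
    `2` = 1 - `1`
  ) %>% 
  # convert to factors with both levels, even if one class was never sampled
  mutate(across(c(pred, obs), ~ factor(.x, levels = 1:2)))
superSumFun(dat, levels(dat$obs))

## End(Not run) 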


haukelicht/politicaltweets documentation built on July 3, 2023, 4:11 a.m.