superSumFun: Custom two-class summary function
In haukelicht/politicaltweets: Classify political tweets

superSumFun

R Documentation

Custom two-class summary function

Description

Summary function that computes performance metrics for two-class classifiers when running `train`. It follows the `summaryFunction` interface of `trainControl`.

Usage

superSumFun(data, lev = NULL, model = NULL)

Arguments

`data`	a data frame with columns `obs` and `pred` for the observed and predicted outcomes, and columns with predicted probabilities for each outcome class. See the `classProbs` argument to `trainControl`.
`lev`	a character vector of factor levels for the response. The first element is passed as the `positive` argument of `confusionMatrix`.
`model`	a character string for the model name (as taken from the `method` argument of `train`).
`pred`	A vector of numeric data (could be a factor)
`obs`	A vector of numeric data (could be a factor)
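
To make the expected shape of `data` concrete, here is a minimal sketch in base R. The class labels `political` and `other` and all values are hypothetical, chosen only for illustration; the point is that there is one probability column per class, named after the corresponding entry of `lev`.

```r
# Hypothetical class labels; the first one is treated as the positive class.
lev <- c("political", "other")

# One row per held-out observation: observed label, predicted label,
# and the predicted probability of the positive class.
dat <- data.frame(
  obs       = factor(c("political", "other", "political", "other"), levels = lev),
  pred      = factor(c("political", "other", "other", "other"), levels = lev),
  political = c(0.85, 0.20, 0.45, 0.10)
)

# Probability of the second class is the complement in the two-class case.
dat$other <- 1 - dat$political

str(dat)
```

A data frame of this shape could then be passed as `superSumFun(dat, lev = lev)`. When the function is used through `trainControl`, setting `classProbs = TRUE` is what makes the probability columns available.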

Details

The following metrics are returned as a named numeric vector:

Accuracy: (TP+TN)/N
AccuracyNull: Prevalence of "positive" class
AccuracyPValue: p-value of Accuracy compared to AccuracyNull
Balanced Accuracy: (Sensitivity+Specificity)/2
Precision: TP/(TP+FP) ('How many of the instances labeled positive are correctly classified?')
Recall: TP/(TP+FN) ('How many of the truly positive instances are labeled correctly?')
Sensitivity: TP/(TP+FN) = Recall (true-positive rate)
Specificity: TN/(TN+FP) (true-negative rate, i.e., Recall computed for the negative class)
Kappa: Cohen's Kappa (agreement between predicted and observed classes, corrected for chance agreement)
logLoss: negative log-likelihood of the binomial distribution
AUC: Area under the Receiver Operating Characteristic (ROC) curve
PR-AUC: Area under the Precision-Recall curve
F0.5: F-measure (see Notes) with β = .5 (twice as much weight on Precision as on Recall)
F1: F-measure (see Notes) with β = 1 (Precision and Recall weighted equally)
F2: F-measure (see Notes) with β = 2 (twice as much weight on Recall as on Precision)
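
The count-based definitions above can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: the toy labels and probabilities are made up, the variable names are hypothetical, and the log loss is averaged over observations, which may differ from the normalization used by `superSumFun`.

```r
# Hypothetical toy data: observed and predicted labels, plus the predicted
# probability of the positive class ("yes").
obs   <- factor(c("yes", "yes", "yes", "no", "no", "no", "yes", "no"),
                levels = c("yes", "no"))
pred  <- factor(c("yes", "yes", "no", "no", "no", "yes", "yes", "no"),
                levels = c("yes", "no"))
p_yes <- c(0.9, 0.8, 0.4, 0.2, 0.1, 0.6, 0.7, 0.3)

positive <- levels(obs)[1]  # first level is treated as the positive class

# Cells of the 2x2 confusion matrix
tp <- sum(pred == positive & obs == positive)
fp <- sum(pred == positive & obs != positive)
fn <- sum(pred != positive & obs == positive)
tn <- sum(pred != positive & obs != positive)
n  <- tp + fp + fn + tn

accuracy    <- (tp + tn) / n
precision   <- tp / (tp + fp)
recall      <- tp / (tp + fn)  # identical to Sensitivity
specificity <- tn / (tn + fp)
balanced    <- (recall + specificity) / 2

# Cohen's Kappa: observed agreement corrected for chance agreement
p_chance <- ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n^2
kappa    <- (accuracy - p_chance) / (1 - p_chance)

# Mean negative binomial log-likelihood (averaging is an assumption here)
y        <- as.numeric(obs == positive)
log_loss <- -mean(y * log(p_yes) + (1 - y) * log(1 - p_yes))

metrics <- c(Accuracy = accuracy, Precision = precision, Recall = recall,
             Specificity = specificity, `Balanced Accuracy` = balanced,
             Kappa = kappa, logLoss = log_loss)
print(round(metrics, 3))
# Accuracy, Precision, Recall, Specificity and Balanced Accuracy are all 0.75
# for this toy data; Kappa is 0.5.
```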

Note: The F-measure is computed as (1+β²) × (Precision × Recall) / (β² × Precision + Recall)
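
As a quick check of how β shifts the balance, the formula can be written as a small base R helper. The name `f_beta` and the Precision and Recall values are hypothetical and serve only as an illustration.

```r
# F-measure for a given Precision, Recall and weight beta
f_beta <- function(precision, recall, beta = 1) {
  (1 + beta^2) * (precision * recall) / (beta^2 * precision + recall)
}

# Hypothetical classifier with higher Precision than Recall
precision <- 0.8
recall    <- 0.5

f_scores <- sapply(c(F0.5 = 0.5, F1 = 1, F2 = 2),
                   function(b) f_beta(precision, recall, beta = b))
print(round(f_scores, 3))
#  F0.5    F1    F2
# 0.714 0.615 0.541
```

Because Precision exceeds Recall in this example, the score falls as β grows and more weight moves to Recall.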

Examples

## Not run: 
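library(dplyr)
set.seed(1234)  # make the simulated data reproducible
# simulate predicted and observed labels for two classes, "1" and "2"
dat <- data.frame(
  pred = sample(1:2, 10, replace = TRUE),
  obs = sample(1:2, 10, replace = TRUE)
) %>%
  mutate(
    # predicted probability of class "1": above .5 where "1" is predicted, below .5 otherwise
    `1` = ifelse(pred == 1, sample(seq(.51, .99, length.out = 100), 10), sample(seq(.01, .49, length.out = 100), 10)),
    `2` = 1 - `1`
  ) %>%
  # fix both levels so that `pred` and `obs` agree even if a class is never drawn
  mutate(across(c(pred, obs), ~ factor(.x, levels = 1:2)))
# the first level ("1") is treated as the positive class
superSumFun(dat, levels(dat$obs))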

## End(Not run)

haukelicht/politicaltweets documentation built on July 3, 2023, 4:11 a.m.