adjust_probability_calibration: Re-calibrate classification probability predictions

View source: R/adjust-probability-calibration.R


Re-calibrate classification probability predictions

Description

Calibration is the process of adjusting a model's outputted probabilities to match the observed frequencies of events. This technique aims to ensure that when a model predicts a certain probability for an outcome, that probability accurately reflects the true likelihood of that outcome occurring.

Usage

adjust_probability_calibration(x, method = NULL, ...)

Arguments

x

A tailor().

method

Character. One of "logistic", "multinomial", "beta", "isotonic", "isotonic_boot", or "none", each corresponding to a function in the probably package (probably::cal_estimate_logistic(), probably::cal_estimate_multinomial(), and so on). The default is "logistic" which, despite its name, fits a generalized additive model. Note that when fit.tailor() is called, the value may be changed to "none" if there is insufficient data.

...

Optional arguments to pass to the corresponding function in the probably package. These arguments must be named.

Details

The "logistic" and "multinomial" methods fit models that predict the observed classes as a function of the predicted class probabilities. These models remove any overt systematic trends from the linear predictor and correct new predictions. The underlying code fits that model using mgcv::gam(). If smooth = FALSE is passed to the ..., it uses stats::glm() for binary outcomes or nnet::multinom() for 3+ classes.

The "isotonic" method uses stats::isoreg() to force the predicted probabilities to increase with the observed outcome class. This creates a step function that maps new predictions to values that are monotonically increasing with the binary (0/1) form of the outcome. One side effect is that there are fewer, perhaps far fewer, unique predicted probabilities. For 3+ classes, this is done using a one-versus-all strategy that ensures that the probabilities add to 1.0. The "isotonic_boot" method resamples the data and generates multiple isotonic regressions that are averaged and used to correct the predictions. The result may not be perfectly monotonic, but the number of unique calibrated predictions increases with the number of bootstrap samples (controlled by passing the times argument to ...).
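A similar hedged sketch for the bootstrapped variant; the times argument mentioned above is forwarded through ... (the value 25 is illustrative only):

library(tailor)

# average several isotonic fits over bootstrap resamples;
# `times` is forwarded to the probably function via ...
tlr_iso <-
  tailor() |>
  adjust_probability_calibration(method = "isotonic_boot", times = 25)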

Beta calibration (Kull et al., 2017) assumes that the probability estimates follow a Beta distribution. This leads to a sigmoidal model that can be fit to the data via maximum likelihood. There are a few different ways to fit the model; see the parameters of betacal::beta_calibration() to select a specific sigmoidal model.
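For completeness, a sketch of specifying Beta calibration; any model-specific options documented in betacal::beta_calibration() would be supplied through ... as named arguments:

library(tailor)

# Beta calibration; further options from betacal::beta_calibration()
# could be passed through ... as named arguments
tlr_beta <-
  tailor() |>
  adjust_probability_calibration(method = "beta")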

Value

An updated tailor() containing the new operation.

Data Usage

This adjustment requires estimation and, as such, different subsets of data should be used to train it and evaluate its predictions.

Note that, when calling fit.tailor(), if the calibration data have zero or one row, the method is changed to "none".
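One way to honor this is to hold out a separate calibration set. The Examples below do so with a manual sample; a sketch using rsample (an optional convenience, not required by tailor) might look like:

library(rsample)
library(modeldata)

# split the data so the calibration is estimated on one subset and
# evaluated on a disjoint subset
set.seed(1)
split <- initial_split(two_class_example)
d_calibration <- training(split)
d_test <- testing(split)

The tailor would then be fit on d_calibration and its predictions assessed on d_test, exactly as in the Examples section.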

References

Kull, Meelis, Telmo Silva Filho, and Peter Flach. "Beta calibration: a well-founded and easily implemented improvement on logistic calibration for binary classifiers." Artificial Intelligence and Statistics. PMLR, 2017.

https://aml4td.org/chapters/cls-metrics.html#sec-cls-calibration

Examples


library(modeldata)

# split example data
set.seed(1)
in_rows <- sample(c(TRUE, FALSE), nrow(two_class_example), replace = TRUE)
d_calibration <- two_class_example[in_rows, ]
d_test <- two_class_example[!in_rows, ]

head(d_calibration)

# specify calibration
tlr <-
  tailor() |>
  adjust_probability_calibration(method = "logistic")

# train tailor on a subset of data.
tlr_fit <- fit(
  tlr,
  d_calibration,
  outcome = c(truth),
  estimate = c(predicted),
  probabilities = c(Class1, Class2)
)

# apply to predictions on another subset of data
head(d_test)

predict(tlr_fit, d_test)

