View source: R/adjust-probability-calibration.R
adjust_probability_calibration — R Documentation
Calibration is the process of adjusting a model's outputted probabilities to match the observed frequencies of events. This technique aims to ensure that when a model predicts a certain probability for an outcome, that probability accurately reflects the true likelihood of that outcome occurring.
adjust_probability_calibration(x, method = NULL, ...)
x: A tailor().

method: Character. One of "logistic", "multinomial", "beta", "isotonic", or "isotonic_boot" (described below).

...: Optional arguments to pass to the corresponding function in the probably package. These arguments must be named.
The "logistic" and "multinomial" methods fit models that predict the observed classes as a function of the predicted class probabilities. These models remove any overt systematic trends from the linear predictor and correct new predictions. The underlying code fits that model using mgcv::gam(). If smooth = FALSE is passed to the ..., it uses stats::glm() for binary outcomes or nnet::multinom() for 3+ classes.
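The non-smooth binary case can be illustrated with base R alone. This is a sketch of the principle, not tailor's internal code: regress the observed class on the logit of the predicted probability with stats::glm(), then use the fitted model to correct new predictions (the data here are simulated for illustration).

```r
# Sketch: recalibrating miscalibrated binary probabilities with stats::glm(),
# as in the smooth = FALSE case described above (illustrative only).
set.seed(1)
d <- data.frame(p = runif(200, 0.05, 0.95))     # a model's predicted probabilities
d$y <- rbinom(200, 1, plogis(3 * qlogis(d$p)))  # observed outcome; truth is more
                                                # extreme than the predictions

# Logistic regression of the observed class on the logit of the prediction
recal <- glm(y ~ qlogis(p), data = d, family = binomial())

# Correct a few new predictions
p_corrected <- predict(recal, newdata = data.frame(p = c(0.1, 0.5, 0.9)),
                       type = "response")
p_corrected
```

The fitted map pulls the raw probabilities toward the observed event rates while preserving their ordering.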
The "isotonic" method uses stats::isoreg() to force the predicted probabilities to increase with the observed outcome class. This creates a step function that maps new predictions to values that are monotonically increasing with the binary (0/1) form of the outcome. One side effect is that there are fewer, perhaps far fewer, unique predicted probabilities. For 3+ classes, this is done using a one-versus-all strategy that ensures that the probabilities add to 1.0. The "isotonic_boot" method resamples the data and generates multiple isotonic regressions that are averaged and used to correct the predictions. The result may not be perfectly monotonic, but the number of unique calibrated predictions increases with the number of bootstrap samples (controlled by passing the times argument to ...).
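The step-function behavior and the reduction in unique values can be seen directly with stats::isoreg(). This base-R sketch is illustrative (simulated data), not tailor's internal code:

```r
# Sketch: isotonic regression of a binary outcome on predicted probabilities.
set.seed(1)
prob <- sort(runif(100))      # predicted probabilities, sorted
y    <- rbinom(100, 1, prob)  # observed 0/1 outcome
fit  <- isoreg(prob, y)

# The fitted values form a monotone (nondecreasing) step function ...
all(diff(fit$yf) >= 0)
# ... with far fewer unique calibrated values than input probabilities,
# because isotonic regression pools adjacent violators into flat steps
length(unique(fit$yf))
```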
Beta calibration (Kull et al., 2017) assumes that the probability estimates follow a Beta distribution. This leads to a sigmoidal model that can be fit to the data via maximum likelihood. There are a few different ways to fit the model; see the parameters argument of betacal::beta_calibration() to select a specific sigmoidal model.
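A result of Kull et al. (2017) is that the full three-parameter beta-calibration map can be fit as a logistic regression on log(p) and log(1 - p). The base-R sketch below (simulated data) shows that idea; it is not tailor's code, which uses the betacal package:

```r
# Sketch: fitting the three-parameter beta-calibration model of
# Kull et al. (2017) via stats::glm() on log(p) and log(1 - p).
set.seed(1)
d <- data.frame(p = runif(300, 0.02, 0.98))      # predicted probabilities
d$y <- rbinom(300, 1, d$p^2 / (d$p^2 + (1 - d$p)^2))  # miscalibrated truth

beta_fit <- glm(y ~ log(p) + log1p(-p), data = d, family = binomial())

# Apply the fitted sigmoidal map to new predictions
p_cal <- predict(beta_fit, newdata = data.frame(p = c(0.2, 0.5, 0.8)),
                 type = "response")
p_cal
```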
An updated tailor() containing the new operation.
This adjustment requires estimation and, as such, different subsets of data should be used to train it and evaluate its predictions.
Note that, when calling fit.tailor(), if the calibration data have zero or one row, the method is changed to "none".
Kull, Meelis, Telmo Silva Filho, and Peter Flach. "Beta calibration: a well-founded and easily implemented improvement on logistic calibration for binary classifiers." Artificial Intelligence and Statistics. PMLR, 2017.
https://aml4td.org/chapters/cls-metrics.html#sec-cls-calibration
library(modeldata)
# split example data
set.seed(1)
in_rows <- sample(c(TRUE, FALSE), nrow(two_class_example), replace = TRUE)
d_calibration <- two_class_example[in_rows, ]
d_test <- two_class_example[!in_rows, ]
head(d_calibration)
# specify calibration
tlr <-
tailor() |>
adjust_probability_calibration(method = "logistic")
# train tailor on a subset of data.
tlr_fit <- fit(
tlr,
d_calibration,
outcome = c(truth),
estimate = c(predicted),
probabilities = c(Class1, Class2)
)
# apply to predictions on another subset of data
head(d_test)
predict(tlr_fit, d_test)