calibrate: External probability calibration

View source: R/calibrate.R

calibrate R Documentation

External probability calibration

Description

Validates predicted probabilities against a set of observed (binary) outcomes.

Usage

calibrate(
  prob,
  y,
  method = c("pratt", "iso", "ns", "bins"),
  pos.class = NULL,
  probs = c(0.05, 0.35, 0.65, 0.95),
  nbins = 10
)

## S3 method for class 'calibrate'
print(x, ...)

## S3 method for class 'calibrate'
plot(
  x,
  refline = TRUE,
  refline.col = 2,
  refline.lty = "dashed",
  refline.lwd = 1,
  ...
)

Arguments

prob

Vector of predicted probabilities.

y

Vector of binary (i.e., 0/1) outcomes. If y is coded as anything other than 0/1, then you must use the pos.class argument to specify which of the two categories represents the "positive" class (i.e., the class to which the probabilities in prob correspond).

method

Character string specifying which calibration method to use. Current options include:

"pratt"

Platt scaling.

"iso"

Isotonic (i.e., monotonic) calibration.

"ns"

Natural (i.e., restricted) cubic splines; essentially, a spline-based nonparametric version of Platt scaling.

"bins"

Use binning to discretize the probabilities into bins (i.e., no model).
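To make the first two options more concrete, here is a minimal base R sketch of Platt scaling and isotonic calibration on simulated data. It does not call calibrate() and is not taken from its source: the variable names are illustrative, and fitting the logistic regression on the logit of prob (rather than on prob itself) is an assumption.

```r
# Simulated data; every name below is illustrative, not part of treemisc.
set.seed(101)
n <- 500
prob <- runif(n, min = 0.01, max = 0.99)  # "predicted" probabilities
y <- rbinom(n, size = 1, prob = prob^2)   # outcomes that prob overestimates

# Platt scaling: logistic regression of the outcome on the logit of prob
# (using the logit scale here is an assumption).
platt_fit <- glm(y ~ qlogis(prob), family = binomial())
platt_cal <- unname(fitted(platt_fit))

# Isotonic calibration: monotone least-squares fit of y on prob.
iso_fit <- isoreg(prob, y)
iso_fun <- as.stepfun(iso_fit)  # step function usable on new probabilities
iso_cal <- iso_fun(prob)

head(data.frame(original = prob, platt = platt_cal, isotonic = iso_cal))
```

Both fits return values in [0, 1] that are non-decreasing in prob (for Platt scaling, whenever the fitted slope is positive); the isotonic fit is a step function, while the Platt fit is smooth.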

pos.class

Numeric/character string specifying which value in y corresponds to the "positive" class. Default is NULL, in which case y must be coded as 0/1, with 1 assumed to represent the "positive" class; pos.class must be specified whenever y is coded any other way.
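As a rough illustration of what pos.class is for, the sketch below recodes a labelled outcome to 0/1 by hand in base R. The vector y_raw and the label "yes" are hypothetical; how calibrate() performs this step internally is not shown here.

```r
# Hypothetical outcome vector coded with labels instead of 0/1.
y_raw <- c("yes", "no", "no", "yes", "yes")
pos_class <- "yes"  # the label that the probabilities in prob refer to

# Recode to 0/1, with 1 marking the "positive" class.
y01 <- as.integer(y_raw == pos_class)
y01
```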

probs

Numeric vector specifying the probabilities for generating the quantiles of prob on the logit scale; these are used for the knot locations defining the spline whenever method = "ns". The default corresponds to a good choice based on four knots; see Harrell (2015, pp. 26-28) for details.
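The knot computation described above can be sketched in base R as follows. The simulated prob is illustrative, and treating the outer two quantiles as boundary knots and the inner two as interior knots is an assumption about how the four knots might be used, not a statement of what calibrate() does.

```r
library(splines)  # ships with R

set.seed(102)
prob <- runif(200, min = 0.01, max = 0.99)  # illustrative probabilities
probs <- c(0.05, 0.35, 0.65, 0.95)          # default shown in the usage

# Knot locations: quantiles of prob on the logit scale.
lp <- qlogis(prob)
knots <- quantile(lp, probs = probs)
knots

# One possible natural cubic spline basis from four knots: outer two as
# boundary knots, inner two as interior knots (an assumption).
basis <- ns(lp, knots = knots[2:3], Boundary.knots = knots[c(1, 4)])
dim(basis)
```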

nbins

Integer specifying the number of bins to use for grouping the probabilities; only used if method = "bins".
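For intuition, grouping probabilities into nbins bins and comparing each bin's mean prediction with its observed event rate might look like the base R sketch below. The equal-width bins on [0, 1] are an assumption (quantile-based bins are another common choice), and the names are illustrative.

```r
set.seed(103)
n <- 1000
prob <- runif(n)                       # illustrative probabilities
y <- rbinom(n, size = 1, prob = prob)  # outcomes consistent with prob
nbins <- 10

# Equal-width bins on [0, 1]; this binning scheme is an assumption.
bin <- cut(prob, breaks = seq(0, 1, length.out = nbins + 1),
           include.lowest = TRUE)

# Per bin: mean predicted probability, observed event rate, and count.
binned <- data.frame(
  mean_prob = as.vector(tapply(prob, bin, mean)),
  obs_rate  = as.vector(tapply(y, bin, mean)),
  n         = as.vector(table(bin))
)
binned
```

For well-calibrated probabilities, obs_rate should track mean_prob closely in every bin.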

x

An object of class "calibrate".

...

Additional optional arguments to be passed on to other methods.

refline

Logical indicating whether or not to include a reference line.

refline.col

The color to use for the reference line. Default is 2, which is a shade of red in R's default palette.

refline.lty

The type of line to use for the reference line. Default is "dashed".

refline.lwd

The width of the reference line. Default is 1.
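The refline arguments map naturally onto base graphics parameters. The sketch below draws a calibration curve by hand and adds a reference line with the same defaults; it assumes the reference line is the 45-degree line of perfect calibration, and it does not use the plot method for "calibrate" objects.

```r
set.seed(104)
prob <- runif(300, min = 0.01, max = 0.99)  # illustrative probabilities
y <- rbinom(300, size = 1, prob = prob^2)   # outcomes that prob overestimates

# Calibrated probabilities from a logistic fit on the logit scale.
cal <- unname(fitted(glm(y ~ qlogis(prob), family = binomial())))

# Calibration curve: original vs. calibrated probabilities.
ord <- order(prob)
plot(prob[ord], cal[ord], type = "l", xlim = c(0, 1), ylim = c(0, 1),
     xlab = "Original probability", ylab = "Calibrated probability")

# Reference line (assumed to be y = x), styled like the defaults above.
abline(0, 1, col = 2, lty = "dashed", lwd = 1)
```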

Value

A "calibrate" object, which is essentially a list with the following components:

"probs"

A data frame containing two columns: original (the original probability estimates) and calibrated (the calibrated probability estimates).

"calibrater"

The calibration function (essentially a fitted model object) which can be used to calibrate new probabilities.

"bs"

The Brier score between prob and y.
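Assuming the usual definition, the Brier score is the mean squared difference between the predicted probabilities and the 0/1 outcomes, so lower is better and 0 is a perfect score. A small worked example with made-up numbers:

```r
# Made-up predictions and outcomes for illustration.
prob <- c(0.9, 0.2, 0.7, 0.4)
y <- c(1, 0, 1, 1)

# Brier score: mean of the squared errors (0.01, 0.04, 0.09, 0.36).
bs <- mean((prob - y)^2)
bs  # 0.125
```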

References

Harrell, Frank. (2015). Regression Modeling Strategies. Springer Series in Statistics. Springer International Publishing.


bgreenwell/treemisc documentation built on Oct. 26, 2022, 12:56 a.m.