isodistrreg-package: Isotonic distributional regression (IDR)
In isodistrreg: Isotonic Distributional Regression (IDR)

Description How does it work? The isodistrreg package References Examples

Isotonic distributional Regression (IDR) is a nonparametric method to estimate conditional distributions under monotonicity constraints.

Read the arXiv preprint ‘Isotonic Distributional Regression’ on https://arxiv.org/abs/1909.03725 or by calling browseVignettes(package = "isodistrreg").

To make probabilistic forecasts with IDR,

call idr(y = y, X = X, ...), where y is the response variable (e.g. weather variable observations) and X is a data.frame of covariates (e.g. ensemble forecasts).
use predict(fit, data), where fit is the model fit computed with idr and data is the data based on which you want to make predictions.
Try idrbag for IDR with (su)bagging.

The following pre-defined functions are available to evaluate IDR predictions:

cdf and qpred to compute the cumulative distribution function (CDF) and quantile function of IDR predictions.
bscore and qscore to calculate Brier scores for probability forecasts for threshold exceedance (e.g. probability of precipitation) and quantile scores (e.g. mean absolute error of median forecast.)
crps to compute the continuous ranked probability score (CRPS).
pit to compute the probability integral transform (PIT).
plot to plot IDR predictive CDFs.

Use the dataset rain to test IDR.

Alexander Henzi, Johanna F. Ziegel, and Tilmann Gneiting. Isotonic Distributional Regression. arXiv e-prints, art. arXiv:1909.03725, Sep 2019. URL https://arxiv.org/abs/1909.03725.

## A usage example:

# Prepare dataset: Half of the data as training dataset, other half for validation.
# Consult the R documentation (?rain) for details about the dataset.
data(rain)
trainingData <- subset(rain, date <= "2012-01-09")
validationData <- subset(rain, date > "2012-01-09")

# Variable selection: use HRES and the perturbed forecasts P1, ..., P50
varNames <- c("HRES", paste0("P", 1:50))

# Partial orders on variable groups: Usual order of numbers on HRES (group '1') and
# increasing convex order on the remaining variables (group '2').
groups <- setNames(c(1, rep(2, 50)), varNames)
orders <- c("comp" = 1, "icx" = 2)

# Fit IDR to training dataset.
fit <- idr(
  y = trainingData[["obs"]],
  X = trainingData[, varNames],
  groups = groups,
  orders = orders
)

# Make prediction for the first day in the validation data:
firstPrediction <- predict(fit, data = validationData[1, varNames])
plot(firstPrediction)

# Use cdf() and qpred() to make probability and quantile forecasts:

## What is the probability of precipitation?
1 - cdf(firstPrediction, thresholds = 0)

## What are the predicted 10%, 50% and 90% quantiles for precipitation?
qpred(firstPrediction, quantiles = c(0.1, 0.5, 0.9))

# Make predictions for the complete verification dataset and compare IDR calibrated
# forecasts to the raw ensemble (ENS):
predictions <- predict(fit, data = validationData[, varNames])
y <- validationData[["obs"]]

## Continuous ranked probability score (CRPS):
CRPS <- cbind(
  "ens" = crps(validationData[, varNames], y),
  "IDR" = crps(predictions, y)
)
apply(CRPS, 2, mean)

## Brier score for probability of precipitation:
BS <- cbind(
  "ens" = bscore(validationData[, varNames], thresholds = 0, y),
  "IDR" = bscore(predictions, thresholds = 0, y)
)
apply(BS, 2, mean)

## Quantile score of forecast for 90% quantile:
QS90 <- cbind(
  "ens" = qscore(validationData[, varNames], quantiles = 0.9, y),
  "IDR" = qscore(predictions, quantiles = 0.9, y)
)
apply(QS90, 2, mean)

## Check calibration using (randomized) PIT histograms:
pitEns <- pit(validationData[, varNames], y)
pitIdr <- pit(predictions, y)

hist(pitEns, main = "PIT of raw ensemble forecasts", freq = FALSE)
hist(pitIdr, main = "PIT of IDR calibrated forecasts", freq = FALSE)