An R package providing user-friendly univariate and multivariate verification tools for statistical ensemble post-processing. It allows users to score and assess the calibration (reliability) and sharpness of ensemble forecasts and predictive distributions. In addition, the package provides contemporary visualizations for verification.
You can install the latest development version from GitHub with:
```r
# install.packages("remotes")
remotes::install_github("jobstdavid/eppverification")
```
The goal of probabilistic forecasting is to maximize the sharpness of the probabilistic forecast F (a CDF) subject to calibration. This package therefore contains tools for assessing both properties. Below you find the univariate and multivariate verification tools for calibration, sharpness, and proper scoring rules; the corresponding function provided by this package is given in parentheses.
**Univariate:**

- Calibration: Verification Rank Histogram (`vr.hist`), Reliability Index (`ri`), Entropy (`ent`), PIT Histogram (`pit.hist`), Central Prediction Interval Coverage (`cpi`).
- Sharpness: Root Mean Variance (`rmv`), Central Prediction Interval Width (`cpi`).
- Calibration and Sharpness: Variance of the PIT-Values (`var.pit`).
- Proper Scoring Rules: Continuous Ranked Probability Score (`crps`), Logarithmic Score (`logs`), Interval Score (`is`), Quantile Score (`qs`), Brier Score (`bs`), Dawid-Sebastiani Score (`dss`), Absolute Error (`ae`), Squared Error (`se`).

**Multivariate:**

- Calibration: Multivariate Verification Rank Histogram (`mvr.hist`), Multivariate Reliability Index (`mri`), Multivariate Entropy (`ment`).
- Sharpness: Determinant Sharpness (`ds`).
- Proper Scoring Rules: Logarithmic Score (`logs`), Energy Score (`es`), Euclidean Error (`ee`), Variogram Score (`vs`).

Further functions for model comparison and visualization in this package are:
- `dm.test`: This function performs a Diebold-Mariano test for two forecasts.
- `bh.test`: This function performs a Benjamini-Hochberg correction for different p-values.
- `ss`: This function calculates the skill score.
- `cpi.plot`: This function plots the central prediction intervals for a certain interval range.
- `cov.plot`: This function plots the central prediction interval coverage for certain interval ranges of different models.
- `score.plot`: This function plots the scores of different models rated by selected measures as a heatmap.

The following examples illustrate the use of these functions.

```r
# load R package
library(eppverification)

# set seed for reproducibility
set.seed(2023)
```
```r
# simulated data
n <- 30
m <- 50
y <- rnorm(n)
x <- matrix(rnorm(n * m), ncol = m)

# Verification Rank Histogram
vr.hist(y = y, x = x, bins = 3, reliability = TRUE, entropy = TRUE)
```
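The verification rank underlying this histogram is the rank of the observation within the pooled set of observation and ensemble members; uniform ranks indicate a calibrated ensemble. A minimal base-R sketch of that textbook definition (not the package's internal code):

```r
# rank of y[i] among {y[i], x[i, 1], ..., x[i, m]} for each case
ranks <- sapply(1:n, function(i) rank(c(y[i], x[i, ]))[1])
table(ranks)
```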
```r
# simulated data
n <- 10000
u <- runif(n)

# PIT Histogram
pit.hist(u = u, bins = 5, dispersion = TRUE, bias = TRUE)
```
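In practice the PIT values are obtained by evaluating the predictive CDF at the observation, u = F(y), which is standard uniform exactly when the forecast is calibrated. A short sketch with hypothetical Gaussian predictive distributions:

```r
# PIT values for predictive distributions N(mu, 1); since the
# observations are drawn from the same model, u should be uniform
mu <- rnorm(n)                # hypothetical predictive means
y.obs <- rnorm(n, mean = mu)  # matching observations
u <- pnorm(y.obs, mean = mu, sd = 1)
pit.hist(u = u, bins = 5, dispersion = TRUE, bias = TRUE)
```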
```r
# Root Mean Variance
rmv(x = x)
```
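As a plausibility check: taking the name literally, the root mean variance should be the square root of the average ensemble variance. Under that assumption (not verified against the package source), it can be reproduced from the ensemble matrix `x` above:

```r
# assumed definition: sqrt of the mean of the case-wise ensemble variances
sqrt(mean(apply(x, 1, var)))
```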
```r
# simulated data
n <- 30
m <- 50
y <- rnorm(n)
x <- matrix(rnorm(n * m), ncol = m)

# Continuous Ranked Probability Score
crps(y = y, x = x, method = "ens", aggregate = mean)
```
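For reference, the standard ensemble estimator of the CRPS is mean(|X - y|) - 0.5 * mean(|X - X'|). Assuming `crps(..., method = "ens")` uses this estimator (rather than, say, a fair/unbiased variant), the score of the first case can be reproduced by hand:

```r
# hand-computed ensemble CRPS for case 1 (standard estimator; whether
# the package uses this or a fair variant is an assumption here)
x1 <- x[1, ]
mean(abs(x1 - y[1])) - 0.5 * mean(abs(outer(x1, x1, "-")))
```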
```r
# simulated data
n <- 30
y <- rnorm(n, mean = 1:n)
nominal.coverage <- 90
alpha <- (100 - nominal.coverage) / 100
lower <- qnorm(alpha / 2, rnorm(n, mean = 1:n))
upper <- qnorm((1 - alpha / 2), rnorm(n, mean = 1:n))

# Central Prediction Interval Values
cpi(y = y, lower = lower, upper = upper,
    nominal.coverage = nominal.coverage,
    separate = c("is", "overprediction", "underprediction", "width", "coverage"),
    aggregate = mean)
```
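The interval score requested via `separate = "is"` presumably follows the definition of Gneiting and Raftery (2007): the interval width plus a penalty of 2/alpha times the distance by which the observation misses the interval. Under that assumption, the score of the first case is:

```r
# interval score for case 1: width + 2/alpha penalties for misses
# (assumes cpi() uses the standard Gneiting-Raftery definition)
(upper[1] - lower[1]) +
  2 / alpha * (lower[1] - y[1]) * (y[1] < lower[1]) +
  2 / alpha * (y[1] - upper[1]) * (y[1] > upper[1])
```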
```r
# simulated data
n <- 30
m <- 50
y <- cbind(rnorm(n), rgamma(n, shape = 1))
x <- array(NA, dim = c(m, 2, n))
x[, 1, ] <- rnorm(n * m)
x[, 2, ] <- rgamma(n * m, shape = 1)

# Multivariate Verification Rank Histogram
mvr.hist(y = y, x = x, method = "mv", type = "absolute", bins = 17)
```
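For orientation, the classical multivariate rank preranks every point by the number of points it dominates componentwise and then ranks the observation's prerank, breaking ties at random. Whether `method = "mv"` corresponds exactly to this prerank is an assumption here; a base-R sketch for the first case:

```r
# classical multivariate prerank for case 1 (assumed to match "mv")
v <- rbind(y[1, ], x[, , 1])            # observation plus m members
prerank <- apply(v, 1, function(p) sum(colSums(t(v) <= p) == 2))
rank(prerank, ties.method = "random")[1]
```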
```r
# simulated data
n <- 30
m <- 50
x <- array(NA, dim = c(2, 2, n))
for (i in 1:n) {
  x[, , i] <- cov(cbind(rnorm(m), rgamma(m, shape = 1)))
}

# Determinant Sharpness
ds(x = x, covmat = TRUE, aggregate = mean)
```
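Determinant sharpness is commonly defined as det(Sigma)^(1/(2d)) for a d-dimensional forecast covariance matrix Sigma. Assuming `ds()` follows this convention, the value for the first case is:

```r
# assumed definition: det(Sigma)^(1 / (2 * d)), here with d = 2
det(x[, , 1])^(1 / (2 * 2))
```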
```r
# simulated data
n <- 30
m <- 50
y <- cbind(rnorm(n), rgamma(n, shape = 1))
x <- array(NA, dim = c(m, 2, n))
x[, 1, ] <- rnorm(n * m)
x[, 2, ] <- rgamma(n * m, shape = 1)

# Energy Score
es(y = y, x = x, method = "ens", aggregate = mean)

# Euclidean Error
ee(y = y, x = x, method = "median", aggregate = mean)
```
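The ensemble estimator of the energy score replaces the absolute differences in the CRPS formula by Euclidean norms: mean(||X - y||) - 0.5 * mean(||X - X'||). Assuming `es(..., method = "ens")` implements this estimator, the first case can be checked by hand:

```r
# hand-computed energy score for case 1 (assumed standard estimator)
x1 <- x[, , 1]                                  # m x 2 ensemble members
obs <- matrix(y[1, ], nrow = m, ncol = 2, byrow = TRUE)
mean(sqrt(rowSums((x1 - obs)^2))) - 0.5 * mean(as.matrix(dist(x1)))
```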
```r
# simulated data
n <- 365
s1 <- arima.sim(list(ar = 0.7), n = n, sd = 0.5)
s2 <- arima.sim(list(ar = 0.7), n = n, sd = 0.5) - 0.2
p <- runif(n, min = 0, max = 0.05)

# Diebold-Mariano test
dm.test(s1, s2, alternative = "two.sided", h = 1)

# Benjamini-Hochberg procedure
bh.test(p, alpha = 0.05)
```
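The skill score computed by `ss` (listed above) typically follows the convention 1 - S_model / S_reference for negatively oriented scores, so that 1 is perfect and 0 means no improvement over the reference. A minimal base-R illustration with hypothetical aggregated CRPS values, not a call to `ss()` itself:

```r
# hypothetical mean scores of a model and a reference forecast
crps.model <- 0.25
crps.reference <- 0.40
1 - crps.model / crps.reference  # skill score of 0.375
```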
```r
# simulated data
n <- 30
x <- seq(Sys.Date(), by = "day", length.out = n)
y <- rnorm(n, mean = 1:n)
nominal.coverage <- 90
alpha <- (100 - nominal.coverage) / 100
lower <- qnorm(alpha / 2, rnorm(n, mean = 1:n))
upper <- qnorm((1 - alpha / 2), rnorm(n, mean = 1:n))

# Central Prediction Intervals Plot
cpi.plot(x = x, y = y, lower = lower, upper = upper,
         nominal.coverage = nominal.coverage,
         x.lab = "Date", y.lab = "Value", info = TRUE)
```
```r
# simulated data
n <- 30
x <- matrix(runif(n) * 100, ncol = 3)
x <- apply(x, 2, sort)
nominal.coverage <- seq(5, 95, length.out = 10)
models <- c("A", "B", "C")

# Central Prediction Interval Coverage Model Comparison
cov.plot(x = x, models = models, nominal.coverage = nominal.coverage)
```
```r
# simulated data
x <- matrix(c(0.5, 0.3, 0.8, 0.21, 1.5, 0.7, 2, 1), byrow = TRUE, ncol = 4)
models <- c("A", "B", "C", "D")
measures <- c("CRPS", "LogS")

# Score Plot
score.plot(x = x, models = models, measures = measures)
```
Feel free to contact jobstd@uni-hildesheim.de if you have any questions or suggestions.
Gneiting, T. and Raftery, A. E. (2007). Strictly Proper Scoring Rules, Prediction, and Estimation. Journal of the American Statistical Association, 102(477), 359-378.