View source: R/metrics-quantile.R
wis | R Documentation |
The WIS is a proper scoring rule used to evaluate forecasts in an interval- / quantile-based format. See Bracher et al. (2021). Smaller values are better.
As the name suggest the score assumes that a forecast comes in the form of one or multiple central prediction intervals. A prediction interval is characterised by a lower and an upper bound formed by a pair of predictive quantiles. For example, a 50% central prediction interval is formed by the 0.25 and 0.75 quantiles of the predictive distribution.
Interval score
The interval score (IS) is the sum of three components: overprediction, underprediction and dispersion. For a single prediction interval only one of the components is non-zero. If for a single prediction interval the observed value is below the lower bound, then the interval score is equal to the absolute difference between the lower bound and the observed value ("underprediction"). "Overprediction" is defined analogously. If the observed value falls within the bounds of the prediction interval, then the interval score is equal to the width of the prediction interval, i.e. the difference between the upper and lower bound. For a single interval, we therefore have:
\textrm{IS} = (\textrm{upper} - \textrm{lower}) + \frac{2}{\alpha}(\textrm{lower}
- \textrm{observed}) *
\mathbf{1}(\textrm{observed} < \textrm{lower}) +
\frac{2}{\alpha}(\textrm{observed} - \textrm{upper}) *
\mathbf{1}(\textrm{observed} > \textrm{upper})
where \mathbf{1}()
is the indicator function and
indicates how much is outside the prediction interval.
\alpha
is the decimal value that indicates how much is outside
the prediction interval. For a 90% prediction interval, for example,
\alpha
is equal to 0.1. No specific distribution is assumed,
but the interval formed by the quantiles has to be symmetric around the
median (i.e you can't use the 0.1 quantile as the lower bound and the 0.7
quantile as the upper bound).
Non-symmetric quantiles can be scored using the function quantile_score()
.
For a set of k = 1, \dots, K
prediction intervals and the median
m
, we can compute a weighted interval score (WIS) as the sum of the
interval scores for individual intervals:
\text{WIS}_{\alpha_{\{0:K\}}}(F, y) = \frac{1}{K + 1/2}
\times \left(w_0 \times |y - m| + \sum_{k=1}^{K}
\left\{ w_k \times \text{IS}_{\alpha_k}(F, y) \right\}\right)
The individual scores are usually weighted with
w_k = \frac{\alpha_k}{2}
. This weight ensures that
for an increasing number of equally spaced quantiles, the WIS
converges to the continuous ranked probability score (CRPS).
Quantile score
In addition to the interval score, there also exists a quantile score (QS)
(see quantile_score()
), which is equal to the so-called pinball loss.
The quantile score can be computed for a single quantile (whereas the
interval score requires two quantiles that form an interval). However,
the intuitive decomposition into overprediction, underprediction and
dispersion does not exist for the quantile score.
Two versions of the weighted interval score
There are two ways to conceptualise the weighted interval score across several quantiles / prediction intervals and the median.
In one view, you would treat the WIS as the average of quantile scores (and
the median as 0.5-quantile) (this is the default for wis()
). In another
view, you would treat the WIS as the average of several interval scores +
the difference between the observed value and median forecast. The effect of
that is that in contrast to the first view, the median has twice as much
weight (because it is weighted like a prediction interval, rather than like
a single quantile). Both are valid ways to conceptualise the WIS and you
can control the behaviour with the count_median_twice
-argument.
WIS components:
WIS components can be computed individually using the functions
overprediction
, underprediction
, and dispersion.
wis(
observed,
predicted,
quantile_level,
separate_results = FALSE,
weigh = TRUE,
count_median_twice = FALSE,
na.rm = FALSE
)
dispersion_quantile(observed, predicted, quantile_level, ...)
overprediction_quantile(observed, predicted, quantile_level, ...)
underprediction_quantile(observed, predicted, quantile_level, ...)
observed |
Numeric vector of size n with the observed values. |
predicted |
Numeric nxN matrix of predictive
quantiles, n (number of rows) being the number of forecasts (corresponding
to the number of observed values) and N
(number of columns) the number of quantiles per forecast.
If |
quantile_level |
Vector of of size N with the quantile levels for which predictions were made. |
separate_results |
Logical. If |
weigh |
Logical. If |
count_median_twice |
If TRUE, count the median twice in the score. |
na.rm |
If TRUE, ignore NA values when computing the score. |
... |
Additional arguments passed on to |
wis()
: a numeric vector with WIS values of size n (one per observation),
or a list with separate entries if separate_results
is TRUE
.
dispersion_quantile()
: a numeric vector with dispersion values (one per
observation).
overprediction_quantile()
: a numeric vector with overprediction values
(one per observation).
underprediction_quantile()
: a numeric vector with underprediction values
(one per observation)
Overview of required input format for quantile-based forecasts
Evaluating epidemic forecasts in an interval format, Johannes Bracher, Evan L. Ray, Tilmann Gneiting and Nicholas G. Reich, 2021, https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1008618
observed <- c(1, -15, 22)
predicted <- rbind(
c(-1, 0, 1, 2, 3),
c(-2, 1, 2, 2, 4),
c(-2, 0, 3, 3, 4)
)
quantile_level <- c(0.1, 0.25, 0.5, 0.75, 0.9)
wis(observed, predicted, quantile_level)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.