intervalScore: Interval score, a measure of forecast quality
In StatisticsNZ/dembase: Analysing Cross-Classified Data about Populations


The interval score is a way of measuring the quality of a probabilistic forecast. Lower scores imply higher quality.

intervalScore(interval, truth)

## S4 method for signature 'DemographicArray,DemographicArray'
intervalScore(interval, truth)

## S4 method for signature 'DemographicArray,numeric'
intervalScore(interval, truth)

`interval`	A `DemographicArray`, with a quantile dimension of length 2.
`truth`	A `DemographicArray`, or a single number.

The score is generally calculated by holding back some data from a forecasting model, doing forecasts, and then comparing the forecasted values with the held-back data. In other words, the data are divided into a training set and a test set, the training set is used to do the forecasts, and the interval scores are calculated from the forecasts and test set.
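As a rough illustration of this hold-out workflow, the base R sketch below splits a made-up series into a training set and a test set, fits a simple linear trend to the training set, and produces 80% prediction intervals for the held-back years. The simulated data, the `lm` model, and all variable names are assumptions chosen for illustration; none of them come from dembase.

```r
## Hypothetical data: a noisy linear trend over 2001-2014
set.seed(1)
dat <- data.frame(year = 2001:2014)
dat$y <- 20 + 0.3 * (dat$year - 2001) + rnorm(nrow(dat))

## Hold back the last four years as the test set
train <- dat[dat$year <= 2010, ]
test  <- dat[dat$year >  2010, ]

## Fit on the training set only, then forecast the held-back years
fit <- lm(y ~ year, data = train)
pred <- predict(fit, newdata = test,
                interval = "prediction", level = 0.8)

## 'lwr' and 'upr' are the 10% and 90% limits of the prediction interval;
## test$y is the held-back truth they are compared against
lower <- pred[, "lwr"]
upper <- pred[, "upr"]
truth <- test$y
```

The limits and the held-back values would then be arranged as the `interval` and `truth` arguments.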

Interval scores reward accuracy and narrow prediction intervals. They equal the width of the prediction interval plus penalties for being outside the interval. For the details, see Section 6.2 of the reference below.
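The definition in Gneiting and Raftery (2007) can be sketched in base R. For a central prediction interval with limits l and u and nominal coverage 1 - alpha, the score for an observed value x is the width u - l, plus 2/alpha times the distance by which x falls below l or above u. The function name `interval_score_sketch` and its arguments are hypothetical; this is not the dembase implementation, which may differ in detail.

```r
## Minimal sketch of the interval score (hypothetical helper, not dembase code).
## lower, upper: interval limits; truth: observed values;
## alpha: 1 minus the nominal coverage (0.2 for a 10%-90% interval)
interval_score_sketch <- function(lower, upper, truth, alpha) {
    width <- upper - lower
    below <- pmax(lower - truth, 0)  # shortfall when truth is under the interval
    above <- pmax(truth - upper, 0)  # excess when truth is over the interval
    width + (2 / alpha) * (below + above)
}

## truth inside the interval: the score is just the width
inside <- interval_score_sketch(lower = -1, upper = 1, truth = 0.3, alpha = 0.2)
## truth 0.5 above the upper limit: width plus (2 / 0.2) * 0.5
outside <- interval_score_sketch(lower = -1, upper = 1, truth = 1.5, alpha = 0.2)
```

Here `inside` is 2 (the width alone) and `outside` is 7 (the width plus a penalty of 5), which shows how quickly a miss outweighs a narrow interval.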

interval holds the prediction intervals from the forecasts. It must have a dimension with dimtype "quantile". There can be only two quantiles, and these quantiles must be symmetric (for instance, 5% and 95% is a valid pair, but 5% and 90% is not).
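The symmetry requirement amounts to the two quantiles summing to 100%. The sketch below checks a pair of quantile labels and derives the corresponding alpha, the probability left outside the interval. The helper `quantile_pair_alpha` is hypothetical and only illustrates the rule; how dembase validates its input is not shown here.

```r
## Hypothetical helper: check that two quantile labels are symmetric
## and return alpha, the total probability outside the interval
quantile_pair_alpha <- function(labels) {
    stopifnot(length(labels) == 2L)
    pct <- as.numeric(sub("%", "", labels, fixed = TRUE))
    if (anyNA(pct) || !isTRUE(all.equal(sum(pct), 100)))
        stop("quantiles must form a symmetric pair, e.g. \"5%\" and \"95%\"")
    2 * min(pct) / 100
}

## A symmetric pair is accepted
alpha_ok <- quantile_pair_alpha(c("10%", "90%"))
## An asymmetric pair is rejected with an error, caught here for display
asymmetric <- tryCatch(quantile_pair_alpha(c("5%", "90%")),
                       error = function(e) "rejected")
```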

The return value contains a score for each quantity being predicted. These scores can be aggregated using sum or collapseDimension to, for instance, give an overall score, or a score for each time period.

A Counts object with the same dimensions as truth, or a single number.

Gneiting T and Raftery A. 2007. Strictly proper scoring rules, prediction, and estimation. Journal of the American Statistical Association. 102(477): 359-378.

MSE is a standard measure of the accuracy of point estimates (as opposed to prediction intervals).

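library(dembase)
## each column holds the 10% and 90% quantiles for one year,
## so the first value of each pair must not exceed the second
interval <- Values(array(c(23, 25, 21, 28,
                           16, 18, 22, 23),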
                         dim = c(2, 4),
                         dimnames = list(quantile = c("10%", "90%"),
                                         year = 2011:2014)),
                   dimscales = c(year = "Points"))
truth <- ValuesOne(c(22, 23, 24, 25),
                   labels = 2011:2014,
                   name = "year",
                   dimscale = "Points")
interval
truth
score <- intervalScore(interval = interval,
                       truth = truth)
## highest score in 2013, where the true value lies furthest outside the interval
## (recalling that a high score implies low quality)
score
## scores can be summed, e.g. to give an overall score
sum(score)

## single number
interval <- Values(array(c(-1, 1),
                         dim = 2,
                         dimnames = list(quantile = c("10%", "90%"))))
truth <- 0.3
interval
truth
intervalScore(interval = interval,
              truth = truth)