score: Evaluate forecasts


score    R Documentation

Description

score() applies a selection of scoring metrics to a forecast object (a data.table with forecasts and observations) as produced by as_forecast(). score() is a generic that dispatches to different methods depending on the class of the input data.

See the details section for more information on forecast types and input formats. For additional help and examples, check out the Getting Started Vignette as well as the paper Evaluating Forecasts with scoringutils in R.
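The dispatch described above follows the usual S3 pattern. The sketch below illustrates that pattern in base R only. The generic toy_score() and the classes toy_point and toy_binary are hypothetical names chosen for illustration; they are not part of scoringutils and do not reproduce what score() does internally.

```r
# Hypothetical generic, named toy_score() to avoid clashing with score().
toy_score <- function(forecast, ...) UseMethod("toy_score")

# One method per (hypothetical) forecast class.
toy_score.toy_point <- function(forecast, ...) "point metrics"
toy_score.toy_binary <- function(forecast, ...) "binary metrics"

# The class attribute of the input selects the method.
point_input <- structure(list(), class = "toy_point")
binary_input <- structure(list(), class = "toy_binary")

dispatch_point <- toy_score(point_input)
dispatch_binary <- toy_score(binary_input)
```

In the same way, the class assigned by as_forecast() decides which score() method, and therefore which default metrics, is used.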

Usage

score(forecast, metrics, ...)

## S3 method for class 'forecast_binary'
score(forecast, metrics = metrics_binary(), ...)

## S3 method for class 'forecast_point'
score(forecast, metrics = metrics_point(), ...)

## S3 method for class 'forecast_sample'
score(forecast, metrics = metrics_sample(), ...)

## S3 method for class 'forecast_quantile'
score(forecast, metrics = metrics_quantile(), ...)

Arguments

forecast

A forecast object (a validated data.table with predicted and observed values, see as_forecast())

metrics

A named list of scoring functions. Names will be used as column names in the output. See metrics_point(), metrics_binary(), metrics_quantile(), and metrics_sample() for more information on the default metrics used. Note that if you want to pass arguments to any given metric, you should do that through the function customise_metric() and pass an updated list of functions with your custom metric to the metrics argument in score().
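To make the idea of a named list of scoring functions concrete, here is a minimal base R sketch. The metric functions abs_error() and scaled_error(), the wrapper scaled_error_10(), and the toy data are all hypothetical. The closure only stands in for the idea of fixing a metric's argument in advance; it does not call customise_metric() and does not reproduce how score() applies metrics.

```r
# Two hypothetical metric functions taking (observed, predicted).
abs_error <- function(observed, predicted) abs(observed - predicted)

# A metric with an extra argument that we want to fix in advance.
scaled_error <- function(observed, predicted, scale = 1) {
  abs(observed - predicted) / scale
}

# Stand-in for pre-filling an argument: wrap the metric in a closure.
scaled_error_10 <- function(observed, predicted) {
  scaled_error(observed, predicted, scale = 10)
}

# Named list of scoring functions; the names become output column names.
my_metrics <- list(ae = abs_error, scaled_ae = scaled_error_10)

observed <- c(10, 20, 30)
predicted <- c(12, 18, 33)

# Apply every metric and collect one column per list name.
toy_scores <- as.data.frame(
  lapply(my_metrics, function(metric) metric(observed, predicted))
)
```

The resulting toy_scores has one column per list entry (ae and scaled_ae), mirroring how the list names are used as column names in the output.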

...

Additional arguments. Currently unused but allows for future extensions. If you want to pass arguments to individual metrics, use customise_metric().

Value

An object of class scores. This object is a data.table with unsummarised scores (one score per forecast) and has an additional attribute metrics with the names of the metrics used for scoring. See summarise_scores() for information on how to summarise scores.

Forecast types and input formats

Several forecast types and formats are supported. At the moment, these are:

  • Point forecasts

  • Binary forecasts ("soft binary classification")

  • Probabilistic forecasts in a quantile-based format (a forecast is represented as a set of predictive quantiles)

  • Probabilistic forecasts in a sample-based format (a forecast is represented as a set of predictive samples)

Forecast types are determined based on the columns present in the input data. Here is an overview of the required format for each forecast type:

[Figure: required-inputs.png, overview of the required input format for each forecast type]

All forecast types require a data.frame or similar with columns observed, predicted, and model.

Point forecasts require a column observed of type numeric and a column predicted of type numeric.

Binary forecasts require a column observed of type factor with exactly two levels and a column predicted of type numeric with probabilities, corresponding to the probability that observed is equal to the second factor level. See details here for more information.

Quantile-based forecasts require a column observed of type numeric, a column predicted of type numeric, and a column quantile_level of type numeric with quantile-levels (between 0 and 1).

Sample-based forecasts require a column observed of type numeric, a column predicted of type numeric, and a column sample_id of type numeric with sample indices.
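The column requirements above can be illustrated with small hand-built inputs. This is a sketch in base R with made-up values and a hypothetical model name ("toy_model"). It only shows the shape of the data; it has not been checked against the validation that as_forecast() performs.

```r
# Binary: observed is a two-level factor; predicted is the probability
# that observed equals the second level (here "1").
binary_input <- data.frame(
  model = "toy_model",
  observed = factor(c(0, 1, 1), levels = c(0, 1)),
  predicted = c(0.1, 0.8, 0.6)
)
second_level <- levels(binary_input$observed)[2]

# Quantile-based: one forecast spans several rows, one per quantile level.
quantile_input <- data.frame(
  model = "toy_model",
  observed = 12,
  quantile_level = c(0.25, 0.5, 0.75),
  predicted = c(9, 11, 14)
)

# Sample-based: one forecast spans several rows, one per sample index.
sample_input <- data.frame(
  model = "toy_model",
  observed = 12,
  sample_id = 1:4,
  predicted = c(10, 13, 11, 15)
)
```

Note that the order of the factor levels matters for binary forecasts: swapping the levels changes which outcome the predicted probabilities refer to.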

For more information see the vignettes and the example data (example_quantile, example_sample_continuous, example_sample_discrete, example_point, and example_binary).

Forecast unit

In order to score forecasts, scoringutils needs to know which rows of the data belong together and jointly form a single forecast. This is easy, for example, for point forecasts, where there is one row per forecast. For quantile-based or sample-based forecasts, however, multiple rows belong to a single forecast.

The forecast unit, or unit of a single forecast, is described by the combination of columns that uniquely identifies a single forecast. For example, we could have forecasts made by different models in various locations at different time points, each for several weeks into the future. The forecast unit could then be described as forecast_unit = c("model", "location", "forecast_date", "forecast_horizon"). scoringutils automatically tries to determine the unit of a single forecast. It uses all existing columns for this, which means that no columns may be present that are unrelated to the forecast unit. As a very simplistic example, if you had an additional column, "even", that is one if the row number is even and zero otherwise, then this would mess up scoring, as scoringutils would treat this column as relevant in defining the forecast unit.

In order to avoid issues, we recommend setting the forecast unit explicitly, usually through the forecast_unit argument in as_forecast(). This will drop unneeded columns, while making sure that all necessary, 'protected columns' like "predicted" or "observed" are retained.
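The effect of an unrelated column such as "even" can be sketched in base R by counting the distinct column combinations that would be read as separate forecasts. The toy data, the helper count_units(), and the set of columns treated as value columns are assumptions made for illustration; this is not how scoringutils determines the forecast unit internally.

```r
# Toy quantile-style data: 2 models x 2 locations x 3 quantile levels.
# quantile_level varies fastest, so each forecast occupies 3 adjacent rows.
toy <- expand.grid(
  quantile_level = c(0.25, 0.5, 0.75),
  model = c("model_a", "model_b"),
  location = c("DE", "FR"),
  stringsAsFactors = FALSE
)
toy$predicted <- seq_len(nrow(toy))
toy$observed <- 5

# Columns that describe a forecast rather than identify it (assumed set).
value_cols <- c("observed", "predicted", "quantile_level")

# Count the distinct forecasts implied by all remaining columns.
count_units <- function(data) {
  unit_cols <- setdiff(names(data), value_cols)
  nrow(unique(data[, unit_cols, drop = FALSE]))
}

n_units_clean <- count_units(toy)

# Add the unrelated "even" column described in the text.
toy$even <- as.integer(seq_len(nrow(toy)) %% 2 == 0)
n_units_with_even <- count_units(toy)
```

Here the four genuine forecasts (n_units_clean) are split into eight fragments (n_units_with_even) once "even" is present, because rows of the same forecast no longer share identical values in all identifying columns.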

Author(s)

Nikos Bosse nikosbosse@gmail.com

References

Bosse NI, Gruson H, Cori A, van Leeuwen E, Funk S, Abbott S (2022) Evaluating Forecasts with scoringutils in R. doi:10.48550/arXiv.2205.07090

Examples

library(scoringutils) # provides score(), as_forecast() and the example data
library(magrittr) # pipe operator


validated <- as_forecast(example_quantile)
score(validated) %>%
  summarise_scores(by = c("model", "target_type"))

# set forecast unit manually (to avoid issues with scoringutils trying to
# determine the forecast unit automatically)
example_quantile %>%
  as_forecast(
    forecast_unit = c(
      "location", "target_end_date", "target_type", "horizon", "model"
    )
  ) %>%
  score()

# forecast formats with different metrics
## Not run: 
score(as_forecast(example_binary))
score(as_forecast(example_quantile))
score(as_forecast(example_point))
score(as_forecast(example_sample_discrete))
score(as_forecast(example_sample_continuous))

## End(Not run)

epiforecasts/scoringutils documentation built on May 9, 2024, 12:52 a.m.