transform_forecasts: Transform forecasts and observed values
In epiforecasts/scoringutils: Utilities for Scoring and Assessing Predictions

Description

Function to transform forecasts and observed values before scoring.

Usage

transform_forecasts(
  forecast,
  fun = log_shift,
  append = TRUE,
  label = "log",
  ...
)

Arguments

`forecast`	A forecast object (a validated data.table with predicted and observed values).
`fun`	A function used to transform both observed values and predictions. The default function is `log_shift()`, a custom function that is essentially the same as `log()`, but has an additional argument (`offset`) that allows you to add an offset before applying the logarithm. This is often helpful as the natural log transformation is not defined at zero. A common and pragmatic solution is to add a small offset to the data before applying the log transformation. In our work we have often used an offset of 1, but the precise value will depend on your application.
`append`	Logical, defaults to `TRUE`. Whether or not to append a transformed version of the data to the currently existing data. If `TRUE`, the data gets transformed and appended to the existing data, making it possible to use the outcome directly in `score()`. An additional column, 'scale', gets created that denotes which rows are untransformed ('scale' has the value "natural") and which have been transformed ('scale' has the value passed to the argument `label`).
`label`	A string for the newly created 'scale' column to denote the newly transformed values. Only relevant if `append = TRUE`.
`...`	Additional parameters to pass to the function you supplied. For the default option of `log_shift()` this could be the `offset` argument.
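
To make the roles of `fun` and `...` concrete, here is a minimal base R sketch. `log_shift_sketch()` and `apply_transform()` are hypothetical stand-ins written for illustration, not the package's implementation. They only assume that the transformation is called as `fun(x, ...)`, as the argument descriptions above suggest.

```r
# Hypothetical stand-in for log_shift(): add an offset, then take the natural log.
log_shift_sketch <- function(x, offset = 0) {
  log(x + offset)
}

# Hypothetical helper mimicking how extra arguments are forwarded via `...`.
apply_transform <- function(x, fun, ...) {
  fun(x, ...)
}

counts <- c(0, 9, 99)

# Without an offset, a zero count maps to -Inf.
no_offset <- apply_transform(counts, log_shift_sketch)

# With offset = 1, the zero maps to log(1) = 0 and all values stay finite.
with_offset <- apply_transform(counts, log_shift_sketch, offset = 1)
```

In this sketch the offset is supplied by name after `fun`, which mirrors how `offset = 1` is passed through `...` in the examples further down.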

Details

Depending on the circumstances, there are a few reasons why transforming forecasts and observed values before scoring might be desirable (see the linked reference for more information). In epidemiology, for example, it may be useful to log-transform incidence counts before evaluating forecasts using scores such as the weighted interval score (WIS) or the continuous ranked probability score (CRPS). Log-transforming forecasts and observations changes the interpretation of the score from a measure of absolute distance between forecast and observation to a score that evaluates a forecast of the exponential growth rate. Another motivation can be to apply a variance-stabilising transformation or to standardise incidence counts by population.
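
The change in interpretation can be illustrated with a small base R sketch. It uses the absolute error of a point forecast as a simplified stand-in for WIS or CRPS, and the numbers are made up for illustration.

```r
# Two hypothetical point forecasts, each off by a factor of two,
# at very different incidence levels.
observed  <- c(100, 1000)
predicted <- c(200, 2000)

# On the natural scale, the absolute error grows with the incidence level.
ae_natural <- abs(predicted - observed)

# After log-transforming both, the error depends only on the ratio
# predicted / observed, so both forecasts are penalised equally.
ae_log <- abs(log(predicted) - log(observed))
```

Here `ae_natural` differs by a factor of ten between the two forecasts, while both entries of `ae_log` equal `log(2)`.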

Note that if you want to apply a transformation, it is important to transform the forecasts and observations and then apply the score. Applying a transformation after the score risks losing propriety of the proper scoring rule.
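
As a rough illustration of why the order matters, the base R sketch below again uses the absolute error of a made-up point forecast as a simplified stand-in for a proper scoring rule. It only shows that the two orderings compute different quantities. It is not a demonstration of propriety itself.

```r
# Hypothetical point forecast and observation.
observed  <- 100
predicted <- 150

# Recommended order: transform both values first, then score.
score_of_transformed <- abs(log(predicted) - log(observed))

# Reversed order: score on the natural scale, then transform the score.
transformed_score <- log(abs(predicted - observed))
```

The first quantity equals `log(1.5)`, the second equals `log(50)`, so transforming the score afterwards is not a substitute for scoring transformed data.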

Value

A forecast object containing either only the transformed version of the data (`append = FALSE`) or both the untransformed and the transformed data (`append = TRUE`). When the transformed data are appended, there will be one additional column, 'scale', which is set to "natural" for the untransformed forecasts and to the value of `label` for the transformed forecasts.
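
The shape of the returned data can be sketched in base R with a toy data frame. `transform_sketch()` and `toy` are hypothetical and only imitate the behaviour described above. In particular, the sketch assumes that no 'scale' column is added when `append = FALSE`, and it does not perform any of the validation a real forecast object goes through.

```r
# Hypothetical toy data standing in for a forecast object.
toy <- data.frame(
  model     = c("A", "A"),
  observed  = c(3, 8),
  predicted = c(4, 6)
)

# Hypothetical helper imitating the described effect of `append` and `label`.
transform_sketch <- function(data, fun, append = TRUE, label = "log", ...) {
  transformed <- data
  transformed$observed  <- fun(data$observed, ...)
  transformed$predicted <- fun(data$predicted, ...)
  if (!append) {
    # Only the transformed values are returned.
    return(transformed)
  }
  # Mark the original rows and the transformed rows, then stack them.
  data$scale <- "natural"
  transformed$scale <- label
  rbind(data, transformed)
}

both <- transform_sketch(toy, sqrt, label = "sqrt")
only <- transform_sketch(toy, sqrt, append = FALSE)
```

`both` has twice as many rows as `toy`, with 'scale' distinguishing the "natural" rows from the "sqrt" rows, whereas `only` keeps the original number of rows.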

Author(s)

Nikos Bosse nikosbosse@gmail.com

References

Transformation of forecasts for evaluating predictive performance in an epidemiological context. Nikos I. Bosse, Sam Abbott, Anne Cori, Edwin van Leeuwen, Johannes Bracher, Sebastian Funk. medRxiv 2023.01.23.23284722. doi: 10.1101/2023.01.23.23284722. https://www.medrxiv.org/content/10.1101/2023.01.23.23284722v1

Examples

library(magrittr) # pipe operator

# transform forecasts using the natural logarithm
# negative values need to be handled (here by replacing them with 0)
example_quantile %>%
  .[, observed := ifelse(observed < 0, 0, observed)] %>%
  as_forecast_quantile() %>%
# Here we use the default function log_shift() which is essentially the same
# as log(), but has an additional argument (offset) that allows you to add an
# offset before applying the logarithm.
  transform_forecasts(append = FALSE) %>%
  head()

# alternatively, integrating the truncation in the transformation function:
example_quantile %>%
  as_forecast_quantile() %>%
  transform_forecasts(
    fun = function(x) log_shift(pmax(0, x)),
    append = FALSE
  ) %>%
  head()

# specifying an offset for the log transformation removes the
# warning caused by zeros in the data
example_quantile %>%
  as_forecast_quantile() %>%
  .[, observed := ifelse(observed < 0, 0, observed)] %>%
  transform_forecasts(offset = 1, append = FALSE) %>%
  head()

# adding square root transformed forecasts to the original ones
example_quantile %>%
  .[, observed := ifelse(observed < 0, 0, observed)] %>%
  as_forecast_quantile() %>%
  transform_forecasts(fun = sqrt, label = "sqrt") %>%
  score() %>%
  summarise_scores(by = c("model", "scale"))

# adding multiple transformations
example_quantile %>%
  as_forecast_quantile() %>%
  .[, observed := ifelse(observed < 0, 0, observed)] %>%
  transform_forecasts(fun = log_shift, offset = 1) %>%
  transform_forecasts(fun = sqrt, label = "sqrt") %>%
  head()

epiforecasts/scoringutils documentation built on June 11, 2025, 11:29 p.m.