summarise_scores: Summarise scores as produced by 'score()'

View source: R/summarise_scores.R

summarise_scores    R Documentation

Summarise scores as produced by score()

Description

Summarise scores as produced by score()

Usage

summarise_scores(
  scores,
  by = NULL,
  fun = mean,
  relative_skill = FALSE,
  relative_skill_metric = "auto",
  metric = deprecated(),
  baseline = NULL,
  ...
)

summarize_scores(
  scores,
  by = NULL,
  fun = mean,
  relative_skill = FALSE,
  relative_skill_metric = "auto",
  metric = deprecated(),
  baseline = NULL,
  ...
)

Arguments

scores

A data.table of scores as produced by score().

by

character vector with column names to summarise scores by. Default is NULL, meaning that the only summary that takes place is summarising over samples or quantiles (in the case of quantile-based forecasts), such that there is one score per forecast as defined by the unit of a single forecast (rather than one score for every sample or quantile). The unit of a single forecast is determined by the columns present in the input data that do not correspond to a metric produced by score(), which indicate a grouping of forecasts (for example, there may be one forecast per day, location and model). Adding additional, unrelated columns may alter results in an unpredictable way.

fun

a function used for summarising scores. Default is mean.

relative_skill

logical, whether or not to compute relative performance between models based on pairwise comparisons. If TRUE (default is FALSE), then a column called 'model' must be present in the input data. For more information on the computation of relative skill, see pairwise_comparison(). Relative skill will be calculated for the aggregation level specified in by.

relative_skill_metric

character with the name of the metric for which relative skill shall be computed. If equal to 'auto' (the default), then this will be either interval score, CRPS or Brier score (depending on which of these is available in the input data).

metric

[Deprecated] Deprecated in 1.1.0. Use relative_skill_metric instead.

baseline

character string with the name of a model. If a baseline is given, then a scaled relative skill with respect to the baseline will be returned. By default (NULL), relative skill will not be scaled with respect to a baseline model.

...

additional parameters that can be passed to the summary function provided to fun. For more information see the documentation of the respective function.
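
The interplay of by, fun and ... can be illustrated with a small base R sketch. This is not the scoringutils implementation: toy_scores, its values and the helper toy_summarise() are hypothetical and exist only to show how a summary function is applied within groups and how extra arguments such as na.rm are forwarded to it.

```r
# Toy scores: one row per forecast, identified by model and location.
# All values are made up for illustration.
toy_scores <- data.frame(
  model = c("A", "A", "B", "B"),
  location = c("x", "y", "x", "y"),
  interval_score = c(1, 3, 2, NA)
)

# Hypothetical helper (not part of scoringutils): apply `fun` to each
# metric column within the groups defined by `by`, forwarding `...` to `fun`.
toy_summarise <- function(scores, by, metrics, fun = mean, ...) {
  aggregate(scores[metrics], by = scores[by], FUN = fun, ...)
}

# na.rm = TRUE is passed through `...` to mean(), so the missing score
# for model "B" is dropped before averaging.
by_model <- toy_summarise(
  toy_scores,
  by = "model", metrics = "interval_score",
  fun = mean, na.rm = TRUE
)

# Without na.rm, the summary for model "B" is NA.
by_model_na <- toy_summarise(
  toy_scores,
  by = "model", metrics = "interval_score"
)
```

In summarise_scores() the same idea applies: any argument not matched by name is handed on to the function given as fun.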

Examples

library(scoringutils)
data.table::setDTthreads(1) # only needed to avoid issues on CRAN
library(magrittr) # pipe operator

scores <- score(example_continuous)
summarise_scores(scores)


# summarise over samples or quantiles to get one score per forecast
scores <- score(example_quantile)
summarise_scores(scores)

# get scores by model
summarise_scores(scores, by = c("model"))

# get scores by model and target type
summarise_scores(scores, by = c("model", "target_type"))

# get standard deviation
summarise_scores(scores, by = "model", fun = sd)

# round to 2 significant digits
summarise_scores(scores, by = c("model")) %>%
  summarise_scores(fun = signif, digits = 2)

# get quantiles of scores
# make sure to aggregate over ranges first
summarise_scores(scores,
  by = "model", fun = quantile,
  probs = c(0.25, 0.5, 0.75)
)

# get ranges
# summarise_scores(scores, by = "range")
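
The examples above do not exercise relative_skill or baseline. A minimal sketch of how they might be combined is given below, using only the arguments listed under Usage. It assumes that "EuroCOVIDhub-baseline" is one of the models in example_quantile; substitute any model name present in your own data.

```r
library(scoringutils)

scores <- score(example_quantile)

# Relative skill per model and target type, based on pairwise comparisons
# and scaled with respect to a baseline model.
# Assumption: "EuroCOVIDhub-baseline" is a model in example_quantile.
rel <- summarise_scores(
  scores,
  by = c("model", "target_type"),
  relative_skill = TRUE,
  baseline = "EuroCOVIDhub-baseline"
)
rel
```

Because relative skill is calculated at the aggregation level specified in by, the column 'model' has to be part of by here.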

scoringutils documentation built on Feb. 16, 2023, 7:30 p.m.