scoring: Score Individual Quantiles

Description

Score quantile forecasts against realizations.

Usage

scoreq(y, yhat, tau, w = 1, g = identity, cmb = TRUE, wtau = NULL,
  na_omit = TRUE)

score_eval(y, yhat, sc, w = 1, cmb = TRUE, na_omit = TRUE,
  se = FALSE)

Arguments

y

Vector of realizations

yhat

Forecast matrix. Each row should be a forecast (of quantiles) with corresponding realization in y; columns represent the level/index of the quantile (corresponding to tau).

tau

Vector of quantile levels (indices), one for each column of yhat.

w

Vector of weights corresponding to the observations in yhat. No need to normalize them. See details to see exactly how these weights are used.

g

Increasing, vectorized univariate function used to transform y and yhat (see the "Details" section for how it enters the computation). Default is the identity.

cmb

Logical; should the scores be combined (via average)? TRUE if so, FALSE to output a score matrix for each observation (rows) and each quantile level (column).

wtau

Function that accepts a vector of quantile indices and returns a vector of weights of the same length, by which the corresponding individual quantile scores are multiplied. NULL for equal weights.

na_omit

Logical; should observations leading to an NA score (for any tau) be removed? TRUE by default. Warning message appears when observations are removed.

sc

The scoring rule to use, as in the output of the function scorer.

se

Logical; should an estimate of the standard error of the mean estimate be returned as well? TRUE if so (which also overrides the cmb argument, which is taken to be TRUE).

Details

scoreq is deprecated. It doesn't allow for the computation of standard error, and if asked to return the matrix of scores, it would incorporate the across-observation weights into the scores, whereas score_eval does not.

Here's how the score for the i'th observation and the k'th quantile forecast for that observation is computed:

wtau(τ_k) (τ_k - I(y_i < yhat_{ik})) (g(y_i) - g(yhat_{ik})),

which is a proper scoring rule as shown in Gneiting and Raftery (2007).
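The formula above can be sketched in base R as follows. This is a minimal illustration, not the package's implementation: the helper name quantile_score_matrix is hypothetical, and treating "equal weights" (wtau = NULL) as a weight of 1 for every quantile level is an assumption.

```r
# Hypothetical helper (not part of the package): builds the matrix of
# individual quantile scores, one row per observation, one column per tau.
quantile_score_matrix <- function(y, yhat, tau, g = identity, wtau = NULL) {
  yhat <- as.matrix(yhat)
  stopifnot(length(y) == nrow(yhat), length(tau) == ncol(yhat))
  # Across-quantile weights; assumed to be 1 for every level when wtau is NULL.
  wt <- if (is.null(wtau)) rep(1, length(tau)) else wtau(tau)
  # Indicator I(y_i < yhat_ik); y is recycled down each column of yhat.
  ind <- (y < yhat) * 1
  # tau_k - I(y_i < yhat_ik), with tau applied across columns.
  lhs <- sweep(-ind, 2, tau, "+")
  # g(y_i) - g(yhat_ik).
  rhs <- g(y) - g(yhat)
  # Multiply column k by wtau(tau_k).
  sweep(lhs * rhs, 2, wt, "*")
}

# Two observations, two quantile levels.
y    <- c(1, 3)
yhat <- rbind(c(0, 2), c(2, 5))
tau  <- c(0.25, 0.75)
quantile_score_matrix(y, yhat, tau)
```

Every entry is non-negative: when y_i lies below the forecast, both factors are negative, and when it lies at or above the forecast, both are non-negative.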

To get a score for a particular observation, the average (not the sum) is taken across each row. The scores aren't summed, so that the score doesn't tend to infinity as more and more quantiles are included. Also, the across-quantile weights, determined by the function wtau, are not normalized, so that the individual scores don't tend to 0 as more quantiles are included.
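A small base R sketch of why rows are averaged rather than summed. The score matrix here is made-up illustrative data, and the variable names are hypothetical.

```r
# Illustrative score matrix: rows are observations, columns are quantile levels.
scores <- rbind(c(0.25, 0.25),
                c(0.25, 0.50))

# Per-observation score: the average (not the sum) across quantile levels.
obs_score <- rowMeans(scores)

# Doubling the number of quantile columns (here by simple duplication)
# leaves the row averages unchanged, whereas the row sums double.
scores_more <- cbind(scores, scores)
rowMeans(scores_more)
rowSums(scores_more)
```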

Value

If se is TRUE, returns a named vector of length two containing the average score (weighted by argument w) and the standard error of that average. The standard error is estimated by assuming iid scores: it is the standard deviation of the scores times the square root of the sum of squares of the normalized weights w.
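The standard error described above can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's code: the helper name weighted_score_se and the element names "score" and "se" are hypothetical, and the standard deviation is assumed to be the ordinary unweighted sample standard deviation given by sd().

```r
# Hypothetical helper: weighted average score and its iid standard error.
weighted_score_se <- function(obs_score, w = 1) {
  # Recycle a scalar weight to one weight per observation.
  w <- rep_len(w, length(obs_score))
  # Normalize the weights so that they sum to 1.
  wn <- w / sum(w)
  c(score = sum(wn * obs_score),
    # sd of the scores times the root sum of squares of normalized weights.
    se = sd(obs_score) * sqrt(sum(wn^2)))
}

weighted_score_se(c(1, 2, 3, 4))
```

With equal weights, each normalized weight is 1/n, so the standard error reduces to the familiar sd / sqrt(n).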

Here's what is output if se is FALSE (always the case with the deprecated scoreq function). If cmb is FALSE, returns the score matrix (see details for how each score is computed), in which rows correspond to observations and columns correspond to quantile indices tau. Otherwise, returns a single numeric score that is the average of the score matrix.

Note

You could consider having the transformation function g transform each observation differently, by having it accept a vector of length equal to your data. This is useful to account for seasonal trends, for example.
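A minimal base R sketch of such an observation-specific transformation. The names season_scale and g_seasonal and the data are hypothetical, and it is an assumption that g is applied either to the whole yhat matrix or to one column at a time; R's recycling gives the same result in both cases. A per-observation scaling is used rather than a per-observation shift, because a shift would cancel in g(y) - g(yhat).

```r
# Hypothetical per-observation scale, for example a seasonal spread.
# Assumed positive so that g stays increasing for every observation.
season_scale <- c(1, 2, 4)

# g divides observation i by its own scale; for a matrix, the vector is
# recycled down each column, so row i is divided by season_scale[i].
g_seasonal <- function(x) x / season_scale

y    <- c(2, 4, 8)
yhat <- cbind(c(1, 2, 4), c(3, 6, 12))

# Scaled forecast errors g(y_i) - g(yhat_ik), as they enter the score.
d <- g_seasonal(y) - g_seasonal(yhat)
d
```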

References

Gneiting, T. and Raftery, A. E. (2007). Strictly proper scoring rules, prediction, and estimation. Journal of the American Statistical Association, 102(477), 359-378.


vincenzocoia/cmc documentation built on Nov. 18, 2019, 12:04 a.m.