scores_sample_univ_weighted: Weighted Scoring Rules for Simulated Forecast Distributions

scores_sample_univ_weightedR Documentation

Weighted Scoring Rules for Simulated Forecast Distributions

Description

Calculate weighted scores given observations and draws from univariate predictive distributions. The weighted scoring rules that are available are the threshold-weighted CRPS, outcome-weighted CRPS, and conditional and censored likelihood scores.

Usage

twcrps_sample(
  y,
  dat,
  a = -Inf,
  b = Inf,
  chain_func = NULL,
  w = NULL,
  show_messages = TRUE
)

owcrps_sample(
  y,
  dat,
  a = -Inf,
  b = Inf,
  weight_func = NULL,
  w = NULL,
  show_messages = TRUE
)

clogs_sample(
  y,
  dat,
  a = -Inf,
  b = Inf,
  bw = NULL,
  show_messages = FALSE,
  cens = TRUE
)

Arguments

y

vector of realized values.

dat

vector or matrix (depending on y; see details) of simulation draws from forecast distribution.

a

numeric lower bound for the indicator weight function w(z) = 1{a < z < b}.

b

numeric upper bound for the indicator weight function w(z) = 1{a < z < b}.

chain_func

function used to target particular outcomes in the threshold-weighted CRPS; the default corresponds to the weight function w(z) = 1{a < z < b}.

w

optional; vector or matrix (matching dat) of ensemble weights. Note that these weights are not used in the weighted scoring rules; see details.

show_messages

logical; display of messages (does not affect warnings and errors).

weight_func

function used to target particular outcomes in the outcome-weighted CRPS; the default corresponds to the weight function w(z) = 1{a < z < b}.

bw

optional; vector (matching y) of bandwidths for kernel density estimation for clogs_sample; see details.

cens

logical; if TRUE, clogs_sample returns the censored likelihood score; if FALSE, clogs_sample returns the conditional likelihood score.

Details

For a vector y of length n, dat should be given as a matrix with n rows. If y has length 1, then dat may be a vector.

twcrps_sample transforms y and dat using the chaining function chain_func and then calls crps_sample. owcrps_sample weights y and dat using the weight function weight_func and then calls crps_sample. See the documentation for crps_sample for further details.

The default weight function used in the weighted scores is w(z) = 1{a < z < b}, which is equal to one if z is between a and b, and zero otherwise. This weight function emphasises outcomes between a and b, and is commonly used in practical applications when interest is on values above a threshold (set b = Inf and a equal to the threshold) or below a threshold (set a = -Inf and b equal to the threshold).

Alternative weight functions can also be employed using the chain_func and weight_func arguments to twcrps_sample and owcrps_sample, respectively. Computation of the threshold-weighted CRPS for samples from a predictive distribution requires a chaining function rather than a weight function. This is why a chaining function is an input for twcrps_sample whereas a weight function is an input for owcrps_sample. Since clogs_sample requires kernel density estimation to approximate the forecast density, it cannot readily be calculated for arbitrary weight functions, and is thus only available for the canonical weight function w(z) = 1{a < z < b}.

The chain_func and weight_func arguments are functions that will be applied to the vector y and the columns of dat. It is assumed that these functions are vectorised. Both functions must take a vector as an input and output a vector of the same length, containing the weight (for weight_func) or transformed value (for chain_func) corresponding to each element in the input vector. An error will be returned if weight_func returns negative values, and a warning message will appear if chain_func is not increasing.

If no custom argument is given for a, b, chain_func or weight_func, then both twcrps_sample and owcrps_sample are equivalent to the standard unweighted crps_sample, and clogs_sample is equivalent to logs_sample.

The w argument is also present in the unweighted scores (e.g. crps_sample). w is used to weight the draws from the predictive distribution, and does not weight particular outcomes within the weighted scoring rules. This should not be confused with the weight_func argument, which is used within the weighted scores.

Value

Value of the score. A lower score indicates a better forecast.

Author(s)

Sam Allen

References

Allen, S. (2024): ‘Weighted scoringRules: Emphasising Particular Outcomes when Evaluating Probabilistic Forecasts’, Journal of Statistical Software. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.18637/jss.v110.i08")}

Threshold-weighted CRPS:

Gneiting, T. and R. Ranjan (2011): ‘Comparing density forecasts using threshold-and quantile-weighted scoring rules’, Journal of Business & Economic Statistics 29, 411-422. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1198/jbes.2010.08110")}

Allen, S., Ginsbourger, D. and J. Ziegel (2023): ‘Evaluating forecasts for high-impact events using transformed kernel scores’, SIAM/ASA Journal on Uncertainty Quantification 11, 906-940. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1137/22M1532184")}

Outcome-weighted CRPS:

Holzmann, H. and B. Klar (2017): ‘Focusing on regions of interest in forecast evaluation’, Annals of Applied Statistics 11, 2404-2431. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1214/17-AOAS1088")}

Conditional and censored likelihood scores:

Diks, C., Panchenko, V. and D. Van Dijk (2011): ‘Likelihood-based scoring rules for comparing density forecasts in tails’, Journal of Econometrics 163, 215-230. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1016/j.jeconom.2011.04.001")}

See Also

scores_sample_univ for standard (unweighted) scores based on simulated forecast distributions. scores_sample_multiv_weighted for weighted scores based on simulated multivariate forecast distributions.

Examples

## Not run: 

y <- rnorm(10)
sample_fc <- matrix(rnorm(100), nrow = 10)

crps_sample(y = y, dat = sample_fc)
twcrps_sample(y = y, dat = sample_fc)
owcrps_sample(y = y, dat = sample_fc)

logs_sample(y = y, dat = sample_fc)
clogs_sample(y = y, dat = sample_fc)
clogs_sample(y = y, dat = sample_fc, cens = FALSE)

# emphasise outcomes above 0
twcrps_sample(y = y, dat = sample_fc, a = 0)
owcrps_sample(y = y, dat = sample_fc, a = 0)
clogs_sample(y = y, dat = sample_fc, a = 0)
clogs_sample(y = y, dat = sample_fc, a = 0, cens = FALSE)

# emphasise outcomes below 0
twcrps_sample(y = y, dat = sample_fc, b = 0)
owcrps_sample(y = y, dat = sample_fc, b = 0)
clogs_sample(y = y, dat = sample_fc, b = 0) 

# emphasise outcomes between -1 and 1
twcrps_sample(y = y, dat = sample_fc, a = -1, b = 1)
owcrps_sample(y = y, dat = sample_fc, a = -1, b = 1)
clogs_sample(y = y, dat = sample_fc, a = -1, b = 1)


# a must be smaller than b 
twcrps_sample(y = y, dat = sample_fc, a = 1, b = -1) # error
owcrps_sample(y = y, dat = sample_fc, a = 0, b = 0) # error
clogs_sample(y = y, dat = sample_fc, a = 10, b = 9) # error

# a and b must be single numeric values (not vectors)
twcrps_sample(y = y, dat = sample_fc, a = rnorm(10)) # error


# the owCRPS is not well-defined if none of dat are between a and b
y <- rnorm(10)
sample_fc <- matrix(runif(100, -5, 1), nrow = 10)
owcrps_sample(y = y, dat = sample_fc, a = 1)
# the twCRPS is zero if none of y and dat are between a and b
twcrps_sample(y = y, dat = sample_fc, a = 1) 


# alternative custom weight and chaining functions can also be used

# Example 1: a Gaussian weight function with location mu and scale sigma
mu <- 0
sigma <- 0.5
weight_func <- function(x) pnorm(x, mu, sigma)
# or weight_func <- get_weight_func("norm_cdf", mu, sigma)
# a corresponding chaining function is
chain_func <- function(x) (x - mu)*pnorm(x, mu, sigma) + (sigma^2)*dnorm(x, mu, sigma)
# or chain_func <- get_weight_func("norm_cdf", mu, sigma, weight = FALSE)

x <- seq(-2, 2, 0.01)
plot(x, weight_func(x), type = "l") # positive outcomes are given higher weight
plot(x, chain_func(x), type = "l") 

owcrps_sample(y = y, dat = sample_fc, a = mu)
owcrps_sample(y = y, dat = sample_fc, weight_func = weight_func)
twcrps_sample(y = y, dat = sample_fc, a = mu)
twcrps_sample(y = y, dat = sample_fc, chain_func = chain_func)


# Example 2: a sigmoid (or logistic) weight function with location mu and scale sigma
weight_func <- function(x) plogis(x, mu, sigma)
# or weight_func <- get_weight_func("logis_cdf", mu, sigma)
chain_func <- function(x) sigma*log(exp((x - mu)/sigma) + 1)
# or chain_func <- get_weight_func("logis_cdf", mu, sigma, weight = FALSE)

x <- seq(-2, 2, 0.01)
plot(x, weight_func(x), type = "l") # positive outcomes are given higher weight
plot(x, chain_func(x), type = "l") 

owcrps_sample(y = y, dat = sample_fc, a = mu)
owcrps_sample(y = y, dat = sample_fc, weight_func = weight_func)
twcrps_sample(y = y, dat = sample_fc, a = mu)
twcrps_sample(y = y, dat = sample_fc, chain_func = chain_func)


# Example 3: the weight function w(z) = 1{z < a or z > b}
a <- -1
b <- 1
weight_func <- function(x) as.numeric(x < a | x > b)
chain_func <- function(x) (x < a)*(x - a) + (x > b)*(x - b) + a

x <- seq(-2, 2, 0.01)
plot(x, weight_func(x), type = "l")
plot(x, chain_func(x), type = "l")

owcrps_sample(y = y, dat = sample_fc, weight_func = weight_func)
twcrps_sample(y = y, dat = sample_fc, chain_func = chain_func)
twcrps_sample(y = y, dat = sample_fc, b = -1) + twcrps_sample(y = y, dat = sample_fc, a = 1)
crps_sample(y = y, dat = sample_fc) - twcrps_sample(y = y, dat = sample_fc, a = -1, b = 1)

## End(Not run)


scoringRules documentation built on Sept. 18, 2024, 5:09 p.m.