Description Usage Arguments Details Value Note Author(s) References See Also Examples
Given parameters of a scoring rule family, calculate scores for probabilistic forecasts and associated outcomes.
1 2 3 4 5 6 7 8 
object 
an object of class "formula", of the form

outcome 
a vector of outcomes (used if 
fam 
scoring rule family. 
param 
for family 
data 
an optional data frame or list containing the
variables in the formula. If not found in 
bounds 
a vector of length 2 corresponding to the desired
minimum value and maximum value of the scoring rule, respectively. Entries of 
reverse 
if 
ordered 
if 
... 
Additional arguments. 
The formula is of the form outcome ~ forecast
, where forecast
describes the column(s) containing forecasts associated with the possible outcomes. Multiple columns are separated by +
. outcome
is always a vector describing the outcome associated with each forecast. It should be coded 1, 2, ..., reflecting the column associated with the outcome (see examples).
For events with only two alternatives, one can take a shortcut and
supply only forecasts associated with a single outcome (if baseline
parameters are specified for families pow
and sph
, the
parameter for only that outcome should also be supplied). In this case, the outcome
vector should contain zeros and ones, where βoneβ means that the forecasted alternative occurred.
If ordered=TRUE
, an "ordered" scoring rule is obtained using the
strategy proposed by Jose, Nau, & Winkler (2009). These ordered rules
are only useful when the number of forecasted alternatives is greater
than two (i.e., when one uses family pow
or sph
).
If baseline parameters are not supplied for families pow
or
sph
, then the parameters are taken to be equal across all
alternatives (though the natural scaling of the scoring rule differs
depending on whether or not one explicitly supplies equal baseline
parameters).
If desired, a unique scoring rule can be applied to each row of the
forecast matrix: the param
argument can be supplied as a matrix.
When the bounds
argument is supplied, the code attempts to scale the scores so that the maximum score is bounds[2]
and the minimum score is bounds[1]
. This scaling cannot be accomplished when the scoring rule allows scores of infinity (the log score is the most common case here). If reverse=TRUE
, the bounds are applied after the reversal (so that the supplied lower bound reflects the worst score and upper bound reflects the best score).
calcscore
returns a numeric vector that has length equal to
length(outcome)
, containing scores under the selected scoring rule.
The beta family was originally proposed by Buja et al.\ (2005); the power and pseudospherical families with baseline are described by Jose et al.\ (2009). A discussion of choosing specific rules from these families is provided by Merkle and Steyvers (2013).
Some notable special cases of these families are:
Beta family: Log score when parameters are (0,0); Brier score when parameters are (1,1).
Power family with baseline parameters all equal (to 1/(number of alternatives)): The family approaches the log score as gamma goes to 1 (but the family is undefined for gamma=1). The Brier score is obtained for gamma=2.
Pseudospherical family with baseline parameters all equal: The family approaches the log score as gamma goes to 1 (but the family is undefined for gamma=1). The spherical score is obtained for gamma=2.
Ed Merkle
Buja, A., Stuetzle, W., & Shen, Y. (2005). Loss functions for binary class probability estimation and classification: Structure and applications. (Obtained from http://stat.wharton.upenn.edu/~buja/PAPERS/)
Jose, V. R. R., Nau, R. F., & Winkler, R. L. (2008). Scoring rules, generalized entropy, and utility maximization. Operations Research, 56, 1146β1157.
Jose, V. R. R., Nau, R. F., & Winkler, R. L. (2009). Sensitivity to distance and baseline distributions in forecast evaluation. Management Science, 55, 582β590.
Merkle, E. C. & Steyvers, M. (in press). Choosing a strictly proper scoring rule. Decision Analysis.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64  ## Brier scores for two alternatives, with bounds of 0 and 1
data("WorldEvents")
scores < calcscore(answer ~ forecast, fam="beta",
param=c(1,1), data=WorldEvents,
bounds=c(0,1))
## Calculate Brier scores manually
scores.man < with(WorldEvents, (forecast  answer)^2)
## Comparison
all.equal(scores, scores.man)
## Average Brier score for each forecaster
with(WorldEvents, tapply(scores, forecaster, mean))
## Brier scores for 3 alternatives, with bounds of 0 and 1
data("WeatherProbs")
scores2 < calcscore(tcat ~ tblw + tnrm + tabv, fam="pow",
param=2, data=WeatherProbs,
bounds=c(0,1))
## Spherical scores for 3 alternatives, reversed so 0 is worst and
## 1 is best
scores3 < calcscore(tcat ~ tblw + tnrm + tabv, fam="sph",
param=2, data=WeatherProbs,
bounds=c(0,1), reverse=TRUE)
## Replicate Jose, Nau, & Winkler, 2009, Figure 1
r2 < seq(0, .6, .05)
r < cbind(.4, r2, .6  r2)
j < rep(1, length(r2))
## Panel 1
quad < calcscore(j ~ r, fam="pow", param=2, bounds=c(1,1), reverse=TRUE)
quadbase < calcscore(j ~ r, fam="pow", param=c(2,.3,.6,.1), reverse=TRUE)
rankquad < calcscore(j ~ r, fam="pow", param=2, ordered=TRUE, reverse=TRUE)
rankquadbase < calcscore(j ~ r, fam="pow", param=c(2,.3,.6,.1), ordered=TRUE,
reverse=TRUE)
plot(r2, quad, ylim=c(2,1), type="l", ylab="Quadratic scores")
lines(r2, quadbase, lty=2)
lines(r2, rankquad, type="o", pch=22)
lines(r2, rankquadbase, type="o", pch=2)
## Panel 2
sph < calcscore(j ~ r, fam="sph", param=2, reverse=TRUE, bounds=c(1.75,1))
sphbase < calcscore(j ~ r, fam="sph", param=c(2,.3,.6,.1), reverse=TRUE)
ranksph < calcscore(j ~ r, fam="sph", param=2, ordered=TRUE, reverse=TRUE)
ranksphbase < calcscore(j ~ r, fam="sph", param=c(2,.3,.6,.1), ordered=TRUE,
reverse=TRUE)
plot(r2, sph, ylim=c(1,.6), type="l", ylab="Spherical scores")
lines(r2, sphbase, lty=2)
lines(r2, ranksph, type="o", pch=22)
lines(r2, ranksphbase, type="o", pch=2)
## Panel 3
lg < calcscore(j ~ r, fam="pow", param=1.001, reverse=TRUE)
lgbase < calcscore(j ~ r, fam="pow", param=c(1.001,.3,.6,.1), reverse=TRUE)
ranklg < calcscore(j ~ r, fam="pow", param=1.001, ordered=TRUE, reverse=TRUE)
ranklgbase < calcscore(j ~ r, fam="pow", param=c(1.001,.3,.6,.1),
ordered=TRUE, reverse=TRUE)
plot(r2, lg, ylim=c(2,1), type="l", ylab="Log scores")
lines(r2, lgbase, lty=2)
lines(r2, ranklg, type="o", pch=22)
lines(r2, ranklgbase, type="o", pch=2)

[1] "Mean relative difference: 9.537645e07"
1 2 3 4 5 6 7
0.42143333 0.02136665 0.27086667 0.33333333 0.24000000 0.21496667 0.28623333
Warning message:
In calcscore.default(object = list(tblw = c(0.4341, 0.4243, 0.4443, :
Forecasts in some rows do not sum to 1; they were scaled to sum to 1.
Warning message:
In calcscore.default(object = list(tblw = c(0.4341, 0.4243, 0.4443, :
Forecasts in some rows do not sum to 1; they were scaled to sum to 1.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.