scoring-functions-binary | R Documentation |
Brier score
The Brier Score is the mean squared error between the probabilistic prediction and the observed outcome. The Brier score is a proper scoring rule. Small values are better (best is 0, the worst is 1).
\textrm{Brier\_Score} = (\textrm{prediction} - \textrm{outcome})^2,
where \textrm{outcome} \in \{0, 1\}
, and
\textrm{prediction} \in [0, 1]
represents
the probability that the outcome is equal to 1.
Log score for binary outcomes
The Log Score is the negative logarithm of the probability assigned to the observed value. It is a proper scoring rule. Small values are better (best is zero, worst is infinity).
brier_score(observed, predicted)
logs_binary(observed, predicted)
observed |
A factor of length n with exactly two levels, holding
the observed values.
The highest factor level is assumed to be the reference level. This means
that |
predicted |
A numeric vector of length n, holding probabilities.
Values represent the probability that the corresponding outcome is equal to
the highest level of the factor |
The functions require users to provide observed values as a factor in order
to distinguish its input from the input format required for scoring point
forecasts. Internally, however, factors will be converted to numeric values.
A factor observed = factor(c(0, 1, 1, 0, 1)
with two levels (0
and 1
)
would internally be coerced to a numeric vector (in this case this would
result in the numeric vector c(1, 2, 2, 1, 1)
). After subtracting 1, the
resulting vector (c(0, 1, 1, 0)
in this case) is used for internal
calculations. All predictions are assumed represent the probability that the
outcome is equal of the last/highest factor level (in this case that the
outcome is equal to 1).
You could alternatively also provide a vector like
observed = factor(c("a", "b", "b", "a"))
(with two levels, a
and b
),
which would result in exactly the same internal representation. Probabilities
then represent the probability that the outcome is equal to "b".
If you want your predictions to be probabilities that the outcome is "a",
then you could of course make observed
a factor with levels swapped, i.e.
observed = factor(c("a", "b", "b", "a"), levels = c("b", "a"))
A numeric vector of size n with the Brier scores
A numeric vector of size n with log scores
Overview of required input format for binary and point forecasts
Other log score functions:
logs_nominal()
,
logs_sample()
observed <- factor(sample(c(0, 1), size = 30, replace = TRUE))
predicted <- runif(n = 30, min = 0, max = 1)
brier_score(observed, predicted)
logs_binary(observed, predicted)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.