proscore | R Documentation |
Generic function and default method for computing various kinds of scores for fitted or predicted probability distributions from (regression) models.
proscore( object, newdata = NULL, na.action = na.pass, type = "loglikelihood", drop = FALSE, aggregate = FALSE, ... ) ## Default S3 method: proscore( object, newdata = NULL, na.action = na.pass, type = "loglikelihood", drop = FALSE, aggregate = FALSE, ... )
object |
a fitted model object. For the |
newdata |
optionally, a data frame in which to look for variables with which to predict and from which to obtain the response variable. If omitted, the original observations are used. |
na.action |
function determining what should be done with missing
values in |
type |
character specifying the type of score to compute. Avaible types:
|
drop |
logical. Should scores be returned in a data frame (default) or (if possible) dropped to a vector. |
aggregate |
logical or function to be used for aggregating scores across
observations. Setting |
... |
further parameters passed to the |
The function proscore
provides a unified framework for scoring
probabilistic forecasts (in-sample or out-of-sample). The following scores
are currently available, using the following notation: Y is the predicted
random variable with cumulative distribution function F(y) and probability
density function f(y). The actual observation is y.
Log-likelihood: Also known as log-score, logarithmic score, or log-density.
log(f(y))
Continuous ranked probability score (CRPS):
int { F(x) - 1(x >= y) }^2 dx
where 1(.) denotes the indicator function.
In case of a discrete rather than a continuous distribution, the ranked probability score (RPS) is defined analogously using the sum rather than the integral. In other words it is then the sum of the squared deviations between the predicted cumulative probabilities F(x) and the ideal step function for the actual observation y.
Mean absolute error (MAE):
|E(Y) - y|
where E(Y) is the expectation of the predicted random variable Y.
Mean squared error (MSE):
(E(Y) - y)^2
Internally, the default proscore
method first computes the fitted/predicted
probability distribution object using prodist
(corresponding to Y above) and then obtains the corresponding observation
y using newresponse
. Subsequently, the scores are evaluated
using either the log_pdf
method,
crps
method, or simply the mean
. Finally,
the resulting individual scores per observation can be returned as a full
data frame, or aggregated (e.g., by using mean
, sum
, or summary
, etc.).
Either a data.frame
of scores (if drop = FALSE
, default) or
a named numeric vector (if drop = TRUE
and the scores are not a matrix).
The names are the type
specified by the user (i.e., are not canonicalized
by partial matching etc.).
## Poisson regression model for FIFA 2018 data: ## number of goals scored by each team in each game, explained by ## predicted ability difference of the competing teams data("FIFA2018", package = "distributions3") m <- glm(goals ~ difference, data = FIFA2018, family = poisson) ## in-sample log-likelihood proscore(m, type = "loglik", aggregate = sum) logLik(m) ## compute mean of all available scores proscore(m, type = c("LogS", "CRPS", "MAE", "MSE"), aggregate = TRUE) ## note that "LogS" and "loglik" above are both matched to using ## the log-likelihood but the user-supplied spelling is preserved ## prediction using a new data set (final of the tournament) final <- tail(FIFA2018, 2) proscore(m, newdata = final, type = c("LogLik", "CRPS", "MAE", "MSE")) ## least-squares regression for speed and breaking distance of cars data("cars", package = "datasets") m <- lm(dist ~ speed, data = cars) ## replicate in-sample residual sum of squares (aka deviance) by taking ## the sum (rather than the mean) of the squared errors proscore(m, type = "MSE", aggregate = sum) deviance(m) ## note that the log-likelihood does not match exactly due to using ## the least-squares rather than the maximum-likelihood estimate of the ## error variance (divising by n rather than n - k) proscore(m, type = "loglikelihood", aggregate = sum) logLik(m)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.