| GeoScores | R Documentation |
The function computes predictive scores for observed validation values and predicted values, optionally using predictive variances or 95 percent prediction intervals.
GeoScores(data_to_pred,
probject = NULL,
pred = NULL,
mse = NULL,
score = c("pe", "crps", "intscore", "coverage"),
lower95 = NULL,
upper95 = NULL,
threshold = 0.5,
na.rm = TRUE)
data_to_pred |
Numeric vector, matrix or array containing the observed validation values. Values are internally coerced to a numeric vector. |
probject |
Optional object of class |
pred |
Numeric vector, matrix or array of point predictions. This argument is
required when |
mse |
Optional numeric vector, matrix or array of prediction variances. Predictive
standard errors are computed as |
score |
Character vector specifying which predictive scores should be computed.
Possible values are |
lower95 |
Optional numeric vector, matrix or array containing the lower bounds of the
95 percent prediction intervals. If supplied together with
|
upper95 |
Optional numeric vector, matrix or array containing the upper bounds of the
95 percent prediction intervals. See |
threshold |
Numeric threshold used to compute the Brier score |
na.rm |
Logical. If |
GeoScores computes point prediction scores and probabilistic
predictive scores. Predictions and prediction variances can be supplied
directly through pred and mse, or indirectly through a
GeoKrig or GeoKrigloc object supplied to probject.
The point prediction option "pe" returns the mean absolute error and
the root mean squared prediction error.
The scores "brie", "lscore", "pit" and "crps"
are computed under the assumption that the predictive distribution is
Gaussian, with mean equal to the point prediction and standard deviation equal
to the predictive standard error. If mse is supplied, the predictive
standard error is sqrt(mse). If mse is absent and both
lower95 and upper95 are supplied, the predictive standard error
is approximated as
\frac{U - L}{2 \Phi^{-1}(0.975)},
where L and U are the lower and upper bounds of the 95 percent
prediction interval.
The Brier score is computed for the binary event Y > threshold, using
the Gaussian predictive probability
1 - \Phi((threshold - \hat{Y}) / s), where \hat{Y} is the
prediction and s is the predictive standard error.
The logarithmic score is returned as the negative log predictive density, averaged over validation observations:
\frac{1}{2}\left\{\log(2\pi\sigma^2) +
\left(\frac{y - \hat{y}}{\sigma}\right)^2\right\}.
The CRPS is computed using the closed-form expression for a Gaussian predictive distribution.
For "pit", if probject is a GeoKrig or
GeoKrigloc object and GeoPit is available, the PIT values are
obtained through GeoPit(probject, type = "Uniform")$data; otherwise
they are computed as pnorm(data_to_pred, mean = pred, sd = sqrt(mse)).
The 95 percent interval score is computed as
(u - l) +
\frac{2}{\alpha}(l - y) I(y < l) +
\frac{2}{\alpha}(y - u) I(y > u),
with \alpha = 0.05. If lower95 and upper95 are supplied,
they are used directly as l and u; otherwise the interval is
computed from the Gaussian predictive distribution using pred and
mse. The empirical coverage is the proportion of validation values
falling inside the corresponding 95 percent intervals.
A list containing only the components requested through score. Possible
components are:
MAE |
Mean absolute error. Returned when |
RMSPE |
Root mean squared prediction error. Returned when |
Brier |
Brier score for the binary event |
LogScore |
Mean negative log predictive density under a Gaussian predictive
distribution. Returned when |
PIT |
Probability integral transform values. Returned when |
CRPS |
Mean continuous ranked probability score under a Gaussian predictive
distribution. Returned when |
IS95 |
Mean interval score for the 95 percent prediction intervals. Returned when
|
Cvg95 |
Empirical coverage of the 95 percent prediction intervals. Returned when
|
Moreno Bevilacqua, moreno.bevilacqua89@gmail.com, https://sites.google.com/view/moreno-bevilacqua/home,
Víctor Morales Oñate, victor.morales@uv.cl, https://sites.google.com/site/moralesonatevictor/,
Christian Caamaño-Carrillo, chcaaman@ubiobio.cl, https://www.researchgate.net/profile/Christian-Caamano
Gneiting, T. and Raftery, A. E. (2007). Strictly Proper Scoring Rules, Prediction, and Estimation. Journal of the American Statistical Association, 102, 359–378.
Heaton, M. J., Datta, A., Finley, A. O., Furrer, R., Guinness, J., Guhaniyogi, R., Gerber, F., Gramacy, R. B., Hammerling, D., Katzfuss, M., Lindgren, F., Nychka, D. W., Sun, F., and Zammit-Mangion, A. (2019). A Case Study Competition Among Methods for Analyzing Large Spatial Data. Journal of Agricultural, Biological, and Environmental Statistics, 24, 398–425.
library(GeoModels)
################################################################
######### Example of predictive score computation #############
################################################################
model <- "Gaussian"
set.seed(79)
N <- 1000
x <- runif(N, 0, 1)
y <- runif(N, 0, 1)
coords <- cbind(x, y)
# Set covariance parameters
corrmodel <- "GenWend"
mean <- 0
sill <- 5
nugget <- 0
scale <- 0.2
smooth <- 0
power2 <- 4
param <- list(mean = mean, sill = sill, nugget = nugget,
scale = scale, smooth = smooth, power2 = power2)
# Simulation of the spatial Gaussian random field
data <- GeoSim(coordx = coords, corrmodel = corrmodel,
param = param)$data
# Training and validation split
sel <- sample(1:N, N * 0.8)
coords_est <- coords[sel, ]
coords_to_pred <- coords[-sel, ]
data_est <- data[sel]
data_to_pred <- data[-sel]
# Pairwise likelihood fitting
fixed <- list(nugget = nugget, smooth = smooth, power2 = power2)
start <- list(mean = 0, scale = scale, sill = 1)
I <- Inf
lower <- list(mean = -I, scale = 0, sill = 0)
upper <- list(mean = I, scale = I, sill = I)
fit <- GeoFit(data_est, coordx = coords_est,
corrmodel = corrmodel, model = model,
likelihood = "Marginal", type = "Pairwise",
neighb = 3, optimizer = "nlminb",
lower = lower, upper = upper,
start = start, fixed = fixed)
# Prediction at validation locations
pr <- GeoKrig(fit,loc = coords_to_pred, data = data_est, mse = TRUE)
# Predictive scores from pred and mse
Pr_scores <- GeoScores(data_to_pred, pred = pr$pred, mse = pr$mse,
score = c("pe", "brie", "crps", "lscore",
"pit", "intscore", "coverage"),
threshold = 0)
Pr_scores$MAE
Pr_scores$RMSPE
Pr_scores$Brier
Pr_scores$CRPS
Pr_scores$LogScore
Pr_scores$IS95
Pr_scores$Cvg95
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.