Functions to compute and plot summary statistics of prediction errors. They serve to (cross-)validate fitted spatial linear models by the criteria that Gneiting et al. (2007) proposed for assessing probabilistic forecasts.
validate.predictions(data, pred, se.pred,
statistic = c("crps", "pit", "mc", "bs", "st"), ncutoff = NULL)
## S3 method for class 'cv.georob'
plot(x,
type = c("sc", "lgn.sc", "ta", "qq", "hist.pit", "ecdf.pit", "mc", "bs"),
smooth = TRUE, span = 2/3, ncutoff = NULL, add = FALSE,
col, pch, lty, main, xlab, ylab, ...)
## S3 method for class 'cv.georob'
print(x, digits = max(3, getOption("digits") - 3), ...)
## S3 method for class 'cv.georob'
summary(object, se = FALSE, ...)

data 
a numeric vector with observations about a response (mandatory argument). 
pred 
a numeric vector with predictions for the response (mandatory argument). 
se.pred 
a numeric vector with prediction standard errors (mandatory argument). 
statistic 
character keyword defining what statistic of the prediction errors should be computed. Possible values are (see Details):

ncutoff 
positive integer (N) giving the number of quantiles,
for which CDFs are evaluated ( 
x, object 
objects of class 
digits 
positive integer indicating the number of decimal digits to print. 
type 
character keyword defining what type of plot is created by the

smooth 
control whether scatter plots of data vs. predictions
should be smoothed by 
span 
smoothness parameter for loess (see 
add 
logical controlling whether the current high-level plot is
added to an existing plot without clearing the frame first (default:

main, xlab, ylab 
title and axes labels of plot. 
col, pch, lty 
color, symbol and line type. 
se 
logical controlling if the standard errors of the averaged
continuous ranked probability score and of the mean and dispersion
statistics of the prediction errors (see Details) are computed
from the respective values computed for the K cross-validation
subsets (default: 
... 
additional arguments passed to the methods. 
validate.predictions computes the items required to evaluate (and plot) the diagnostic criteria proposed by Gneiting et al. (2007) for assessing the calibration and the sharpness of probabilistic predictions of (cross-)validation data. To this end, validate.predictions assumes that the prediction errors Y(s) - \hat{Y}(s) follow normal distributions with zero mean and standard deviations equal to the Kriging standard errors. This assumption is only an approximation if the errors ε come from a long-tailed distribution. Furthermore, for the time being, the Kriging variance of the response Y is approximated by adding the estimated nugget \hat{\tau}^2 to the Kriging variance of the signal Z. This approximation likely underestimates the mean squared prediction error of the response if the errors come from a long-tailed distribution. Hence, for robust Kriging, the standard errors of the (cross-)validation errors are likely too small.
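The approximation described above can be illustrated with a few lines of base R. This is only a sketch under stated assumptions: the Kriging variances of the signal, the nugget estimate and the predictions are made-up numbers, and the object names (krig.var.signal, nugget, lower, upper) are hypothetical and not part of georob.

```r
## Hypothetical Kriging variances of the signal Z at three locations
krig.var.signal <- c(0.040, 0.055, 0.070)
## Hypothetical estimate of the nugget \hat{\tau}^2
nugget <- 0.05

## Approximate Kriging variance of the response Y: signal variance plus nugget
krig.var.response <- krig.var.signal + nugget
se.pred <- sqrt(krig.var.response)

## Under the Gaussian assumption for the prediction errors, 95% prediction
## intervals for Y follow from normal quantiles
pred <- c(5.9, 6.3, 5.5)   # hypothetical predictions
lower <- pred + qnorm(0.025) * se.pred
upper <- pred + qnorm(0.975) * se.pred
cbind(pred, se.pred, lower, upper)
```

If the errors are long-tailed, intervals built this way are likely too narrow, which is the caveat stated above.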
Notwithstanding these difficulties and imperfections, validate.predictions computes
the probability integral transform (PIT),
PIT_i = F_i(y_i),
where F_i(y_i) denotes the (plug-in) predictive CDF evaluated at y_i, the value of the i-th (cross-)validation datum,
the average predictive CDF (plug-in)
\bar{F}_n(y) = \frac{1}{n} \sum_{i=1}^n F_i(y),
where n is the number of (cross-)validation observations and the F_i are evaluated at N quantiles equal to the set of distinct y_i (or a subset of size N of them),
the Brier score (plug-in)
BS(y) = \frac{1}{n} \sum_{i=1}^n (F_i(y) - I(y_i \leq y))^2,
where I(x) is the indicator function for the event x, and the Brier score is again evaluated at the unique values of the (cross-)validation observations (or a subset of size N of them),
the averaged continuous ranked probability score, CRPS, a strictly proper scoring criterion to rank predictions, which is related to the Brier score by
CRPS = \int_{-\infty}^{\infty} BS(y) dy.
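The quantities listed above can be sketched in base R for Gaussian predictive distributions. This is an illustration on simulated data, not the code used by validate.predictions; the object names (Fmat, Imat, crps.i, crps.num) are hypothetical. The CRPS uses the well-known closed form for a normal predictive distribution, and the last lines check it for one datum against the integral definition.

```r
set.seed(1)
n <- 200
pred <- rnorm(n, mean = 5, sd = 1)          # hypothetical predictions
se.pred <- runif(n, 0.4, 0.8)               # hypothetical prediction standard errors
y <- rnorm(n, mean = pred, sd = se.pred)    # simulated validation data

## PIT_i = F_i(y_i) with Gaussian predictive CDFs
pit <- pnorm(y, mean = pred, sd = se.pred)

## Cutoffs: the sorted distinct observations
z <- sort(unique(y))

## Fmat[i, j] = F_i(z_j) and Imat[i, j] = I(y_i <= z_j)
Fmat <- sapply(z, function(q) pnorm(q, mean = pred, sd = se.pred))
Imat <- outer(y, z, "<=")

fbar <- colMeans(Fmat)              # average predictive CDF
ghat <- colMeans(Imat)              # empirical CDF of the data
bs <- colMeans((Fmat - Imat)^2)     # Brier score at each cutoff

## CRPS of each datum (closed form for a normal predictive CDF), then averaged
u <- (y - pred) / se.pred
crps.i <- se.pred * (u * (2 * pnorm(u) - 1) + 2 * dnorm(u) - 1 / sqrt(pi))
crps <- mean(crps.i)

## Numerical check of the integral definition for the first datum
i <- 1
crps.num <-
  integrate(function(t) pnorm(t, pred[i], se.pred[i])^2, -Inf, y[i])$value +
  integrate(function(t) (1 - pnorm(t, pred[i], se.pred[i]))^2, y[i], Inf)$value
all.equal(crps.num, crps.i[i], tolerance = 1e-3)
```

Integrating the Brier score over a grid of cutoffs instead of using the closed form would give only an approximation whose quality depends on the grid.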
Gneiting et al. (2007) proposed the following plots to validate probabilistic predictions:
A histogram (or a plot of the empirical CDF) of the PIT values. For ideal predictions, with observed coverages of prediction intervals matching nominal coverages, the PIT values have a uniform distribution.
Plots of \bar{F}_n(y) and of the empirical CDF of the data, say \hat{G}_n(y), and of their difference, \bar{F}_n(y) - \hat{G}_n(y), vs. y. The forecasts are said to be marginally calibrated if \bar{F}_n(y) and \hat{G}_n(y) match.
A plot of BS(y) vs. y. Probabilistic predictions are said to be sharp if the area under this curve, which equals CRPS, is minimized.
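A minimal base-graphics sketch of these three kinds of plot, again on simulated data with Gaussian predictive distributions. It is not the plot method for class cv.georob, and all object names are hypothetical.

```r
set.seed(4)
n <- 300
pred <- rnorm(n, mean = 5, sd = 1)          # hypothetical predictions
se.pred <- runif(n, 0.4, 0.8)               # hypothetical prediction standard errors
y <- rnorm(n, mean = pred, sd = se.pred)    # simulated validation data

pit <- pnorm(y, mean = pred, sd = se.pred)
z <- sort(unique(y))
Fmat <- sapply(z, function(q) pnorm(q, mean = pred, sd = se.pred))
Imat <- outer(y, z, "<=")
fbar <- colMeans(Fmat)              # average predictive CDF
ghat <- colMeans(Imat)              # empirical CDF of the data
bs <- colMeans((Fmat - Imat)^2)     # Brier score

op <- par(mfrow = c(2, 2))
## PIT histogram: roughly flat for well-calibrated predictions
h <- hist(pit, breaks = seq(0, 1, by = 0.1), main = "PIT histogram", xlab = "PIT")
## Marginal calibration: average predictive CDF vs. empirical CDF
plot(z, fbar, type = "l", main = "Marginal calibration", xlab = "y", ylab = "CDF")
lines(z, ghat, col = "red", lty = "dashed")
## Difference of the two CDFs
plot(z, fbar - ghat, type = "l", main = "Difference", xlab = "y", ylab = "Fbar - Ghat")
abline(h = 0, lty = "dotted")
## Brier score as a function of the cutoff
plot(z, bs, type = "l", main = "Brier score", xlab = "y", ylab = "BS(y)")
par(op)
```

Because the data are simulated from the predictive distributions themselves, the PIT histogram should look roughly uniform and the two CDFs should nearly coincide.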
The plot method for class cv.georob creates these plots, along with scatter plots of observations and predictions, and Tukey-Anscombe and normal QQ plots of the standardized prediction errors.
summary.cv.georob computes the mean and dispersion statistics of the (standardized) prediction errors (by a call to validate.predictions with argument statistic = "st", see Value) and the averaged continuous ranked probability score (crps). If present in the cv.georob object, the error statistics are also computed for the errors of the unbiasedly back-transformed predictions of a log-transformed response. If se is TRUE, these statistics are evaluated separately for the K cross-validation subsets, and the standard errors of the means of these statistics are returned as well.
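The idea behind se = TRUE can be sketched as follows: a statistic is computed within each of the K subsets, and the standard error of its mean over the subsets is taken as the standard deviation divided by sqrt(K). This is an assumption about the usual way to form such a standard error, not a transcript of summary.cv.georob; the subset labels and errors are simulated, and the names fold, me.k and rmse.k are hypothetical.

```r
set.seed(2)
K <- 10
n <- 150
fold <- sample(rep(seq_len(K), length.out = n))   # hypothetical subset labels
err <- rnorm(n, sd = 0.5)                         # hypothetical prediction errors

## Statistic per subset, here the mean error and the RMSE
me.k <- tapply(err, fold, mean)
rmse.k <- tapply(err, fold, function(e) sqrt(mean(e^2)))

## Mean over the K subsets and standard error of that mean
stat <- rbind(
  me   = c(mean = mean(me.k),   se = sd(me.k) / sqrt(K)),
  rmse = c(mean = mean(rmse.k), se = sd(rmse.k) / sqrt(K))
)
stat
```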
The print method for class cv.georob returns the mean and dispersion statistics of the (standardized) prediction errors.
Depending on the argument statistic, the function validate.predictions returns
a numeric vector of PIT values if statistic is equal to "pit",
a named numeric vector with summary statistics of the (standardized) prediction errors if statistic is equal to "st". The following statistics are computed:
me      mean prediction error
mede    median prediction error
rmse    root mean squared prediction error
made    median absolute prediction error
qne     Qn dispersion measure of prediction errors (see Qn)
msse    mean squared standardized prediction error
medsse  median squared standardized prediction error
a data frame if statistic is equal to "mc" or "bs" with the components (see Details):
z     the sorted unique (cross-)validation observations (or a subset of size ncutoff of them)
ghat  the empirical CDF of the (cross-)validation observations \hat{G}_n(y)
fbar  the average predictive distribution \bar{F}_n(y)
bs    the Brier score BS(y)
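The summary statistics returned for statistic = "st" can be mimicked in base R as sketched below. The sketch assumes that the prediction error is the observation minus the prediction, as in Details, and follows the verbal definitions listed above; the exact conventions used inside validate.predictions may differ. The Qn dispersion measure (qne) is left out because it needs a function from another package. The data are simulated.

```r
set.seed(3)
n <- 100
pred <- rnorm(n, mean = 6, sd = 0.7)        # hypothetical predictions
se.pred <- rep(0.4, n)                      # hypothetical prediction standard errors
data <- rnorm(n, mean = pred, sd = se.pred) # simulated validation observations

err <- data - pred            # prediction errors (assumed sign convention)
std.err <- err / se.pred      # standardized prediction errors

st <- c(
  me     = mean(err),             # mean prediction error
  mede   = median(err),           # median prediction error
  rmse   = sqrt(mean(err^2)),     # root mean squared prediction error
  made   = median(abs(err)),      # median absolute prediction error
  msse   = mean(std.err^2),       # mean squared standardized prediction error
  medsse = median(std.err^2)      # median squared standardized prediction error
)
round(st, 3)
```

For well-calibrated Gaussian predictions, msse should be close to 1, which makes it a quick check on the prediction standard errors.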
Andreas Papritz
Gneiting, T., Balabdaoui, F. and Raftery, A. E. (2007) Probabilistic forecasts, calibration and sharpness. Journal of the Royal Statistical Society Series B, 69, 243–268.
georob for (robust) fitting of spatial linear models;
cv.georob for assessing the goodness of a fit by georob.
## Not run:
library(georob)
data(meuse, package = "sp")

## robust fit of a spatial linear model for log(zinc)
r.logzn <- georob(log(zinc) ~ sqrt(dist), data = meuse, locations = ~ x + y,
    variogram.model = "RMexp",
    param = c(variance = 0.15, nugget = 0.05, scale = 200),
    tuning.psi = 1)

## cross-validate the fitted model and a model with the extra covariate ffreq
r.logzn.cv.1 <- cv(r.logzn, seed = 1, lgn = TRUE)
r.logzn.cv.2 <- cv(r.logzn, formula = .~. + ffreq, seed = 1, lgn = TRUE)

summary(r.logzn.cv.1, se = TRUE)
summary(r.logzn.cv.2, se = TRUE)

## compare both cross-validations in a 2 x 2 panel of diagnostic plots
op <- par(mfrow = c(2,2))
## scatter plots of observations vs. back-transformed predictions
plot(r.logzn.cv.1, type = "lgn.sc")
plot(r.logzn.cv.2, type = "lgn.sc", add = TRUE, col = "red")
abline(0, 1, lty = "dotted")
## Tukey-Anscombe plots
plot(r.logzn.cv.1, type = "ta")
plot(r.logzn.cv.2, type = "ta", add = TRUE, col = "red")
abline(h = 0, lty = "dotted")
## marginal calibration plots
plot(r.logzn.cv.1, type = "mc")
plot(r.logzn.cv.2, type = "mc", add = TRUE, col = "red")
## Brier score plots
plot(r.logzn.cv.1, type = "bs")
plot(r.logzn.cv.2, type = "bs", add = TRUE, col = "red")
legend("topright", lty = 1, col = c("black", "red"), bty = "n",
    legend = c("log(Zn) ~ sqrt(dist)", "log(Zn) ~ sqrt(dist) + ffreq"))
par(op)
## End(Not run)
