This function assesses the goodness-of-fit of a spatial linear model by
K-fold cross-validation. In more detail, the model is re-fitted
K times by robust (or Gaussian) REML, excluding each time
1/Kth of the data. The re-fitted models are used to compute robust
(or customary) external kriging predictions for the omitted observations.
If the response variable is log-transformed, the kriging predictions
can optionally be transformed back to the original scale of the
measurements. S3 methods for evaluating and plotting diagnostic summaries
of the cross-validation errors are described for the function
validate.predictions.
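The procedure described above can be sketched in base R. This is only an illustration of the K-fold logic: lm() stands in for the (robust) REML fit by georob, predict() stands in for external kriging, and the simulated data frame d is hypothetical.

```r
## Stand-in sketch of K-fold cross-validation (assumption: lm() replaces
## georob() and predict() replaces kriging; the data are simulated)
set.seed(1)
n <- 100
d <- data.frame(x = runif(n))
d$y <- 1 + 2 * d$x + rnorm(n, sd = 0.3)

K <- 10
## random assignment of the n observations to K subsets of equal size
fold <- sample(rep(seq_len(K), length.out = n))

cv.pred <- numeric(n)
for (k in seq_len(K)) {
  test <- fold == k
  ## re-fit the model without the k-th subset
  fit.k <- lm(y ~ x, data = d[!test, ])
  ## predict the omitted observations from the re-fitted model
  cv.pred[test] <- predict(fit.k, newdata = d[test, , drop = FALSE])
}

## cross-validation errors, one per observation
cv.error <- d$y - cv.pred
```

Every observation is predicted exactly once, and always by a model that was fitted without it.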
## S3 method for class 'georob'
cv(object, formula = NULL, subset = NULL, nset = 10,
seed = NULL, sets = NULL, duplicates.in.same.set = TRUE,
re.estimate = TRUE, param = object[["param"]],
fit.param = object[["initial.objects"]][["fit.param"]],
aniso = object[["aniso"]][["aniso"]],
fit.aniso = object[["initial.objects"]][["fit.aniso"]],
return.fit = FALSE, reduced.output = TRUE, lgn = FALSE,
mfl.action = c("offset", "stop"),
ncores = min(nset, detectCores()), verbose = 0, ...)
object |
an object of class georob. |
formula |
an optional formula for the regression model passed by
|
subset |
an optional vector specifying a subset of observations to be used in the fitting process, see Details. |
nset |
positive integer defining the number K of subsets
into which the data set is partitioned (default: 10). |
seed |
optional integer seed to initialize random number generation,
see |
sets |
an optional vector of the same length as the response vector
of the fitted model and with positive integers taking values in
(1,2,…,K), defining in this way the K subsets into which
the data set is split. If |
duplicates.in.same.set |
logical controlling whether replicated
observations at a given location are assigned to the same subset when
partitioning the data (default TRUE). |
re.estimate |
logical controlling whether the model is re-fitted to
the reduced data sets before computing the kriging predictions
( |
param |
an optional named numeric vector or a matrix or data frame
with variogram parameters passed by |
fit.param |
an optional named logical vector or a matrix or data
frame defining which variogram parameters should be adjusted when passed
by |
aniso |
an optional named numeric vector or a matrix or data frame
with anisotropy parameters passed by |
fit.aniso |
an optional named logical vector or a matrix or data
frame defining which anisotropy parameters should be adjusted when passed
by |
return.fit |
logical controlling whether information about the fit
should be returned when re-estimating the model with the reduced data
sets (default FALSE). |
reduced.output |
logical controlling whether the complete fitted
model objects, fitted to the reduced data sets, are returned
( |
lgn |
logical controlling whether kriging predictions of a
log-transformed response should be transformed back to the original scale
of the measurements (default FALSE). |
mfl.action |
character controlling what is done when some levels of
factor(s) are not present in any of the subsets used to fit the model.
The function either stops ( |
ncores |
positive integer controlling how many cores are used for parallelized computations, see Details. |
verbose |
positive integer controlling logging of diagnostic
messages to the console during model fitting. Passed by
|
... |
additional arguments passed by |
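The arguments sets and duplicates.in.same.set concern how observations are assigned to the K subsets. The following base-R sketch shows one way to build such an assignment by hand so that replicated observations at the same location end up in the same subset. It is not the package's internal algorithm; the coordinates and the helper names loc.id and loc.set are hypothetical.

```r
## Sketch: assign locations (not single observations) to K subsets,
## so that replicates at one location share a subset
set.seed(1)
coords <- data.frame(x = c(0, 0, 1, 2, 2, 3, 4, 5),
                     y = c(0, 0, 1, 2, 2, 3, 4, 5))
K <- 3

## integer id of the unique location of each observation
key <- paste(coords$x, coords$y)
loc.id <- match(key, unique(key))
n.loc <- max(loc.id)

## randomly assign the unique locations to the K subsets
loc.set <- sample(rep(seq_len(K), length.out = n.loc))

## expand to one subset label per observation
sets <- loc.set[loc.id]
```

A vector built this way has the same length as the response and takes values in 1, 2, ..., K, which is the form described for the sets argument.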
Note that the data frame passed as data argument to georob must exist in
the user workspace when calling cv.georob.
cv.georob then uses the package parallel for parallelized
cross-validation. By default, the function uses K CPUs but not
more than are physically available (as returned by detectCores).
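The default for ncores shown in the usage can be reproduced as follows. The guard against a missing value is an addition for illustration, since detectCores() may return NA when the number of cores cannot be determined.

```r
library(parallel)

nset <- 10L                    # number of cross-validation subsets K

n.avail <- detectCores()       # number of cores reported by the system
if (is.na(n.avail)) n.avail <- 1L   # fall back to serial execution

## use K cores, but not more than are available
ncores <- min(nset, n.avail)
```

By default detectCores() counts logical cores; detectCores(logical = FALSE) asks for physical cores on platforms where that distinction is supported.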
cv.georob
uses the function update
to
re-estimated the model with the reduced data sets. Therefore, any
argument accepted by georob
can be changed when re-fitting
the model. Some of them (e.g. formula
, subset
, etc.) are
explicit arguments of cv.georob
, but also the remaining ones can
be passed to the function.
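How update re-fits a model with changed arguments can be illustrated with lm() as a stand-in for georob (an assumption made so that the sketch runs without the package; the data frame d is simulated). The stored call refers to the data frame by name, which is why the data must still exist in the workspace when the model is re-fitted.

```r
## Stand-in sketch: update() re-evaluates the stored call with changes
set.seed(1)
d <- data.frame(x1 = runif(50), x2 = runif(50))
d$y <- 1 + d$x1 - d$x2 + rnorm(50, sd = 0.2)

fit <- lm(y ~ x1, data = d)

## the call stores only the name of the data frame, not a copy of it
data.name <- as.character(fit$call$data)

## change the regression formula when re-fitting
fit.2 <- update(fit, . ~ . + x2)

## re-fit with a subset of the observations
fit.3 <- update(fit, subset = seq_len(40))
```

If d were removed before calling update(), the re-fit would fail because the call could no longer find the data.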
Practitioners in geostatistics commonly cross-validate a fitted model
without re-estimating the model parameters with the reduced data sets.
This is clearly an unsound practice (see Hastie et al., 2009, sec.
7.10). Therefore, the argument re.estimate should always be set
to TRUE. The alternative is provided only for historic reasons.
An object of class cv.georob, which is a list with the two
components pred and fit.
pred is a data frame with the coordinates and the
cross-validation prediction results with the following variables:
subset |
an integer vector defining to which of the K subsets an observation was assigned. |
data |
the values of the (possibly log-transformed) response. |
pred |
the kriging predictions. |
se |
the kriging standard errors. |
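To show how these variables relate to each other, the sketch below computes a few common summaries from a small hand-made data frame with the same column names. The numbers are invented for illustration, and the summaries are generic choices rather than the output of validate.predictions; standardizing by se assumes that se measures the standard deviation of the prediction error.

```r
## Hypothetical stand-in for the `pred` component (invented values)
pred <- data.frame(
  subset = c(1, 1, 2, 2, 3, 3),
  data   = c(5.1, 6.0, 5.5, 6.3, 4.9, 5.8),
  pred   = c(5.0, 6.2, 5.4, 6.0, 5.2, 5.7),
  se     = c(0.20, 0.25, 0.20, 0.30, 0.25, 0.20)
)

err  <- pred$data - pred$pred   # cross-validation errors
bias <- mean(err)               # mean error
rmse <- sqrt(mean(err^2))       # root mean squared error
z    <- err / pred$se           # standardized errors

## per-subset RMSE, using the subset labels
rmse.by.subset <- tapply(err^2, pred$subset, function(e2) sqrt(mean(e2)))
```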
If lgn = TRUE then pred has the additional variables:
lgn.data |
the untransformed response. |
lgn.pred |
the unbiasedly back-transformed predictions of a log-transformed response. |
lgn.se |
the kriging standard errors of the back-transformed predictions of a log-transformed response. |
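Why a back-transformation has to be corrected for bias can be seen from the lognormal distribution alone. The sketch below uses the textbook result that exp(Y) has mean exp(mu + sigma^2 / 2) when Y is normal with mean mu and standard deviation sigma. It is not the formula used by the package for kriging predictions, which also involves the kriging variances; mu and sigma are arbitrary illustrative values.

```r
## Textbook lognormal mean versus naive back-transformation
mu    <- 1.2
sigma <- 0.5

naive    <- exp(mu)                  # ignores the variance on the log scale
unbiased <- exp(mu + 0.5 * sigma^2)  # mean of exp(Y) for Y ~ N(mu, sigma^2)

## numerical check of the mean of exp(Y)
numeric.mean <- integrate(function(y) exp(y) * dnorm(y, mu, sigma),
                          -Inf, Inf)$value
```

The naive value exp(mu) is the median of exp(Y) and is always smaller than its mean.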
The second component fit contains either the full outputs of
georob, fitted for the K reduced data sets
(reduced.output = FALSE), or K lists with the components
tuning.psi, converged, convergence.code, gradient, variogram.model,
param, aniso$aniso, coefficients along with the standard errors of
the estimated regression coefficients β̂, see georobObject.
Andreas Papritz andreas.papritz@env.ethz.ch
Hastie, T., Tibshirani, R. and Friedman, J. (2009) The Elements of Statistical Learning; Data Mining, Inference and Prediction. New York: Springer-Verlag.
validate.predictions for computing statistics of the cross-validation errors;
georob for (robust) fitting of spatial linear models;
georobObject for a description of the class georob;
predict.georob for computing robust kriging predictions.
## Not run:
library(georob)
data(meuse, package = "sp")  # the meuse data set is provided by package sp
r.logzn <- georob(log(zinc) ~ sqrt(dist), data = meuse, locations = ~ x + y,
variogram.model = "exponential",
param = c( variance = 0.15, nugget = 0.05, scale = 200 ),
tuning.psi = 1)
r.logzn.cv.1 <- cv(r.logzn, seed = 1, lgn = TRUE )
r.logzn.cv.2 <- cv(r.logzn, formula = .~. + ffreq, seed = 1, lgn = TRUE )
plot(r.logzn.cv.1, type = "bs")
plot(r.logzn.cv.2, type = "bs", add = TRUE, col = "red")
legend("topright", lty = 1, col = c( "black", "red"), bty = "n",
legend = c("log(Zn) ~ sqrt(dist)", "log(Zn) ~ sqrt(dist) + ffreq"))
## End(Not run)