georob: Robust Fitting of Spatial Linear Models
In georob: Robust Geostatistical Analysis of Spatial Data

View source: R/georob_exported_functions.R

georob

R Documentation

Robust Fitting of Spatial Linear Models

Description

The function georob fits a linear model with spatially correlated errors to geostatistical data that are possibly contaminated by independent outliers. The regression coefficients and the parameters of the variogram model are estimated by robust or Gaussian restricted maximum likelihood (REML) or by Gaussian maximum likelihood (ML).

Usage

georob(formula, data, subset, weights, na.action, model = TRUE,
    x = FALSE, y = FALSE, contrasts = NULL, offset, locations,
    variogram.model = c("RMexp", "RMaskey", "RMbessel", "RMcauchy",
        "RMcircular", "RMcubic", "RMdagum", "RMdampedcos", "RMdewijsian",
        "RMfbm", "RMgauss", "RMgencauchy", "RMgenfbm", "RMgengneiting",
        "RMgneiting", "RMlgd", "RMmatern", "RMpenta", "RMqexp",
        "RMspheric", "RMstable", "RMwave", "RMwhittle"),
    param, fit.param = default.fit.param()[names(param)],
	  aniso = default.aniso(), fit.aniso = default.fit.aniso(),
    variogram.object = NULL,
    tuning.psi = 2, control = control.georob(),
    verbose = 0, ...)

Arguments

`formula`	a symbolic description of the regression model for the external drift to be fit (mandatory argument). See `lm` and `formula` for more details.
`data`	an optional data frame, a `SpatialPointsDataFrame`, list or environment (or another object coercible by `as.data.frame` to a data frame) containing the variables in the model and the coordinates where the data was recorded. If not found in `data`, the variables are taken from `environment(formula)`, typically the environment from which `georob` is called.
`subset`	an optional vector specifying a subset of observations to be used in the fitting process.
`weights`	an optional vector of weights to be used in the fitting process, currently ignored.
`na.action`	a function which indicates what should happen when the data contain `NA`s. The default is set by the `na.action` argument of `options`, and is `na.fail` if that is unset. The “factory-fresh” default is `na.omit`. Another possible value is `NULL`, no action. Value `na.exclude` can be useful.
`model`, `x`, `y`	logical scalars. If `TRUE` the corresponding components of the fit (the model frame, the model matrix, the response) are returned. The model frame is augmented by the coordinates.
`contrasts`	an optional list. See the `contrasts.arg` of `model.matrix.default`.
`offset`	this optional argument can be used to specify an a priori known component to be included in the linear predictor during fitting. An `offset` term can be included in the formula instead or as well, and if both are specified their sum is used.
`locations`	a one-sided formula defining the variables that are used as coordinates of the locations were the data was recorded (mandatory argument).
`variogram.model`	a character keyword defining the variogram model to be fitted. Currently, most basic variogram models provided formerly by the now archived package RandomFields can be fitted (see Details and `gencorr`).
`param`	a named numeric vector with initial values of the variogram parameters (mandatory argument). The names of `param` are matched against the following names (see Details and `georobPackage` for information about the parametrization of variogram models): `variance`: variance (sill `\sigma^2`) of the auto-correlated component of the Gaussian random field `B(\boldsymbol{s})`. `snugget`: variance (spatial nugget `\sigma^2_{\mathrm{n}}`) of the seemingly spatially uncorrelated component of `B(\boldsymbol{s})` (micro-scale spatial variation; default value `snugget = 0`). `nugget`: variance (nugget `\tau^2`) of the independent errors `\varepsilon(\boldsymbol{s})`. `scale`: range parameter (`\alpha`) of the variogram. names of additional variogram parameters such as the smoothness parameter `\nu` of the Whittle-Matérn model (see `gencorr` and `param.names`).
`fit.param`	a named logical vector (or a function such as `default.fit.param` that creates this vector) with the same names as used for `param`, defining which parameters are adjusted (`TRUE`) and which are kept fixed at their initial values (`FALSE`) when fitting the model.
`aniso`	a named numeric vector with initial values (or a function such as `default.aniso` that creates this vector) for fitting geometrically anisotropic variogram models. The names of `aniso` are matched against the following names (see Details and `georobPackage` for information about the parametrization of variogram models): `f1`: ratio `f_1` of lengths of second and first semi-principal axes of an ellipsoidal surface with constant semi-variance in `\mathrm{I}\!\mathrm{R}^3` (default `f1 = 1`). `f2`: ratio `f_2` of lengths of third and first semi-principal axes of the semi-variance ellipsoid (default `f2 = 1`). `omega`: azimuth in degrees of the first semi-principal axis of the semi-variance ellipsoid (default `omega = 90`). `phi`: 90 degrees minus latitude of the first semi-principal axis of the semi-variance ellipsoid (default `phi = 90`). `zeta`: angle in degrees between the second semi-principal axis and the direction of the line defined by the intersection between the `x`-`y`-plane and the plane orthogonal to the first semi-principal axis of the semi-variance ellipsoid through the origin (default `zeta = 0`).
`fit.aniso`	a named logical vector (or a function such as `default.fit.aniso` that creates this vector) with the same names as used for `aniso`, defining which parameters are adjusted (`TRUE`) and which are kept fixed at their initial values (`FALSE`) when fitting the model.
`variogram.object`	an optional list that defines a possibly nested variogram model. Each component is itself a list with the following components: `variogram.model`: a mandatory character keyword defining the variogram model, see respective argument above. `param`: a mandatory named numeric vector with initial values of the variogram parameters, see respective argument above. `fit.param`: an optional named logical vector defining which parameters are adjusted, see respective argument above. `aniso`: an optional named numeric vector with initial values for fitting geometrically anisotropic variogram models, see respective argument above. `fit.param`: an optional named logical vector defining which anisotropy parameters are adjusted, see respective argument above. Note that the arguments `variogram.model`, `param`, `fit.param`, `aniso` and `fit.aniso` are ignored when `variogram.object` is passed to `georob`.
`tuning.psi`	positive numeric. The tuning constant `c` of the `\psi_c`-function of the robust REML algorithm.
`control`	a list specifying parameters that control the behaviour of `georob`. Use the function `control.georob` and see its help page for the components of `control`.
`verbose`	positive integer controlling logging of diagnostic messages to the console during model fitting. `verbose = 0` largely suppresses such messages and `verbose = 4` asks for most verbose output (see `control` arguments of `nleqslv`, `nlminb` and `optim` and `control.georob` for information how to fine tuning diagnostic output generated by `nleqslv`, `nlminb` and `optim`).
`...`	further arguments passed to function (e.g. `object.` used internally for updating `georob` objects).

Details

georob fits a spatial linear model by robust (Künsch et al., 2011, Künsch et al., in preparation) or Gaussian RE(ML) (Harville, 1977). georobPackage describes the employed model and briefly sketches the robust REML estimation and the robust external drift Kriging method. Here, we describe further details of georob.

Implemented variograms

Currently, most basic variogram models provided formerly by the now archived package RandomFields can be fitted by georob (see argument variogram.model and gencorr for a list of implemented models). Some of these models have in addition to variance, snugget, nugget and scale further parameters. Initial values of these parameters (param) and fitting flags (fit.param) must be passed to georob by the same names as used for the models RM... in gencorr. Use the function param.names to list additional parameters of a given variogram.model.

The arguments fit.param and fit.aniso are used to control what variogram and anisotropy parameters are estimated and which are kept at the constant initial values. The functions
default.fit.param and default.fit.aniso set reasonable default values for these arguments. Note, as an aside, that the function default.aniso sets (default) values of the anisotropy parameters for an isotropic variogram.

Estimating parameters of power function variogram

The intrinsic variogram model RMfbm is over-parametrized when both the variance (plus possibly snugget) and the scale are estimated. Therefore, to estimate the parameters of this model, scale must be kept fixed at an arbitrary value by using fit.param["scale"] = FALSE.

Estimating parameters of geometrically anisotropic variograms

The subsection Model of georobPackage describes how such models are parametrized and gives definitions the various elements of aniso. Some additional remarks might be helpful:

The first semi-principal axis points into the direction with the farthest reaching auto-correlation, which is described by the range parameter scale (\alpha).
The ranges in the direction of the second and third semi-principal axes are given by f_1\alpha and f_2 \alpha, with 0 < f_2 \leq f_1 \leq 1.
The default values for aniso (f_1=1, f_2=1) define an isotropic variogram model.
Valid ranges for the angles characterizing the orientation of the semi-variance ellipsoid are (in degrees): \omega [0, 180], \phi [0, 180], \zeta [-90, 90].

Estimating variance of micro-scale variation

Simultaneous estimation of the variance of the micro-scale variation (snugget, \sigma_\mathrm{n}^2), appears seemingly as spatially uncorrelated with a given sampling design, and of the variance (nugget, \tau^2) of the independent errors requires that for some locations \boldsymbol{s}_i replicated observations are available. Locations less or equal than zero.dist apart are thereby considered as being coincident (see control.georob).

Constraining estimates of variogram parameters

Parameters of variogram models can vary only within certain bounds (see param.bounds and gencorr for allowed ranges). georob uses three mechanisms to constrain parameter estimates to permissible ranges:

Parameter transformations: By default, all variance (variance, snugget, nugget), the range scale, the anisotropy parameters f1 and f2 and many of the additional parameters are log-transformed before solving the estimating equations or maximizing the restricted log-likelihood and this warrants that the estimates are always positive (see control.georob for detailed explanations how to control parameter transformations).
Checking permissible ranges: The additional parameters of the variogram models such as the smoothness parameter \nu of the Whittle-Matérn model are forced to stay in the permissible ranges by signalling an error to nleqslv, nlminb or optim if the current trial values are invalid. These functions then graciously update the trial values of the parameters and carry their task on. However, it is clear that such a procedure likely gets stuck at a point on the boundary of the parameter space and is therefore just a workaround for avoiding runtime errors due to invalid parameter values.
Exploiting the functionality of nlminb and optim: If a spatial model is fitted non-robustly, then the arguments lower, upper (and method of optim) can be used to constrain the parameters (see control.optim how to pass them to optim). For optim one has to use the arguments method = "L-BFGS-B", lower = l, upper = u, where l and u are numeric vectors with the lower and upper bounds of the transformed parameters in the order as they appear in
c(variance, snugget, nugget, scale, ...)[fit.param], aniso[fit.aniso]),
where ... are additional parameters of isotropic variogram models (use
param.names(variogram.model) to display the names and the order of the additional parameters for variogram.model). For nlminb one has to use the arguments lower = l, upper = u, where l and u are numeric vectors as described above.

Computing robust initial estimates of parameters for robust REML

To solve the robustified estimating equations for \boldsymbol{B} and \boldsymbol{\beta} the following initial estimates are used:

\widehat{\boldsymbol{B}}= \boldsymbol{0}, if this turns out to be infeasible, initial values can be passed to georob by the argument bhat of control.georob.
\widehat{\boldsymbol{\beta}} is either estimated robustly by the function lmrob, rq or non-robustly by lm (see argument initial.fixef of control.georob).

Finding the roots of the robustified estimating equations of the variogram and anisotropy parameters is more sensitive to a good choice of initial values than maximizing the Gaussian (restricted) log-likelihood with respect to the same parameters. If the initial values for param and aniso are not sufficiently close to the roots of the system of nonlinear equations, then nleqslv may fail to find them. Setting initial.param = TRUE (see control.georob) allows one to find initial values that are often sufficiently close to the roots so that nleqslv converges. This is achieved by:

Initial values of the regression parameters are computed by lmrob irrespective of the choice for initial.fixef (see control.georob).
Observations with “robustness weights” of the lmrob fit, satisfying
\psi_c(\widehat{\varepsilon}_i/\widehat{\tau})/(\widehat{\varepsilon}_i/\widehat{\tau}) \leq \mbox{\code{min.rweight}}, are discarded (see control.georob).
The model is fit to the pruned data set by Gaussian REML using nlminb or optim.
The resulting estimates of the variogram parameters (param, aniso) are used as initial estimates for the subsequent robust fit of the model by nleqslv.

Note that for step 3 above, initial values of param and aniso must be provided to georob.

Estimating variance parameters by Gaussian (RE)ML

Unlike robust REML, where robustified estimating equations are solved for the variance parameters nugget (\tau^2), variance (\sigma^2), and possibly snugget (\sigma_{\mathrm{n}}^2), for Gaussian (RE)ML the variances can be re-parametrized to

the signal variance \sigma_B^2 = \sigma^2 + \sigma_{\mathrm{n}}^2,
the inverse relative nugget \eta = \sigma_B^2 / \tau^2 and
the relative auto-correlated signal variance \xi = \sigma^2/\sigma_B^2.

georob maximizes then a (restricted) profile log-likelihood that depends only on \eta, \xi, \alpha, ..., and \sigma_B^2 is estimated by an explicit expression that depends on these parameters (e.g. Diggle and Ribeiro, 2006, p. 113). This is usually more efficient than maximizing the (restricted) log-likelihood with respect to the original variance parameters \tau^2, \sigma_{\mathrm{n}}^2 and \sigma^2. georob chooses the parametrization automatically, but the user can control it by the argument reparam of the function control.georob.

Value

An object of class georob representing a robust (or Gaussian) (RE)ML fit of a spatial linear model. See georobObject for the components of the fit.

Author(s)

Andreas Papritz papritz@retired.ethz.ch
with contributions by Cornelia Schwierz.

References

Diggle, P. J. and Ribeiro, P. J. R. (2006) Model-based Geostatistics. Springer, New York, \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1007/978-0-387-48536-2")}.

Harville, D. A. (1977) Maximum likelihood approaches to variance component estimation and to related problems, Journal of the American Statistical Association, 72, 320–340, \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1080/01621459.1977.10480998")}.

Künsch, H. R., Papritz, A., Schwierz, C. and Stahel, W. A. (in preparation) Robust Geostatistics.

Künsch, H. R., Papritz, A., Schwierz, C. and Stahel, W. A. (2011) Robust estimation of the external drift and the variogram of spatial data. Proceedings of the ISI 58th World Statistics Congress of the International Statistical Institute. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.3929/ethz-a-009900710")}

Examples

################
## meuse data ##
################
data(meuse)

## Gaussian REML fit
r.logzn.reml <- georob(log(zinc) ~ sqrt(dist), data = meuse, locations = ~ x + y,
    variogram.model = "RMexp",
    param = c(variance = 0.15, nugget = 0.05, scale = 200),
    tuning.psi = 1000)
summary(r.logzn.reml, correlation = TRUE)
plot(r.logzn.reml, lag.dist.def = seq(0, 2000, by = 100))

## robust REML fit
if(interactive()){
  ## example is run only in interactive session because cpu times exceeds 5 s
  r.logzn.rob <- update(r.logzn.reml, tuning.psi = 1)

  summary(r.logzn.rob, correlation = TRUE)
  lines(r.logzn.rob, col = "red")
}


###################
## wolfcamp data ##
###################
data(wolfcamp)

## fitting isotropic IRF(0) model

r.irf0.iso <- georob(pressure ~ 1, data = wolfcamp, locations = ~ x + y,
    variogram.model = "RMfbm",
    param = c(variance = 10, nugget = 1500, scale = 1, alpha = 1.5),
    fit.param = default.fit.param(scale = FALSE, alpha = TRUE),
    tuning.psi = 1000)

summary(r.irf0.iso)
plot(r.irf0.iso, lag.dist.def = seq(0, 200, by = 7.5))

## fitting anisotropic IRF(0) model
if(interactive()){
  ## example is run only in interactive session because cpu times exceeds 5 s
  r.irf0.aniso <- georob(pressure ~ 1, data = wolfcamp, locations = ~ x + y,
      variogram.model = "RMfbm",
      param = c(variance = 5.9, nugget = 1450, scale = 1, alpha = 1),
      fit.param = default.fit.param(scale = FALSE, alpha = TRUE),
      aniso = default.aniso(f1 = 0.51, omega = 148.),
      fit.aniso = default.fit.aniso(f1 = TRUE, omega = TRUE),
      tuning.psi = 1000)
  summary(r.irf0.aniso)
  plot(r.irf0.aniso, lag.dist.def = seq(0, 200, by = 7.5),
      xy.angle.def = c(0, 22.5, 67.5, 112.5, 157.5, 180.),
      add = TRUE, col = 2:5)

  pchisq(2*(r.irf0.aniso[["loglik"]] - r.irf0.iso[["loglik"]]), 2, lower = FALSE)
}

georob documentation built on April 4, 2025, 2:11 a.m.

georob index

Tutorial and Manual for georob

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

georob
Robust Geostatistical Analysis of Spatial Data

georob: Robust Fitting of Spatial Linear Models
In georob: Robust Geostatistical Analysis of Spatial Data

Robust Fitting of Spatial Linear Models

Description

Usage

Arguments

Details

Implemented variograms

Estimating parameters of power function variogram

Estimating parameters of geometrically anisotropic variograms

Estimating variance of micro-scale variation

Constraining estimates of variogram parameters

Computing robust initial estimates of parameters for robust REML

Estimating variance parameters by Gaussian (RE)ML

Value

Author(s)

References

See Also

Examples

Related to georob in georob...

R Package Documentation

Browse R Packages

We want your feedback!

georob Robust Geostatistical Analysis of Spatial Data

georob: Robust Fitting of Spatial Linear Models In georob: Robust Geostatistical Analysis of Spatial Data

Robust Fitting of Spatial Linear Models

Description

Usage

Arguments

Details

Implemented variograms

Estimating parameters of power function variogram

Estimating parameters of geometrically anisotropic variograms

Estimating variance of micro-scale variation

Constraining estimates of variogram parameters

Computing robust initial estimates of parameters for robust REML

Estimating variance parameters by Gaussian (RE)ML

Value

Author(s)

References

See Also

Examples

Related to georob in georob...

R Package Documentation

Browse R Packages

We want your feedback!

georob
Robust Geostatistical Analysis of Spatial Data

georob: Robust Fitting of Spatial Linear Models
In georob: Robust Geostatistical Analysis of Spatial Data