Likelihood Based Parameter Estimation for Gaussian Random Fields using Rcpp package.

Share:

Description

Maximum likelihood (ML) or restricted maximum likelihood (REML) parameter estimation for (transformed) Gaussian random fields.

Usage

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
likfit(geodata, coords = geodata$coords, data = geodata$data,
       trend = "cte", ini.cov.pars, fix.nugget = FALSE, nugget = 0,
       fix.kappa = TRUE, kappa = 0.5, fix.lambda = TRUE, lambda = 1,
       fix.psiA = TRUE, psiA = 0, fix.psiR = TRUE, psiR = 1, 
       cov.model, realisations, lik.method = "ML", components = TRUE,
       nospatial = TRUE, limits = pars.limits(),
       print.pars = FALSE, messages, ...)

## S3 method for class 'likGRF'
fitted(object, spatial = TRUE, ...)

## S3 method for class 'likGRF'
resid(object, spatial = FALSE, ...)

Arguments

geodata

a list containing elements coords and data as described next. Typically an object of the class "geodata". If not provided the arguments coords and data must be provided instead.

coords

an n x 2 matrix where each row has the 2-D coordinates of the n data locations. By default it takes the component coords of the argument geodata, if provided.

data

a vector with n data values. By default it takes the component data of the argument geodata, if provided.

trend

specifies the mean part of the model. See documentation of trend.spatial for further details. Defaults to "cte".

ini.cov.pars

initial values for the covariance parameters: sigma^2 (partial sill) and phi (range parameter). Typically a vector with two components. However a matrix can be used to provide several initial values. See DETAILS below.

fix.nugget

logical, indicating whether the parameter tau^2 (nugget variance) should be regarded as fixed (fix.nugget = TRUE) or should be estimated (fix.nugget = FALSE). Defaults to FALSE.

nugget

value of the nugget parameter. Regarded as a fixed value if fix.nugget = TRUE otherwise as the initial value for the minimisation algorithm. Defaults to zero.

fix.kappa

logical, indicating whether the extra parameter kappa should be regarded as fixed (fix.kappa = TRUE) or should be estimated (fix.kappa = FALSE). Defaults to TRUE.

kappa

value of the extra parameter kappa. Regarded as a fixed value if fix.kappa = TRUE otherwise as the initial value for the minimisation algorithm. Defaults to 0.5. This parameter is valid only if the covariance function is one of: "matern", "powered.exponential", "cauchy" or "gneiting.matern". For more details on covariance functions see documentation for cov.spatial.

fix.lambda

logical, indicating whether the Box-Cox transformation parameter lambda should be regarded as fixed (fix.lambda = TRUE) or should be be estimated (fix.lambda = FALSE). Defaults to TRUE.

lambda

value of the Box-Cox transformation parameter lambda. Regarded as a fixed value if fix.lambda = TRUE otherwise as the initial value for the minimisation algorithm. Defaults to 1. Two particular cases are lambda = 1 indicating no transformation and lambda = 0 indicating log-transformation.

fix.psiA

logical, indicating whether the anisotropy angle parameter psi_R should be regarded as fixed (fix.psiA = TRUE) or should be estimated (fix.psiA = FALSE). Defaults to TRUE.

psiA

value (in radians) for the anisotropy angle parameter psi_A. Regarded as a fixed value if fix.psiA = TRUE otherwise as the initial value for the minimisation algorithm. Defaults to 0. See coords.aniso for further details on anisotropy correction.

fix.psiR

logical, indicating whether the anisotropy ratio parameter psi_R should be regarded as fixed (fix.psiR = TRUE) or should be estimated (fix.psiR = FALSE). Defaults to TRUE.

psiR

value, always greater than 1, for the anisotropy ratio parameter psi_R. Regarded as a fixed value if fix.psiR = TRUE otherwise as the initial value for the minimisation algorithm. Defaults to 1. See coords.aniso for further details on anisotropy correction.

cov.model

a string specifying the model for the correlation function. For further details see documentation for cov.spatial. Reads values from an variomodel object passed to ini.cov.pars if any, otherwise defaults to the exponential model.

realisations

optional. Logical or a vector indicating the number of replication for each datum. For further information see DETAILS below and documentation for as.geodata.

lik.method

(formely method.lik) options are "ML" for maximum likelihood and "REML" for restricted maximum likelihood. Defaults to "ML".

components

an n x 3 data-frame with fitted values for the three model components: trend, spatial and residuals. See the section DETAILS below for the model specification.

nospatial

logical. If TRUE parameter estimates for the model without spatial component are included in the output.

limits

values defining lower and upper limits for the model parameters used in the numerical minimisation. The auxiliary function pars.limits is called to set the limits. See also Limits in DETAILS below.

print.pars

logical. If TRUE the parameters and the value of the negative log-likelihood (up to a constant) are printed each time the function to be minimised is called.

messages

logical. Indicates whether status messages should be printed on the screen (or output device) while the function is running.

...

additional parameters to be passed to the minimisation function. Typically arguments of the type control() which controls the behavior of the minimisation algorithm. For further details see documentation for the minimisation function optim.

object

an object with output of the function likfit.

spatial

logical, determines whether the spatial component of the model in included in the output. The geostatistical model components are: trend, spatial and residulas. See DETAILS.

Details

This function estimate the parameters of the Gaussian random field model, specified as:

Y(x) = mu(x) + S(x) + e

where

  • x defines a spatial location. Typically Euclidean coordinates on a plane.

  • Y is the variable been observed.

  • mu(x) = X %*% beta is the mean component of the model (trend).

  • S(x) is a stationary Gaussian process with variance sigma^2 (partial sill) and a correlation function parametrized in its simplest form by phi (the range parameter). Possible extra parameters for the correlation function are the smoothness parameter kappa and the anisotropy parameters phi_R and phi_A (anisotropy ratio and angle, respectively).

  • e is the error term with variance parameter tau^2 (nugget variance).

The additional parameter lambda allows for the Box-Cox transformation of the response variable. If used (i.e. if λ \neq 1) Y(x) above is replaced by g(Y(x)) such that

g(Y(x)) = ((Y^lambda(x)) - 1)/lambda .

Two particular cases are lambda = 1 which indicates no transformation and lambda = 0 indicating the log-transformation.

Numerical minimization

In general parameter estimation is performed numerically using the R function optim to minimise the negative log-likelihood computed by the function negloglik.GRF. If the nugget, anisotropy (psiA, psiR), smoothness (kappa) and transformation (lambda) parameters are held fixed then the numerical minimisation can be reduced to one-dimension and the function optimize is used instead of optim. In this case initial values are irrelevant.

Limits

Lower and upper limits for parameter values can be individually specified using the function link{pars.limits}. For example, including the following in the function call:
limits = pars.limits(phi=c(0, 10), lambda=c(-2.5, 2.5)),
will change the limits for the parameters phi and lambda. Default values are used if the argument limits is not provided.

There are internal reparametrisation depending on the options for parameters to be estimated. For instance for the common situation when fix.nugget=FALSE the minimisation is performed in a reduced parameter space using tau^2_{rel} = tau^2/sigma^2. In this case values of sigma^2 and beta are then given by analytical expressions which are function of the two parameters remaining parameters and limits for these two parameters will be ignored.

Since parameter values are found by numerical optimization using the function optim, in given circunstances the algorithm may not converge to correct parameter values when called with default options and the user may need to pass extra options for the optimizer. For instance the function optim takes a control argument. The user should try different initial values and if the parameters have different orders of magnitude may need to use options to scale the parameters. Some possible workarounds in case of problems include:

  • rescale you data values (dividing by a constant, say)

  • rescale your coordinates (subtracting values and/or dividing by constants)

  • Use the mechanism to pass control() options for the optimiser internally

Transformation If the fix.lambda = FALSE and nospatial = FALSE the Box-Cox parameter for the model without the spatial component is obtained numerically, with log-likelihood computed by the function boxcox.ns.

Multiple initial values can be specified providing a n x 2 matrix for the argument ini.cov.pars and/or providing a vector for the values of the remaining model parameters. In this case the log-likelihood is computed for all combinations of the model parameters. The parameter set which maximises the value of the log-likelihood is then used to start the minimisation algorithm.

Alternatively the argument ini.cov.pars can take an object of the class eyefit or variomodel. This allows the usage of an output of the functions eyefit, variofit or likfit be used as initial value.

The argument realisations allows sets of data assumed to be independent replications of the same process. Data on different realisations may or may not be co-located. For instance, data collected at different times can be pooled together in the parameter estimation assuming time independence. The argument realisations takes a vector indicating the replication number (e.g. the times). If realisations = TRUE the code looks for an element named realisations in the geodata object. The log-likelihoods are computed for each replication and added together.

Value

An object of the classes "likGRF" and "variomodel".
The function summary.likGRF is used to print a summary of the fitted model.
The object is a list with the following components:

cov.model

a string with the name of the correlation function.

nugget

value of the nugget parameter tau^2. This is an estimate if fix.nugget = FALSE otherwise, a fixed value.

cov.pars

a vector with the estimates of the parameters sigma^2 and phi, respectively.

kappa

value of the smoothness parameter. Valid only if the correlation function is one of: "matern", "powered.exponential", "cauchy" or "gneiting.matern".

beta

estimate of mean parameter beta. This can be a scalar or vector depending on the trend (covariates) specified in the model.

beta.var

estimated variance (or covariance matrix) for the mean parameter beta.

lambda

values of the Box-Cox transformation parameter. A fixed value if fix.lambda = TRUE otherwise the estimate value.

aniso.pars

fixed values or estimates of the anisotropy parameters, according to the function call.

method.lik

estimation method used, "ML" (maximum likelihood) or "REML" (restricted maximum likelihood).

loglik

the value of the maximized likelihood.

npars

number of estimated parameters.

AIC

value of the Akaike Information Criteria, AIC=-2 ln(L) + 2 p where L is the maximised likelihood and p is the number of parameters in the model.

BIC

value of the Bayesian Information Criteria, BIC=-2ln(L) + p log(n), where n is the number of data, L,p as for AIC above.

parameters.summary

a data-frame with all model parameters, their status (estimated or fixed) and values.

info.minimisation

results returned by the minimisation function.

max.dist

maximum distance between 2 data points. This information relevant for other functions which use outputs from likfit.

trend

the trend (covariates) matrix X.

log.jacobian

numerical value of the logarithm of the Jacobian of the transformation.

nospatial

estimates for the model without the spatial component.

call

the function call.

Author(s)

Please contact Jason Lessels jlessels@gmail.com about this package. All code was originally written by;
Paulo Justiniano Ribeiro Jr. paulojus@leg.ufpr.br,
Peter J. Diggle p.diggle@lancaster.ac.uk.

References

Further information on the package geoR can be found at:
http://www.leg.ufpr.br/geoR.

See Also

summary.likGRF for summary of the results, plot.variogram, lines.variogram and lines.variomodel for graphical output, proflik for computing profile likelihoods, variofit and for other estimation methods, and optim for the numerical minimisation function.

Examples

1
2
3
4
5
6
7
8
system.time(reml.1 <- likfit(s100, ini=c(0.5, 0.5), fix.nug = TRUE, lik.met = "REML"))
system.time(reml.2 <- likfitRcpp(s100, ini=c(0.5, 0.5), fix.nug = TRUE, lik.met = "REML"))

summary(reml.1)
summary(reml.2)
plot(variog(s100))
lines(reml.1)
lines(reml.2, lty = 2, col="red")