SRE: Construct SRE object, fit and predict

Description Usage Arguments Details See Also Examples

View source: R/SREutils.R

Description

The Spatial Random Effects (SRE) model is the central object in FRK. The function FRK provides a wrapper for the construction and estimation of the SRE object from data, using the functions SRE (the object constructor) and SRE.fit (for fitting it to the data). Please see SRE-class for more details on the SRE object's properties and methods.

Usage

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
FRK(f, data, basis = NULL, BAUs = NULL, est_error = TRUE,
  average_in_BAU = TRUE, fs_model = "ind", vgm_model = NULL,
  K_type = "block-exponential", n_EM = 100, tol = 0.01, method = "EM",
  lambda = 0, print_lik = FALSE, ...)

SRE(f, data, basis, BAUs, est_error = TRUE, average_in_BAU = TRUE,
  fs_model = "ind", vgm_model = NULL, K_type = "block-exponential",
  normalise_basis = TRUE)

SRE.fit(SRE_model, n_EM = 100L, tol = 0.01, method = "EM", lambda = 0,
  print_lik = FALSE)

SRE.predict(SRE_model, obs_fs = FALSE, newdata = NULL, pred_polys = NULL,
  pred_time = NULL, covariances = FALSE)

## S4 method for signature 'SRE'
predict(object, newdata = NULL, obs_fs = FALSE,
  pred_polys = NULL, pred_time = NULL, covariances = FALSE)

loglik(SRE_model)

Arguments

f

R formula relating the dependent variable (or transformations thereof) to covariates

data

list of objects of class SpatialPointsDataFrame, SpatialPolygonsDataFrame, STIDF, or STFDF. If using space-time objects, the data frame must have another field, t, containing the time index of the data point

basis

object of class Basis (or TensorP_Basis)

BAUs

object of class SpatialPolygonsDataFrame, SpatialPixelsDataFrame, STIDF, or STFDF. The object's data frame must contain covariate information as well as a field fs describing the fine-scale variation up to a constant of proportionality. If the function FRK is used directly, then BAUs are created automatically, but only coordinates can then be used as covariates

est_error

flag indicating whether the measurement-error variance should be estimated from variogram techniques. If this is set to 0, then data must contain a field std. Measurement-error estimation is currently not implemented for spatio-temporal datasets

average_in_BAU

if TRUE, then multiple data points falling in the same BAU are averaged; the measurement error of the averaged data point is taken as the average of the individual measurement errors

fs_model

if "ind" then the fine-scale variation is independent at the BAU level. If "ICAR", then an ICAR model for the fine-scale variation is placed on the BAUs

vgm_model

an object of class variogramModel from the package gstat constructed using the function vgm. This object contains the variogram model that will be fit to the data. The nugget is taken as the measurement error when est_error = TRUE. If unspecified, the variogram used is gstat::vgm(1, "Lin", d, 1), where d is approximately one third of the maximum distance between any two data points

K_type

the parameterisation used for the K matrix. Currently this can be "unstructured" or "block-exponential" (default)

n_EM

maximum number of iterations for the EM algorithm

tol

convergence tolerance for the EM algorithm

method

parameter estimation method to employ. Currently only “EM” is supported

lambda

ridge-regression regularisation parameter for when K is unstructured (0 by default). Can be a single number, or a vector (one parameter for each resolution)

print_lik

flag indicating whether likelihood value should be printed or not after convergence of the EM estimation algorithm

...

other parameters passed on to auto_basis and auto_BAUs when calling the function FRK

normalise_basis

flag indicating whether to normalise the basis functions so that they reproduce a stochastic process with approximately constant variance spatially

SRE_model

object returned from the constructor SRE() containing all the parameters and information on the SRE model

obs_fs

flag indicating whether the fine-scale variation sits in the observation model (systematic error, Case 1) or in the process model (fine-scale process variation, Case 2, default)

newdata

object of class SpatialPoylgons indicating the regions over which prediction will be carried out. The BAUs are used if this option is not specified

pred_polys

deprecated. Please use newdata instead

pred_time

vector of time indices at which prediction will be carried out. All time points are used if this option is not specified

covariances

logical variable indicating whether prediction covariances should be returned or not. If set to TRUE, a maximum of 4000 prediction locations or polygons are allowed.

object

object of class SRE

Details

SRE() is the main function in the package: It constructs a spatial random effects model from the user-defined formula, data object, basis functions and a set of Basic Areal Units (BAUs). The function first takes each object in the list data and maps it to the BAUs – this entails binning the point-referenced data into the BAUs (and averaging within the BAU) if average_in_BAU = TRUE, and finding which BAUs are influenced by the polygon datasets. Following this, the incidence matrix Cmat is constructed, which appears in the observation model Z = CY + Cδ + e, where C is the incidence matrix and δ is systematic error at the BAU level.

The SRE model for the hidden process is given by Y = Tα + Sη + ξ, where T are the covariates at the BAU level, α are the regression coefficients, S are the basis functions evaluated at the BAU level, η are the basis-function coefficients, and ξ is the fine scale variation (at the BAU level). The covariance matrix of ξ is diagonal, with its diagonal elements proportional to the field ‘fs’ in the BAUs (typically set to one). The constant of proportionality is estimated in the EM algorithm. All required matrices (S,T etc.) are initialised using sensible defaults and returned as part of the object, please see SRE-class for more details.

SRE.fit() takes an object of class SRE and estimates all unknown parameters, namely the covariance matrix K, the fine scale variance (σ^2_{ξ} or σ^2_{δ}, depending on whether Case 1 or Case 2 is chosen; see the vignette) and the regression parameters α. The only method currently implemented is the Expectation Maximisation (EM) algorithm, which the user configures through n_EM and tol. The log-likelihood (given in Section 2.2 of the vignette) is evaluated at each iteration at the current parameter estimate, and convergence is assumed to have been reached when this quantity stops changing by more than tol.

The actual computations for the E-step and M-step are relatively straightforward. The E-step contains an inverse of an r \times r matrix, where r is the number of basis functions which should not exceed 2000. The M-step first updates the matrix K, which only depends on the sufficient statistics of the basis-function coefficients η. Then, the regression parameter α is updated and a simple optimisation routine (a line search) is used to update the fine-scale variance σ^2_{δ} or σ^2_{ξ}. If the fine-scale errors and measurement random errors are homoscedastic, then a closed-form solution is available for the update of σ^2_{ξ} or σ^2_{δ}. Irrespectively, since the udpates of α, and σ^2_{δ} or σ^2_{ξ}, are dependent, these two updates are iterated until the change in σ^2_{\cdot} is no more than 0.1%. Information on the fitting (convergence etc.) can be extracted using info_fit(SRE_model).

The function FRK acts as a wrapper for the functions SRE and SRE.fit. An added advantage of using FRK directly is that it automatically generates BAUs and basis functions based on the data. Hence FRK can be called using only a list of data objects and an R formula, although the R formula can only contain space or time as covariates when BAUs are not explicitly supplied with the covariate data.

Once the parameters are fitted, the SRE object is passed onto the function predict() in order to carry out optimal predictions over the same BAUs used to construct the SRE model with SRE(). The first part of the prediction process is to construct the matrix S over the prediction polygons. This is made computationally efficient by treating the prediction over polygons as that of the prediction over a combination of BAUs. This will yield valid results only if the BAUs are relatively small. Once the matrix S is found, a standard Gaussian inversion (through conditioning) using the estimated parameters is used for prediction.

predict returns the BAUs, which are of class SpatialPolygonsDataFrame, SpatialPixelsDataFrame, or STFDF, with two added attributes, mu and var. These can then be easily plotted using spplot or ggplot2 (possibly in conjunction with SpatialPolygonsDataFrame_to_df) as shown in the package vignettes.

See Also

SRE-class for details on the SRE object internals, auto_basis for automatically constructing basis functions, and auto_BAUs for automatically constructing BAUs. See also the paper https://arxiv.org/abs/1705.08105 for details on code operation.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
library(sp)

### Generate process and data
n <- 100
sim_process <- data.frame(x = seq(0.005,0.995,length=n))
sim_process$y <- 0
sim_process$proc <- sin(sim_process$x*10) + 0.3*rnorm(n)

sim_data <- sim_process[sample(1:n,50),]
sim_data$z <- sim_data$proc + 0.1*rnorm(50)
sim_data$std <- 0.1
coordinates(sim_data) = ~x + y # change into an sp object
grid_BAUs <- auto_BAUs(manifold=real_line(),data=sim_data,
                       nonconvex_hull=FALSE,cellsize = c(0.01),type="grid")
grid_BAUs$fs = 1

### Set up SRE model
G <- auto_basis(manifold = real_line(),
                data=sim_data,
                nres = 2,
                regular = 6,
                type = "bisquare",
                subsamp = 20000)
f <- z ~ 1
S <- SRE(f,list(sim_data),G,
         grid_BAUs,
         est_error = FALSE)

### Fit with 5 EM iterations so as not to take too much time
S <- SRE.fit(S,n_EM = 5,tol = 0.01,print_lik=TRUE)

### Check fit info


### Predict over BAUs
grid_BAUs <- predict(S)

### Plot
## Not run: 
library(ggplot2)
X <- slot(grid_BAUs,"data")
X <- subset(X, x >= 0 & x <= 1)
 g1 <- LinePlotTheme() +
    geom_line(data=X,aes(x,y=mu)) +
    geom_errorbar(data=X,aes(x=x,ymax = mu + 2*sqrt(var), ymin= mu - 2*sqrt(var))) +
    geom_point(data = data.frame(sim_data),aes(x=x,y=z),size=3) +
    geom_line(data=sim_process,aes(x=x,y=proc),col="red")
 print(g1)
## End(Not run)

Example output

Normalising basis function evaluations at BAU level ...
Binning data ...
Binned data in 0.029 seconds

  |                                                                            
  |                                                                      |   0%
  |                                                                            
  |==============                                                        |  20%
  |                                                                            
  |============================                                          |  40%
  |                                                                            
  |==========================================                            |  60%
  |                                                                            
  |========================================================              |  80%
  |                                                                            
  |======================================================================| 100%
Maximum EM iterations reached

FRK documentation built on May 2, 2019, 8:11 a.m.