Spatial simultaneous autoregressive error model estimation
Description
Maximum likelihood estimation of spatial simultaneous autoregressive error models of the form:
y = X beta + u, u = lambda W u + e
where lambda is found by optimize() first, and beta and other parameters by generalized least squares subsequently. With one of the sparse matrix methods, larger numbers of observations can be handled, but the interval= argument may need to be set when the weights are not row-standardised. When etype is “emixed”, a so-called spatial Durbin error model is fitted, while lmSLX fits an lm model augmented with the spatially lagged RHS variables, including the lagged intercept when the spatial weights are not row-standardised. create_WX creates spatially lagged RHS variables, and is exposed for use in model fitting functions.
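The two-step scheme described above (a line search for lambda, then GLS for beta) can be sketched in base R. This is a minimal illustration on simulated data, not the package's implementation: the chain-shaped neighbour structure, the helper name sem_loglik and all simulated values are assumptions made for the example.

```r
set.seed(1)
n <- 100

# Hypothetical neighbours: units on a line, each linked to the next one
# (binary, symmetric), then row-standardised.
B <- matrix(0, n, n)
B[cbind(1:(n - 1), 2:n)] <- 1
B <- B + t(B)
W <- B / rowSums(B)

# Eigenvalues of W, taken from the similar symmetric matrix D^(-1/2) B D^(-1/2)
d <- 1 / sqrt(rowSums(B))
ev <- eigen(B * outer(d, d), symmetric = TRUE, only.values = TRUE)$values

# Simulate y = X beta + u, u = lambda W u + e
x <- rnorm(n)
X <- cbind(1, x)
lambda_true <- 0.6
u <- solve(diag(n) - lambda_true * W, rnorm(n))
y <- drop(X %*% c(1, 2) + u)

# Concentrated log likelihood: beta and s2 are profiled out for a given lambda
sem_loglik <- function(lambda) {
  A <- diag(n) - lambda * W
  fit <- lm.fit(A %*% X, drop(A %*% y))
  s2 <- sum(fit$residuals^2) / n
  sum(log(1 - lambda * ev)) - n / 2 * log(2 * pi * s2) - n / 2
}

# Feasible search interval from the extreme eigenvalues; it is close to
# (-1, 1) here only because W is row-standardised
interval <- 1 / range(ev)
opt <- optimize(sem_loglik, interval = interval, maximum = TRUE)
lambda_hat <- opt$maximum

# GLS step: least squares on the spatially filtered variables
A <- diag(n) - lambda_hat * W
beta_hat <- lm.fit(A %*% X, drop(A %*% y))$coefficients
```

With weights that are not row-standardised the extreme eigenvalues differ, so the bounds computed as 1 / range(ev) move accordingly, which is one reason the interval= argument may need attention.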
Usage
errorsarlm(formula, data=list(), listw, na.action, weights=NULL,
  etype="error", method="eigen", quiet=NULL, zero.policy=NULL,
  interval = NULL, tol.solve=1.0e-10, trs=NULL, control=list())
lmSLX(formula, data = list(), listw, na.action, weights=NULL, zero.policy=NULL)
create_WX(x, listw, zero.policy=NULL, prefix="")

Arguments
formula 
a symbolic description of the model to be fit. The details
of model specification are given for 
data 
an optional data frame containing the variables in the model. By default the variables are taken from the environment from which the function is called.
listw 
a 
na.action 
a function (default 
weights 
an optional vector of weights to be used in the fitting process. Non-NULL weights can be used to indicate that different observations have different variances (with the values in weights being inversely proportional to the variances); or equivalently, when the elements of weights are positive integers w_i, that each response y_i is the mean of w_i unit-weight observations (including the case that there are w_i observations equal to y_i and the data have been summarized)
etype 
default "error", may be set to "emixed" to include the spatially lagged independent variables added to X; when "emixed", the lagged intercept is dropped for spatial weights style "W", that is row-standardised weights, but otherwise included
method 
"eigen" (default) - the Jacobian is computed as the product
of (1 - rho*eigenvalue) using 
quiet 
default NULL, use !verbose global option value; if FALSE, reports function values during optimization. 
zero.policy 
default NULL, use global option value; if TRUE assign zero to the lagged value of zones without
neighbours, if FALSE assign NA - causing 
interval 
default is NULL, search interval for autoregressive parameter 
tol.solve 
the tolerance for detecting linear dependencies in the columns of matrices to be inverted - passed to 
trs 
default NULL, if given, a vector of powered spatial weights matrix traces output by 
control 
list of extra control arguments - see section below
x 
model matrix to be lagged 
prefix 
default empty string, may be “lag” in some cases 
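What "spatially lagged RHS variables" means for the x, listw and prefix arguments can be sketched in base R. The helper lag_columns below is hypothetical and only mimics the idea of pre-multiplying the model matrix by the weights matrix; the column naming is an assumption, not the naming used by create_WX.

```r
# Hypothetical 4-unit path: 1-2, 2-3, 3-4 (binary, symmetric)
B <- matrix(c(0, 1, 0, 0,
              1, 0, 1, 0,
              0, 1, 0, 1,
              0, 0, 1, 0), 4, 4, byrow = TRUE)
W <- B / rowSums(B)          # row-standardised version of the same weights

X <- cbind("(Intercept)" = 1, INC = c(10, 20, 30, 40))

# Lag every column of the model matrix and label the result
lag_columns <- function(W, X, prefix = "lag") {
  WX <- W %*% X
  colnames(WX) <- paste(prefix, colnames(X), sep = ".")
  WX
}

WX_W <- lag_columns(W, X)    # lagged intercept is constant (all ones)
WX_B <- lag_columns(B, X)    # lagged intercept equals the neighbour counts
```

With row-standardised weights the lagged intercept duplicates the intercept itself, which is consistent with it being dropped for style "W"; with binary weights it varies across units and carries information.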
Details
The asymptotic standard error of lambda is only computed when method="eigen", because the full matrix operations involved would be costly for large n typically associated with the choice of method="spam" or "Matrix". The same applies to the coefficient covariance matrix. Taken as the asymptotic matrix from the literature, it is typically badly scaled, being block-diagonal, and with the elements involving lambda being very small, while other parts of the matrix can be very large (often many orders of magnitude in difference). It often happens that the tol.solve argument needs to be set to a smaller value than the default, or the RHS variables can be centred or reduced in range.
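The scaling problem can be reproduced with base R's solve(), whose tol argument plays the same role as tol.solve. The matrix V below is a made-up stand-in for a badly scaled block-diagonal matrix, not output from errorsarlm.

```r
# Hypothetical badly scaled matrix: entries differ by 13 orders of magnitude
V <- diag(c(1e-7, 1e6))

# With a tolerance of 1e-10 the reciprocal condition number (about 1e-13)
# is judged too small and inversion is refused
attempt <- tryCatch(solve(V, tol = 1e-10),
                    error = function(e) conditionMessage(e))

# A smaller tolerance lets the inversion go ahead
V_inv <- solve(V, tol = 1e-14)
```

Here attempt holds the error message rather than a matrix, while V_inv is the ordinary inverse. Rescaling the offending variables before fitting attacks the same problem from the data side.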
Note that the fitted() function for the output object assumes that the response variable may be reconstructed as the sum of the trend, the signal, and the noise (residuals). Since the values of the response variable are known, their spatial lags are used to calculate signal components (Cressie 1993, p. 564). This differs from other software, including GeoDa, which does not use knowledge of the response variable in making predictions for the fitting data.
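The trend, signal and noise decomposition can be written out directly. The sketch below uses assumed values for beta and lambda on simulated data; it illustrates the arithmetic only and is not extracted from the package.

```r
set.seed(42)
n <- 6

# Hypothetical row-standardised weights for units on a line
B <- matrix(0, n, n)
B[cbind(1:(n - 1), 2:n)] <- 1
B <- B + t(B)
W <- B / rowSums(B)

X <- cbind(1, rnorm(n))
y <- rnorm(n)
beta <- c(0.5, 1.5)   # assumed coefficient values
lambda <- 0.4         # assumed error autoregressive coefficient

trend  <- drop(X %*% beta)                    # X beta
signal <- lambda * drop(W %*% (y - trend))    # lambda W (y - X beta), uses observed y
fitted_values <- trend + signal
noise  <- y - fitted_values                   # residuals

# The noise term coincides with the spatially filtered residuals
filtered <- drop((diag(n) - lambda * W) %*% (y - trend))
```

Because the signal uses the observed response of neighbouring units, fitted_values differs from the trend-only prediction X beta whenever lambda is non-zero.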
Value
A list object of class sarlm
type 
"error" 
lambda 
simultaneous autoregressive error coefficient 
coefficients 
GLS coefficient estimates 
rest.se 
GLS coefficient standard errors (are equal to asymptotic standard errors) 
LL 
log likelihood value at computed optimum 
s2 
GLS residual variance 
SSE 
sum of squared GLS errors 
parameters 
number of parameters estimated 
logLik_lm.model 
Log likelihood of the linear model for lambda=0 
AIC_lm.model 
AIC of the linear model for lambda=0 
coef_lm.model 
coefficients of the linear model for lambda=0 
tarX 
model matrix of the GLS model 
tary 
response of the GLS model 
y 
response of the linear model for lambda=0 
X 
model matrix of the linear model for lambda=0 
method 
the method used to calculate the Jacobian 
call 
the call used to create this object 
residuals 
GLS residuals 
opt 
object returned from numerical optimisation 
fitted.values 
Response variable minus residuals
ase 
TRUE if method="eigen"
se.fit 
Not used yet 
lambda.se 
if ase=TRUE, the asymptotic standard error of lambda 
LMtest 
NULL for this model 
aliased 
if not NULL, details of aliased variables 
LLNullLlm 
Log-likelihood of the null linear model
Hcov 
Spatial DGP covariance matrix for Hausman test if available 
interval 
line search interval 
fdHess 
finite difference Hessian 
optimHess 

insert 
logical; if TRUE, asymptotic values inserted in fdHess where feasible
timings 
processing timings 
f_calls 
number of calls to the log likelihood function during optimization 
hf_calls 
number of calls to the log likelihood function during numerical Hessian computation 
intern_classic 
a data frame of detval matrix row choices used by the SE toolbox classic method 
zero.policy 
zero.policy for this model 
na.action 
(possibly) named vector of excluded or omitted observations if non-default na.action argument used
weights 
weights used in model fitting 
emixedImps 
for “emixed” models, a list of three impact matrixes (impacts and standard errors) for direct, indirect and total impacts; total impacts calculated using gmodels::estimable 
The internal sar.error.* functions return the value of the log likelihood function at lambda.
The lmSLX
function returns an “lm” object with a “mixedImps” list of three impact matrixes (impacts and standard errors) for direct, indirect and total impacts; total impacts calculated using gmodels::estimable.
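For a model with one regressor and its spatial lag, the three kinds of impacts can be sketched with lm() alone. This is an illustration under assumptions (simulated data, row-standardised weights, the direct impact read off the coefficient on x and the indirect impact off the coefficient on its lag); the total impact and its standard error are obtained as a linear combination of coefficients, which is the kind of calculation the text attributes to gmodels::estimable.

```r
set.seed(7)
n <- 60

# Hypothetical row-standardised weights for units on a line
B <- matrix(0, n, n)
B[cbind(1:(n - 1), 2:n)] <- 1
B <- B + t(B)
W <- B / rowSums(B)

x  <- rnorm(n)
Wx <- drop(W %*% x)                 # spatially lagged regressor
y  <- 1 + 2 * x - 1 * Wx + rnorm(n)

fit <- lm(y ~ x + Wx)
b <- coef(fit)
V <- vcov(fit)

direct   <- unname(b["x"])
indirect <- unname(b["Wx"])
total    <- direct + indirect

# Standard error of the total impact via the combination c'b with c = (0, 1, 1)
cvec <- c(0, 1, 1)
total_se <- sqrt(drop(t(cvec) %*% V %*% cvec))
```

The covariance between the two coefficients enters total_se, so it is generally not the square root of the sum of the two squared standard errors.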
Control arguments
tol.opt:
the desired accuracy of the optimization - passed to optimize() (default=square root of double precision machine tolerance, a larger root may be used if needed, see help(boston) for an example)
returnHcov:
default TRUE, return the Vo matrix for a spatial Hausman test
pWOrder:
default 250, if returnHcov=TRUE and the method is not “eigen”, pass this order to powerWeights as the power series maximum limit
fdHess:
default NULL, then set to (method != "eigen") internally; use fdHess to compute an approximate Hessian using finite differences when using sparse matrix methods; used to make a coefficient covariance matrix when the number of observations is large; may be turned off to save resources if need be
optimHess:
default FALSE, use fdHess from nlme, if TRUE, use optim to calculate Hessian at optimum
optimHessMethod:
default “optimHess”, may be “nlm” or one of the optim methods
LAPACK:
default FALSE; logical value passed to qr in the SSE log likelihood function
compiled_sse:
default FALSE; logical value used in the log likelihood function to choose compiled code for computing SSE
Imult:
default 2; used for preparing the Cholesky decompositions for updating in the Jacobian function
super:
if NULL (default), set to FALSE to use a simplicial decomposition for the sparse Cholesky decomposition and method “Matrix_J”, set to as.logical(NA) for method “Matrix”, if TRUE, use a supernodal decomposition
cheb_q:
default 5; highest power of the approximating polynomial for the Chebyshev approximation
MC_p:
default 16; number of random variates
MC_m:
default 30; number of products of random variates matrix and spatial weights matrix
spamPivot:
default “MMD”, alternative “RCM”
in_coef:
default 0.1, coefficient value for initial Cholesky decomposition in “spam_update”
type:
default “MC”, used with method “moments”; alternatives “mult” and “moments”, for use if trs is missing, trW
correct:
default TRUE, used with method “moments” to compute the Smirnov/Anselin correction term
trunc:
default TRUE, used with method “moments” to truncate the Smirnov/Anselin correction term
SE_method:
default “LU”, may be “MC”
nrho:
default 200, as in SE toolbox; the size of the first stage lndet grid; it may be reduced to, for example, 40
interpn:
default 2000, as in SE toolbox; the size of the second stage lndet grid
small_asy:
default TRUE; if the method is not “eigen”, use asymptotic covariances rather than numerical Hessian ones if n <= small
small:
default 1500; threshold number of observations for asymptotic covariances when the method is not “eigen”
SElndet:
default NULL, may be used to pass a pre-computed SE toolbox style matrix of coefficients and their lndet values to the "SE_classic" and "SE_whichMin" methods
LU_order:
default FALSE; used in “LU_prepermutate”, note warnings given for lu method
pre_eig:
default NULL; may be used to pass a pre-computed vector of eigenvalues
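Several of these controls concern approximations to the log-determinant of (I - lambda W) built from traces of powers of the weights matrix. The underlying identity, log det(I - lambda W) = -sum over k of lambda^k tr(W^k) / k, can be checked on a small dense matrix. The helper power_traces is hypothetical and computes the traces by brute force; it is not the algorithm used by trW or by the “moments” method.

```r
n <- 30

# Hypothetical row-standardised weights for units on a line
B <- matrix(0, n, n)
B[cbind(1:(n - 1), 2:n)] <- 1
B <- B + t(B)
W <- B / rowSums(B)

# Traces of W^1, ..., W^m by repeated dense multiplication
power_traces <- function(W, m) {
  tr <- numeric(m)
  Wk <- diag(nrow(W))
  for (k in seq_len(m)) {
    Wk <- Wk %*% W
    tr[k] <- sum(diag(Wk))
  }
  tr
}

trs <- power_traces(W, 30)
lambda <- 0.3
k <- seq_along(trs)

# Truncated power series versus the exact log-determinant
logdet_series <- -sum(lambda^k * trs / k)
logdet_exact <- as.numeric(
  determinant(diag(n) - lambda * W, logarithm = TRUE)$modulus
)
```

The series converges quickly for moderate lambda; closer to the bounds of the feasible interval many more terms are needed, which is the motivation for correction and truncation options of the kind listed above.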
Author(s)
Roger Bivand Roger.Bivand@nhh.no
References
Cliff, A. D., Ord, J. K. 1981 Spatial processes, Pion; Ord, J. K. 1975 Estimation methods for models of spatial interaction, Journal of the American Statistical Association, 70, 120-126; Anselin, L. 1988 Spatial econometrics: methods and models. (Dordrecht: Kluwer); Anselin, L. 1995 SpaceStat, a software program for the analysis of spatial data, version 1.80. Regional Research Institute, West Virginia University, Morgantown, WV; Anselin L, Bera AK (1998) Spatial dependence in linear regression models with an introduction to spatial econometrics. In: Ullah A, Giles DEA (eds) Handbook of applied economic statistics. Marcel Dekker, New York, pp. 237-289; Cressie, N. A. C. 1993 Statistics for spatial data, Wiley, New York; LeSage J and RK Pace (2009) Introduction to Spatial Econometrics. CRC Press, Boca Raton.
Roger Bivand, Gianfranco Piras (2015). Comparing Implementations of Estimation Methods for Spatial Econometrics. Journal of Statistical Software, 63(18), 1-36. http://www.jstatsoft.org/v63/i18/.
Bivand, R. S., Hauke, J., and Kossowski, T. (2013). Computing the Jacobian in Gaussian spatial autoregressive models: An illustrated comparison of available methods. Geographical Analysis, 45(2), 150-179.
See Also
lm, lagsarlm, similar.listw, summary.sarlm, predict.sarlm, residuals.sarlm, do_ldet, estimable
Examples
data(oldcol)
lw <- nb2listw(COL.nb, style="W")
# Spatial error model, row-standardised weights, eigenvalue Jacobian
COL.errW.eig <- errorsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
  lw, method="eigen", quiet=FALSE)
summary(COL.errW.eig, correlation=TRUE)
# Same model with pre-computed eigenvalues passed through control
ev <- eigenw(similar.listw(lw))
COL.errW.eig_ev <- errorsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
  lw, method="eigen", control=list(pre_eig=ev))
all.equal(coefficients(COL.errW.eig), coefficients(COL.errW.eig_ev))
# Binary weights
COL.errB.eig <- errorsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
  nb2listw(COL.nb, style="B"), method="eigen", quiet=FALSE)
summary(COL.errB.eig, correlation=TRUE)
# Sparse matrix method with pre-computed traces
W <- as(nb2listw(COL.nb), "CsparseMatrix")
trMatc <- trW(W, type="mult")
COL.errW.M <- errorsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
  lw, method="Matrix", quiet=FALSE, trs=trMatc)
summary(COL.errW.M, correlation=TRUE)
# Spatial Durbin error model
COL.SDEM.eig <- errorsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
  lw, method="eigen", etype="emixed")
summary(COL.SDEM.eig, correlation=TRUE)
summary(impacts(COL.SDEM.eig))
summary(impacts(COL.SDEM.eig), adjust_k=TRUE)
# SLX models
COL.SLX <- lmSLX(CRIME ~ INC + HOVAL, data=COL.OLD, listw=lw)
summary(COL.SLX)
summary(impacts(COL.SLX))
COL.SLX <- lmSLX(CRIME ~ INC + HOVAL + I(HOVAL^2), data=COL.OLD, listw=lw)
summary(COL.SLX)
COL.SLX <- lmSLX(CRIME ~ INC, data=COL.OLD, listw=lw)
# Choose the distance-decay power of inverse-distance weights by maximising
# the SLX log likelihood
crds <- cbind(COL.OLD$X, COL.OLD$Y)
mdist <- sqrt(sum(diff(apply(crds, 2, range))^2))
dnb <- dnearneigh(crds, 0, mdist)
dists <- nbdists(dnb, crds)
f <- function(x, form, data, dnb, dists, verbose) {
  glst <- lapply(dists, function(d) 1/(d^x))
  lw <- nb2listw(dnb, glist=glst, style="B")
  res <- logLik(lmSLX(form=form, data=data, listw=lw))
  if (verbose) cat("power:", x, "logLik:", res, "\n")
  res
}
opt <- optimize(f, interval=c(0.1, 4), form=CRIME ~ INC + HOVAL,
  data=COL.OLD, dnb=dnb, dists=dists, verbose=TRUE, maximum=TRUE)
glst <- lapply(dists, function(d) 1/(d^opt$maximum))
lw <- nb2listw(dnb, glist=glst, style="B")
SLX <- lmSLX(CRIME ~ INC + HOVAL, data=COL.OLD, listw=lw)
summary(SLX)
summary(impacts(SLX))
# Missing values in the response handled through na.action
NA.COL.OLD <- COL.OLD
NA.COL.OLD$CRIME[20:25] <- NA
COL.err.NA <- errorsarlm(CRIME ~ INC + HOVAL, data=NA.COL.OLD,
  nb2listw(COL.nb), na.action=na.exclude)
COL.err.NA$na.action
COL.err.NA
resid(COL.err.NA)
# Timings and coefficient agreement across methods and control settings
lw <- nb2listw(COL.nb, style="W")
system.time(COL.errW.eig <- errorsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
  lw, method="eigen"))
ocoef <- coefficients(COL.errW.eig)
system.time(COL.errW.eig <- errorsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
  lw, method="eigen", control=list(LAPACK=FALSE)))
all.equal(ocoef, coefficients(COL.errW.eig))
system.time(COL.errW.eig <- errorsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
  lw, method="eigen", control=list(compiled_sse=TRUE)))
all.equal(ocoef, coefficients(COL.errW.eig))
system.time(COL.errW.eig <- errorsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
  lw, method="Matrix_J", control=list(super=TRUE)))
all.equal(ocoef, coefficients(COL.errW.eig))
system.time(COL.errW.eig <- errorsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
  lw, method="Matrix_J", control=list(super=FALSE)))
all.equal(ocoef, coefficients(COL.errW.eig))
system.time(COL.errW.eig <- errorsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
  lw, method="Matrix_J", control=list(super=as.logical(NA))))
all.equal(ocoef, coefficients(COL.errW.eig))
system.time(COL.errW.eig <- errorsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
  lw, method="Matrix", control=list(super=TRUE)))
all.equal(ocoef, coefficients(COL.errW.eig))
system.time(COL.errW.eig <- errorsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
  lw, method="Matrix", control=list(super=FALSE)))
all.equal(ocoef, coefficients(COL.errW.eig))
system.time(COL.errW.eig <- errorsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
  lw, method="Matrix", control=list(super=as.logical(NA))))
all.equal(ocoef, coefficients(COL.errW.eig))
system.time(COL.errW.eig <- errorsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
  lw, method="spam", control=list(spamPivot="MMD")))
all.equal(ocoef, coefficients(COL.errW.eig))
system.time(COL.errW.eig <- errorsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
  lw, method="spam", control=list(spamPivot="RCM")))
all.equal(ocoef, coefficients(COL.errW.eig))
system.time(COL.errW.eig <- errorsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
  lw, method="spam_update", control=list(spamPivot="MMD")))
all.equal(ocoef, coefficients(COL.errW.eig))
system.time(COL.errW.eig <- errorsarlm(CRIME ~ INC + HOVAL, data=COL.OLD,
  lw, method="spam_update", control=list(spamPivot="RCM")))
all.equal(ocoef, coefficients(COL.errW.eig))
