spsur3sls: Three Stages Least Squares estimation,3sls, of spatial SUR...

spsur3sls    R Documentation

Three Stages Least Squares (3SLS) estimation of spatial SUR models.

Description

The function estimates spatial SUR models using three stages least squares, where the instruments are obtained from the spatial lags of the X variables, which are assumed to be exogenous. The number of equations, time periods and spatial units is not restricted. The user can choose between a Spatial Durbin Model and a Spatial Lag Model, as described below. The estimation procedure allows for the introduction of linear restrictions on the β parameters associated with the regressors.

Usage

spsur3sls (formula = NULL, data = NULL, na.action,
                  R = NULL, b = NULL, listw = NULL, 
                  zero.policy = NULL, X= NULL, Y = NULL, G = NULL, 
                  N = NULL, Tm = NULL, p = NULL,  
                  type = "slm", Durbin = NULL, maxlagW = NULL,
                  trace = TRUE)

Arguments

formula

An object of class Formula, similar to objects created with the package Formula, describing the equations to be estimated in the model. This model may contain several responses (explained variables) and a varying number of regressors in each equation.

data

An object of class data.frame or a matrix.

na.action

A function (default options("na.action")); it can also be na.omit or na.exclude, with consequences for residuals and fitted values. It may be necessary to set zero.policy to TRUE because this subsetting may create no-neighbour observations.

R

A row vector of order (1xpr) with the set of r linear constraints on the beta parameters. The first restriction appears in the first p terms, the second restriction in the next p terms and so on. Default = NULL.

b

A column vector of order (rx1) with the values of the linear restrictions on the beta parameters. Default = NULL.
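
As an illustration of how R and b fit together, the sketch below builds a single restriction (r = 1) for a two-equation model with four coefficients per equation (intercept plus three regressors), as in the formula used in the Examples. It assumes that the coefficients are stacked equation by equation, and it imposes equality of the fourth coefficient across equations. The names R1, b1 and beta_demo are hypothetical, and beta_demo is an invented coefficient vector used only to check the algebra.

```r
# Two equations, four coefficients each (intercept + 3 regressors)
p_g <- c(4, 4)
p <- sum(p_g)

# One restriction (r = 1): beta_4 (equation 1) - beta_8 (equation 2) = 0
R1 <- matrix(0, nrow = 1, ncol = p)
R1[1, 4] <- 1
R1[1, 8] <- -1
b1 <- matrix(0, nrow = 1, ncol = 1)

# Hypothetical coefficient vector that satisfies the restriction
beta_demo <- c(1.0, -0.5, 0.2, 0.7, 0.9, -0.4, 0.1, 0.7)

# R %*% beta - b is zero when the restriction holds
restriction_gap <- drop(R1 %*% beta_demo - b1)
print(restriction_gap)
```

Objects built this way would be supplied through the R and b arguments of spsur3sls.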

listw

A listw object created, for example, by nb2listw from the spdep package. It can also be a spatial weighting matrix of order (NxN) instead of a listw object. Default = NULL.

zero.policy

Similar to the corresponding parameter of the lagsarlm function in the spatialreg package. If TRUE, assign zero to the lagged value of zones without neighbours; if FALSE, assign NA, causing spsur3sls() to terminate with an error. Default = NULL.

X

A data matrix of order (NTmGxp) with the observations of the regressors. The number of covariates in the SUR model is p = sum(p_{g}), where p_{g} is the number of regressors (including the intercept) in the g-th equation (g = 1,...,G). The specification of X is only necessary if a Formula and a data frame are not available. Default = NULL.

Y

A column vector of order (NTmGx1) with the observations of the explained variables. The ordering of the data must be (first) equation, (second) time dimension and (third) cross-sectional/spatial units. The specification of Y is only necessary if a Formula and a data frame are not available. Default = NULL.

G

Number of equations.

N

Number of cross-section or spatial units.

Tm

Number of time periods.

p

Number of regressors by equation, including the intercept. p can be a row vector of order (1xG), if the number of regressors is not the same for all the equations, or a scalar, if the G equations have the same number of regressors. The specification of p is only necessary if a Formula and a data frame are not available.

type

Type of spatial model, restricted to cases where lags of the explained variable appear on the right-hand side of the equations. There are two possibilities: "slm" or "sdm". Default = "slm".

Durbin

A formula object used when the model is of type "sdm": it gives the subset of explanatory variables to lag in each equation.

maxlagW

Maximum spatial lag order of the regressors employed to produce spatial instruments for the spatial lags of the explained variables. Default = 2. Note that when type = "sdm", the default value for maxlagW is set to 3 because the first lag of the regressors, WX_{tg}, cannot be used as a spatial instrument.

trace

A logical value to show intermediate results during the estimation process. Default = TRUE.

Details

spsur3sls can be used to estimate two groups of spatial models:

  • "slm": SUR model with spatial lags of the endogenous variable on the right-hand side of the equations

    y_{tg} = ρ_{g} Wy_{tg} + X_{tg} β_{g} + ε_{tg}

  • "sdm": SUR model of the Spatial Durbin type

    y_{tg} = ρ_{g} Wy_{tg} + X_{tg} β_{g} + WX_{tg} θ_{g} + ε_{tg}

where y_{tg} and ε_{tg} are (Nx1) vectors, corresponding to the g-th equation and time period t; X_{tg} is the matrix of regressors, of order (Nxp_g). Moreover, ρ_{g} is a spatial coefficient and W is a (NxN) spatial weighting matrix.
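
Solving the "slm" equation for y_{tg} gives the reduced form y_{tg} = (I - ρ_{g} W)^{-1} (X_{tg} β_{g} + ε_{tg}), which makes explicit that Wy_{tg} depends on ε_{tg} and is therefore endogenous. The base R sketch below simulates one equation and one time period from this reduced form. The ring-shaped weights matrix, the parameter values and all object names are assumptions chosen for illustration only.

```r
set.seed(123)
N <- 50

# Hypothetical ring-shaped contiguity: each unit has two neighbours
W <- matrix(0, N, N)
for (i in 1:N) {
  W[i, ifelse(i == 1, N, i - 1)] <- 1
  W[i, ifelse(i == N, 1, i + 1)] <- 1
}
W <- W / rowSums(W)  # row-standardise

rho  <- 0.5                  # spatial coefficient (assumed value)
beta <- c(1, 2)              # intercept and slope (assumed values)
X    <- cbind(1, rnorm(N))
eps  <- rnorm(N)

# Reduced form of y = rho * W y + X beta + eps
y <- solve(diag(N) - rho * W, X %*% beta + eps)

# Check that the simulated y satisfies the structural equation
max_gap <- max(abs(y - (rho * W %*% y + X %*% beta + eps)))
print(max_gap)
```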

By default, the input of this function is an object created with Formula and a data frame. However, spsur3sls also allows for the direct specification of vector Y and matrix X, with the explained variables and regressors respectively, as inputs (these terms may be the result, for example, of dgp_spsur).

spsur3sls is a three-stage least squares procedure designed to circumvent the endogeneity problems caused by the presence of spatial lags of the explained variable on the right-hand side of the equations of the SUR. The instruments are produced internally by spsur3sls using a sequence of spatial lags of the X variables, which are assumed to be exogenous. The user must define the number of (spatial) instruments to be used in the procedure through the argument maxlagW (e.g., maxlagW = 3). Then, the collection of instruments generated is [WX_{tg}; W^{2}X_{tg}; W^{3}X_{tg}]. In the case of an SDM, the first lag of the X matrix is already in the equation and cannot be used as an instrument. In the example above, the list of instruments for an SDM would be [W^{2}X_{tg}; W^{3}X_{tg}].
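
The instrument sets described above can be sketched in base R as follows. This is not the internal code of spsur3sls: spatial_instruments is a hypothetical helper, the ring-shaped W and the simulated regressors are assumptions, and the intercept is left out of X because its spatial lag is again a constant when W is row-standardised.

```r
# Hypothetical helper (not part of spsur): stack W^k X for k = first_lag..maxlag
spatial_instruments <- function(W, X, maxlag, first_lag = 1) {
  stopifnot(first_lag >= 1, maxlag >= first_lag)
  lagged <- X
  blocks <- list()
  for (k in seq_len(maxlag)) {
    lagged <- W %*% lagged               # now equals W^k X
    if (k >= first_lag) blocks[[length(blocks) + 1]] <- lagged
  }
  do.call(cbind, blocks)
}

set.seed(1)
N <- 30
W <- matrix(0, N, N)
for (i in 1:N) {                         # ring-shaped contiguity (assumed)
  W[i, ifelse(i == 1, N, i - 1)] <- 1
  W[i, ifelse(i == N, 1, i + 1)] <- 1
}
W <- W / rowSums(W)
X <- matrix(rnorm(N * 2), N, 2)          # two non-constant regressors

# SLM with three lags: [WX, W^2 X, W^3 X]
H_slm <- spatial_instruments(W, X, maxlag = 3)
# SDM with three lags: WX is a regressor, so only [W^2 X, W^3 X] remain
H_sdm <- spatial_instruments(W, X, maxlag = 3, first_lag = 2)

print(dim(H_slm))
print(dim(H_sdm))
```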

The first stage of the procedure consists of a least squares regression of the Y variables on the set of instruments. From this estimation, the procedure retains the fitted values of Y in the so-called Yls variables. In the second stage, the Y variables that appear on the right-hand side of the equations are replaced by Yls and the SUR model is estimated by least squares. The third stage improves the estimates of the second stage through a Feasible Generalized Least Squares estimation of the parameters of the model, using the residuals of the second stage to estimate the Sigma matrix.
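
The three stages can be sketched in base R for a toy "slm" system with G = 2 equations and a single time period. This is a textbook-style illustration under stated assumptions, not the code used by spsur3sls: the data are simulated, the first stage projects the spatial lag Wy on the exogenous regressors and their first two spatial lags, and the second-stage residuals are computed with the observed Wy. All object names are hypothetical.

```r
set.seed(42)
N <- 200; G <- 2

ring_weights <- function(n) {            # hypothetical row-standardised W
  W <- matrix(0, n, n)
  for (i in 1:n) {
    W[i, ifelse(i == 1, n, i - 1)] <- 1
    W[i, ifelse(i == n, 1, i + 1)] <- 1
  }
  W / rowSums(W)
}
W <- ring_weights(N)

# Simulated data with errors correlated across the two equations
x <- matrix(rnorm(N * G), N, G)
Sigma_true <- matrix(c(1, 0.5, 0.5, 1), 2, 2)
eps  <- matrix(rnorm(N * G), N, G) %*% chol(Sigma_true)
rho  <- c(0.4, 0.6)
beta <- rbind(c(1, 2), c(-1, 1))         # one row per equation
y <- sapply(1:G, function(g)
  drop(solve(diag(N) - rho[g] * W, cbind(1, x[, g]) %*% beta[g, ] + eps[, g])))

Z <- vector("list", G); Zhat <- vector("list", G)
for (g in 1:G) {
  Xg <- cbind(1, x[, g])
  Wy <- W %*% y[, g]
  H  <- cbind(Xg, W %*% x[, g], W %*% W %*% x[, g])  # exogenous + instruments
  # Stage 1: least squares projection of the spatial lag on the instruments
  Wy_hat <- H %*% solve(crossprod(H), crossprod(H, Wy))
  Z[[g]]    <- cbind(Wy, Xg)
  Zhat[[g]] <- cbind(Wy_hat, Xg)
}

# Stage 2: least squares by equation, with Wy replaced by its fitted value
d2 <- lapply(1:G, function(g)
  solve(crossprod(Zhat[[g]]), crossprod(Zhat[[g]], y[, g])))
res2 <- sapply(1:G, function(g) drop(y[, g] - Z[[g]] %*% d2[[g]]))
Sigma_hat <- crossprod(res2) / N

# Stage 3: FGLS on the stacked system, weighting by Sigma_hat^{-1} (x) I_N
Zs <- rbind(cbind(Zhat[[1]], matrix(0, N, 3)),
            cbind(matrix(0, N, 3), Zhat[[2]]))
ys <- c(y[, 1], y[, 2])
Oinv <- kronecker(solve(Sigma_hat), diag(N))
d3 <- solve(t(Zs) %*% Oinv %*% Zs, t(Zs) %*% Oinv %*% ys)
rownames(d3) <- c("rho_1", "const_1", "x_1", "rho_2", "const_2", "x_2")
print(round(d3, 3))
```

The stacked design matrix is block diagonal because each equation has its own regressors; the cross-equation link enters only through Sigma_hat in the third stage.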

The arguments R and b allow the user to introduce linear restrictions on the beta coefficients of the G equations. spsur3sls first introduces the linear restrictions in the SUR model and builds, internally, the corresponding constrained SUR model. Then, the function estimates the restricted model, which is shown in the output. The function does not compute the unconstrained model, nor does it test the linear restrictions. The user may obtain the unconstrained estimates through a separate spsurml estimation. Moreover, the function wald_betas obtains the Wald test of a set of linear restrictions for an object created previously by spsurml or spsur3sls.
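
For reference, the textbook form of a Wald statistic for r linear restrictions is written out below. It is given here as background only; the exact computation performed by wald_betas is not detailed on this page. The symbol \widehat{V}(\hat{\beta}) denotes the estimated covariance matrix of the unrestricted estimates of the beta coefficients.

```latex
\[
H_0:\; R\beta = b, \qquad
\mathcal{W} \;=\; (R\hat{\beta} - b)^{\top}
\bigl[\, R\,\widehat{V}(\hat{\beta})\,R^{\top} \bigr]^{-1}
(R\hat{\beta} - b)
\;\overset{a}{\sim}\; \chi^{2}_{r}
\]
```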

Value

Object of spsur class with the output of the three-stages least-squares estimation of the specified spatial model. A list with:

call Matched call.
type Type of model specified.
Durbin Value of Durbin argument.
coefficients Estimated coefficients for the regressors.
deltas Estimated spatial coefficients.
rest.se Estimated standard errors for the estimates of β coefficients.
deltas.se Estimated standard errors for the estimates of the spatial coefficients.
resvar Estimated covariance matrix for the estimates of beta's and spatial coefficients.
R2 Coefficient of determination for each equation, obtained as the squared correlation coefficient between the corresponding explained variable and its fitted values.
R2 pooled Global coefficient of determination obtained for the set of the G equations. It is computed in the same way as the single-equation R2, but joining the dependent variables and fitted values into single vectors instead of using one vector for each equation.
Sigma Estimated covariance matrix for the residuals of the G equations.
residuals Residuals of the model.
df.residuals Degrees of freedom for the residuals.
fitted.values Estimated values for the dependent variables.
G Number of equations.
N Number of cross-sections or spatial units.
Tm Number of time periods.
p Number of regressors by equation (including intercepts).
Y If data is NULL, vector Y of the explained variables of the SUR model.
X If data is NULL, matrix X of the regressors of the SUR model.
W Spatial weighting matrix.
zero.policy Logical value of zero.policy.
listw_style Style of neighborhood matrix W.
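
The two coefficients of determination described in the list above can be reproduced with a few lines of base R. The observed and fitted values below are simulated stand-ins, and the object names are hypothetical; with a fitted model, the corresponding explained variables and fitted.values would be used instead.

```r
set.seed(1)
N <- 100

# Hypothetical observed and fitted values for G = 2 equations
Y    <- cbind(rnorm(N), rnorm(N, mean = 5))
Yhat <- Y + matrix(rnorm(2 * N, sd = 0.5), N, 2)

# Per-equation R2: squared correlation between observed and fitted values
R2_eq <- sapply(1:2, function(g) cor(Y[, g], Yhat[, g])^2)

# Pooled R2: same formula after stacking all equations into single vectors
R2_pooled <- cor(as.vector(Y), as.vector(Yhat))^2

print(R2_eq)
print(R2_pooled)
```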

Author(s)

Fernando Lopez fernando.lopez@upct.es
Roman Minguez roman.minguez@uclm.es
Jesus Mur jmur@unizar.es

References

  • Anselin, L. (2016) Estimation and Testing in the Spatial Seemingly Unrelated Regression (SUR). Geoda Center for Geospatial Analysis and Computation, Arizona State University. Working Paper 2016-01. <doi:10.13140/RG.2.2.15925.40163>

  • Anselin, L. (1988). Spatial Econometrics: Methods and Models. Kluwer Academic Publishers, Dordrecht, The Netherlands (p. 146).

  • Anselin, L., Le Gallo, J., Jayet, H. (2008). Spatial Panel Econometrics. In The Econometrics of Panel Data: Fundamentals and Recent Developments in Theory and Practice (Chap. 19, p. 653).

  • Minguez, R., Lopez, F.A. and Mur, J. (2022). spsur: An R Package for Dealing with Spatial Seemingly Unrelated Regression Models. Journal of Statistical Software, 104(11), 1–43. <doi:10.18637/jss.v104.i11>

  • Lopez, F. A., Minguez, R., Mur, J. (2020). ML versus IV estimates of spatial SUR models: evidence from the case of Airbnb in Madrid urban area. The Annals of Regional Science, 64(2), 313-347. <doi:10.1007/s00168-019-00914-1>

See Also

spsurml, stsls, wald_betas

Examples


#################################################
######## CLASSIC PANEL DATA (G=1; Tm>1)  ########
#################################################

#### Example 1: Spatial Phillips-Curve. Anselin (1988, p. 203)
## A SUR model without spatial effects
rm(list = ls()) # Clean memory
data(spc)
lwspc <- spdep::mat2listw(Wspc, style = "W")
Tformula <- WAGE83 | WAGE81 ~ UN83 + NMR83 + SMSA | UN80 + NMR80 + SMSA

## A SUR-SLM model (3SLS Estimation)
spcsur_slm_3sls <- spsur3sls(formula = Tformula, data = spc,
                             type = "slm", listw = lwspc)
summary(spcsur_slm_3sls)
print(spcsur_slm_3sls)

if (require(gridExtra)) {
  pl <- plot(spcsur_slm_3sls, viewplot = FALSE) 
  grid.arrange(pl$lplbetas[[1]], pl$lplbetas[[2]], 
               pl$pldeltas, nrow = 3)
}

## VIP: The output of the whole set of the examples can be examined 
## by executing demo(demo_spsur3sls, package="spsur")


## A SUR-SDM model (3SLS Estimation)
spcsur_sdm_3sls <- spsur3sls(formula = Tformula, data = spc,
                             type = "sdm", listw = lwspc)
summary(spcsur_sdm_3sls)
if (require(gridExtra)) {
  pl <- plot(spcsur_sdm_3sls, viewplot = FALSE) 
  grid.arrange(pl$lplbetas[[1]], pl$lplbetas[[2]], 
               pl$pldeltas, nrow = 3)
}
rm(spcsur_sdm_3sls)

## A SUR-SDM model with different spatial lags in each equation
TformulaD <- ~ UN83 + NMR83 + SMSA | UN80 + NMR80
spcsur_sdm2_3sls <- spsur3sls(formula = Tformula, data = spc,
                              type = "sdm", listw = lwspc,
                              Durbin = TformulaD)
summary(spcsur_sdm2_3sls)
if (require(gridExtra)) {
  pl <- plot(spcsur_sdm2_3sls, viewplot = FALSE) 
  grid.arrange(pl$lplbetas[[1]], pl$lplbetas[[2]], 
               pl$pldeltas, nrow = 3)
}
rm(spcsur_sdm2_3sls)


#################################################
###  MULTI-DIMENSIONAL PANEL DATA (G>1; Tm>1) ###
#################################################

#### Example 3: Homicides + Socio-Economics (1960-90)
# Homicides and selected socio-economic characteristics for continental
# U.S. counties.
# Data for four decennial census years: 1960, 1970, 1980 and 1990.
# https://geodacenter.github.io/data-and-lab/ncovr/


rm(list = ls()) # Clean memory
data(NCOVR, package = "spsur")
nbncovr <- spdep::poly2nb(NCOVR.sf, queen = TRUE)
## Some regions with no links...
lwncovr <- spdep::nb2listw(nbncovr, style = "W", zero.policy = TRUE)
Tformula <- HR80  | HR90 ~ PS80 + UE80 | PS90 + UE90
## A SUR-SLM model
NCOVRSUR_slm_3sls <- spsur3sls(formula = Tformula, data = NCOVR.sf, 
                               type = "slm", zero.policy = TRUE,
                               listw = lwncovr, trace = FALSE)
summary(NCOVRSUR_slm_3sls)
if (require(gridExtra)) {
  pl <- plot(NCOVRSUR_slm_3sls, viewplot = FALSE) 
  grid.arrange(pl$lplbetas[[1]], pl$lplbetas[[2]], 
               pl$pldeltas, nrow = 3)
}
rm(NCOVRSUR_slm_3sls)


spsur documentation built on Oct. 30, 2022, 1:06 a.m.