dgp_spsur | R Documentation |
The purpose of the function dgp_spsur is to generate a random
dataset with the dimensions and spatial structure decided by the user.
This function may be useful in pure simulation experiments or with the
aim of showing specific properties and characteristics
of a spatial SUR dataset and inferential procedures related to them.
The user of dgp_spsur should think in terms of a Monte Carlo
experiment. The arguments of the function specify the dimensions of the
dataset to be generated, the spatial mechanism underlying the data, the
intensity of the SUR structure among the equations and the values of the
parameters used to obtain the simulated data, which include the
error terms, the regressors and the explained variables.
dgp_spsur(Sigma, Tm = 1, G, N, Betas, Thetas = NULL, rho = NULL, lambda = NULL, p = NULL, listw = NULL, X = NULL, type = "matrix", pdfU = "nvrnorm", pdfX = "nvrnorm")
Sigma |
Covariance matrix between the G equations of the SUR model. This matrix should be positive definite; the user is responsible for checking this. |
Tm |
Number of time periods. Default = 1. |
G |
Number of equations. |
N |
Number of cross-section or spatial units. |
Betas |
A row vector of order (1xP) showing the values for the beta coefficients. The first P_{1} terms correspond to the first equation (where the first element is the intercept), the second P_{2} terms to the coefficients of the second equation and so on. |
Thetas |
Values for the θ coefficients in the
G equations of the model, when the type of spatial SUR model to
be simulated is a "slx", "sdm" or "sdem". Thetas is a
row vector of order 1xPTheta, where
PTheta = P - G (P being the total number of β coefficients), because the intercept cannot
appear among the spatial lags of the regressors. The first
1xPTheta_{1} terms correspond to the first equation,
the second 1xPTheta_{2} terms correspond to the
second equation, and so on. Default = NULL. |
rho |
Values of the coefficients ρ_{g}; g=1,2,..., G
related to the spatial lag of the explained variable of the g-th equation.
If rho is a scalar and there are G equations in the
model, the same value will be used for all the equations. If rho
is a row vector of order (1xG), a different value is used for each equation. |
lambda |
Values of the coefficients λ_{g}; g=1,2,..., G
related to the spatial lag of the errors in the G equations.
If lambda is a scalar and there are G equations
in the model, the same value will be used for all the equations.
If lambda is a row vector of order (1xG), a different value is used for each equation. |
p |
Number of regressors by equation, including the intercept. p can be a row vector of order (1xG), if the number of regressors is not the same for all the equations, or a scalar, if the G equations have the same number of regressors. |
listw |
A |
X |
This argument tells the function |
type |
Selection of the type of output. The alternatives are "matrix" (the default), "df", "panel" and "all". |
pdfU |
Multivariate probability distribution function, Mpdf, from
which the values of the error terms will be drawn. The covariance matrix
is the Σ matrix specified by the user in the argument Sigma. Two alternatives
are available: the multivariate Normal ("nvrnorm", the default) and the log-Normal. |
pdfX |
Multivariate probability distribution function (Mpdf), from
which the values of the regressors will be drawn. The regressors are
assumed to be independent. |
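Since the description of Sigma leaves the positive-definiteness check to the user, a minimal sketch of such a check in base R follows. The helper name is_pos_def and its tolerance are assumptions made for illustration; they are not part of the spsur package.

```r
# Hypothetical helper (not part of spsur): TRUE if S is a symmetric,
# positive definite matrix, judged by its eigenvalues.
is_pos_def <- function(S, tol = 1e-8) {
  if (!is.matrix(S) || nrow(S) != ncol(S)) return(FALSE)
  if (!isSymmetric(S)) return(FALSE)
  ev <- eigen(S, symmetric = TRUE, only.values = TRUE)$values
  all(ev > tol)
}

G <- 3
Sigma <- matrix(0.3, nrow = G, ncol = G)   # same construction as in the examples
diag(Sigma) <- 1
ok_sigma <- is_pos_def(Sigma)

Sigma_bad <- matrix(1, nrow = G, ncol = G) # all ones: singular, not positive definite
ok_bad <- is_pos_def(Sigma_bad)
```

Calling stopifnot(is_pos_def(Sigma)) before the simulation would stop early on an invalid covariance matrix.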
The purpose of the function dgp_spsur is to generate random
datasets, of a SUR nature, with the spatial structure decided by the user.
The function requires certain information to be supplied externally
because dgp_spsur constitutes a Data Generation Process (DGP).
The following aspects should be addressed:
The user must define the dimensions of the dataset, that is, number of equations, G, number of time periods, Tm, and number of cross-sectional units, N.
The user must choose the type of spatial structure desired for the model from among the candidates "sim", "slx", "slm", "sem", "sdm", "sdem" or "sarar". The default is the "sim" specification, which has no spatial structure. The decision is made implicitly, by omitting the spatial parameters that are not involved in the model (e.g., a "slm" has ρ parameters but no λ parameters; a "sdem" model has λ and θ parameters but no ρ coefficients).
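The implicit choice of specification can be sketched as a lookup on which spatial parameters are supplied. The function implied_type below is hypothetical and only encodes one reading of the text; treating an all-zero parameter as absent (the examples pass lambda = 0 when simulating a "slm") is an assumption, not documented behaviour of dgp_spsur.

```r
# Hypothetical illustration: which specification a set of spatial
# parameters implies. NULL or all-zero values are treated as absent
# (an assumption, not confirmed package behaviour).
implied_type <- function(rho = NULL, lambda = NULL, Thetas = NULL) {
  absent <- function(x) is.null(x) || all(x == 0)
  key <- paste0(if (!absent(rho)) "r" else "",
                if (!absent(lambda)) "l" else "",
                if (!absent(Thetas)) "t" else "")
  if (key == "") return("sim")
  types <- c(r = "slm", l = "sem", t = "slx",
             rt = "sdm", lt = "sdem", rl = "sarar")
  unname(types[key])  # NA for combinations outside the list above
}

implied_type(rho = 0.5)                           # only rho supplied
implied_type(lambda = 0.3, Thetas = c(0.1, 0.2))  # lambda and theta supplied
```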
If the user needs a model with spatial structure, a (NxN) weighting matrix, W, should be chosen.
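As an illustration of what an (NxN) weighting matrix looks like, the sketch below builds a row-standardised k-nearest-neighbour matrix in base R from random coordinates. It mirrors the idea of the spdep calls used in the examples but is not the same code path, and whether a plain matrix can be passed as listw is not stated here; the examples pass a listw object built with spdep.

```r
set.seed(123)
N <- 50   # number of spatial units
k <- 5    # neighbours per unit

co <- cbind(runif(N), runif(N))   # random coordinates in the unit square
D <- as.matrix(dist(co))          # pairwise Euclidean distances
diag(D) <- Inf                    # a unit is never its own neighbour

W <- matrix(0, nrow = N, ncol = N)
for (i in seq_len(N)) {
  nn <- order(D[i, ])[seq_len(k)] # indices of the k closest units
  W[i, nn] <- 1
}
W <- W / rowSums(W)               # row-standardise: each row sums to one
```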
The next step builds the equations of the SUR model. Here
the user must specify the number of regressors that enter
each equation and the coefficients, the β parameters,
associated with each regressor. The first question is solved
through the argument p which, if a scalar, indicates that
the same number of regressors should appear in all the equations
of the model; if the user wants a model with a different number
of regressors in the G equations, the argument p must
be a (1xG) row vector with the required information. It must
be remembered that dgp_spsur assumes that an
intercept appears in all equations of the model.
The second part of the problem posed above is solved through the argument Betas, which is a row vector of order (1xP) with the information required for this set of coefficients.
The user must also specify the values of the spatial parameters corresponding to the chosen specification, that is, the parameters ρ_{g}, λ_{g} and θ_{gk}, for g=1,...,G and k=1,...,K_{g}. This is done through the arguments rho, lambda and Thetas. The first two, rho and lambda, work like p: if they are scalars, the same value will be used in the G equations of the SUR model; if they are (1xG) row vectors, a different value will be assigned to each equation.
Moreover, Thetas works like the argument Betas. The user must define a row vector of order 1xPTheta showing these values. It is worth remembering that the intercept never appears among the lagged regressors.
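The bookkeeping between p, Betas and Thetas can be checked before calling the function. The sketch below assumes that P is the total number of β coefficients (the sum of p over equations) and that PTheta = P - G, since each equation loses its intercept among the lagged regressors; the numeric values are made up for illustration.

```r
G <- 3
p <- c(3, 2, 4)                               # regressors per equation, intercept included
p_vec <- if (length(p) == 1) rep(p, G) else p # a scalar p is recycled across equations

P <- sum(p_vec)        # expected length of Betas
PTheta <- P - G        # expected length of Thetas (no lagged intercepts)

Betas  <- c(1, 2, 3,   1, -1,   1, 0.5, -0.5, 2)  # illustrative values
Thetas <- c(0.2, 0.1,  0.3,  0.1, 0.1, 0.1)       # illustrative values
stopifnot(length(Betas) == P, length(Thetas) == PTheta)

# Coefficients grouped by equation, in the order described above
beta_by_eq  <- split(Betas,  rep(seq_len(G), times = p_vec))
theta_by_eq <- split(Thetas, rep(seq_len(G), times = p_vec - 1))

# A scalar rho is expanded to one value per equation in the same way
rho <- 0.5
rho_vec <- rep(rho, length.out = G)
```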
With the argument type the user chooses the
output format. See Value section.
Finally, the user must decide which values of the regressors and
of the error terms are to be used in the simulation. The regressors
can be supplied as an external matrix generated previously by the
user, through the argument X. It is the responsibility of the
user to check that the dimensions of the external matrix are consistent
with the dataset required for the SUR model. Alternatively,
the regressors can be generated randomly by dgp_spsur.
In this case, the user must select the probability distribution
function from which the corresponding data (of the regressors and
the error terms) are to be drawn.
dgp_spsur provides two multivariate distribution functions for the errors,
namely the Normal and the log-Normal (the second
should be taken as a clear departure from the standard assumption of
normality). In both cases, random matrices of order (TmNxG)
are obtained from a multivariate normal distribution with zero mean
and the covariance matrix specified in the argument
Sigma; this matrix is then exponentiated for the log-Normal
case. Roughly the same procedure applies for drawing the values of
the regressors. There are two distribution functions available, the
normal and the uniform on the interval [0,1]; the regressors
are always independent.
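The procedure described for the error terms can be sketched in base R as follows. This uses a Cholesky factor to draw from the multivariate normal; it is an illustration of the description above under that assumption, not the internal code of dgp_spsur, and the object names are made up.

```r
set.seed(42)
Tm <- 1; N <- 2000; G <- 3
Sigma <- matrix(0.3, nrow = G, ncol = G)
diag(Sigma) <- 1

R <- chol(Sigma)                                  # upper triangular, t(R) %*% R equals Sigma
Z <- matrix(rnorm(Tm * N * G), nrow = Tm * N, ncol = G)
U_norm  <- Z %*% R                                # (TmN x G), rows drawn from N(0, Sigma)
U_lnorm <- exp(U_norm)                            # log-Normal case: exponentiate the draws

# Independent regressors: standard normal or uniform on [0, 1]
X_norm <- matrix(rnorm(Tm * N * 2), ncol = 2)
X_unif <- matrix(runif(Tm * N * 2), ncol = 2)

round(cov(U_norm), 2)                             # should be close to Sigma
```

Note that the exponentiated errors are strictly positive and no longer have zero mean, which is part of why the log-Normal case departs from the standard assumptions.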
The default output ("matrix") is a list with a vector Y of order (TmNGx1), containing the values generated for the explained variable in the G equations of the SUR, and a matrix XX of order (TmNGxsum(p)), containing the values generated for the regressors of the SUR, including an intercept for each equation.
When Tm = 1 or G = 1, several alternative output formats can be selected:
If the user selects type = "df",
the output is a data frame where each
column is a variable.
If the user selects type = "panel",
the output is a data frame in
panel format including two factors. The first factor identifies the individual
observation and the second identifies the equation (for G > 1) or the time period (for Tm > 1).
Finally, if type = "all"
is selected, the output is a list including all
the alternative formats.
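The dimensions stated for the default output can be computed in advance and compared with what a call returns. The helper expected_dims is hypothetical and simply restates the orders given above (Y is TmNG x 1, XX is TmNG x sum(p)).

```r
# Hypothetical helper: dimensions of Y and XX implied by the description
# of the default ("matrix") output.
expected_dims <- function(Tm, G, N, p) {
  p_vec <- if (length(p) == 1) rep(p, G) else p  # scalar p applies to every equation
  list(Y  = c(Tm * N * G, 1),
       XX = c(Tm * N * G, sum(p_vec)))
}

# Settings of the multi-dimensional panel example
d <- expected_dims(Tm = 10, G = 3, N = 100, p = 3)
d$Y
d$XX
```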
Fernando Lopez | fernando.lopez@upct.es |
Roman Minguez | roman.minguez@uclm.es |
Jesus Mur | jmur@unizar.es |
Lopez, F. A., Minguez, R., Mur, J. (2020). ML versus IV estimates of spatial SUR models: evidence from the case of Airbnb in Madrid urban area. The Annals of Regional Science, 64(2), 313-347. <doi:10.1007/s00168-019-00914-1>
Minguez, R., Lopez, F.A. and Mur, J. (2022). spsur: An R Package for Dealing with Spatial Seemingly Unrelated Regression Models. Journal of Statistical Software, 104(11), 1–43. <doi:10.18637/jss.v104.i11>
spsurml, spsur3sls, spsurtime
## VIP: The output of the whole set of the examples can be examined
## by executing demo(demo_dgp_spsur, package="spsur")

################################################
### PANEL DATA (Tm = 1 or G = 1)              ##
################################################

################################################
#### Example 1: DGP SLM model. G equations
################################################
rm(list = ls()) # Clean memory
Tm <- 1 # Number of time periods
G <- 3 # Number of equations
N <- 200 # Number of spatial elements
p <- 3 # Number of independent variables
Sigma <- matrix(0.3, ncol = G, nrow = G)
diag(Sigma) <- 1
Betas <- c(1, 2, 3, 1, -1, 0.5, 1, -0.5, 2)
rho <- 0.5 # level of spatial dependence
lambda <- 0.0 # spatial autocorrelation error term = 0
## random coordinates
co <- cbind(runif(N,0,1),runif(N,0,1))
lw <- spdep::nb2listw(spdep::knn2nb(spdep::knearneigh(co, k = 5,
                                                       longlat = FALSE)))
DGP <- dgp_spsur(Sigma = Sigma, Betas = Betas,
                 rho = rho, lambda = lambda, Tm = Tm,
                 G = G, N = N, p = p, listw = lw)
SLM <- spsurml(X = DGP$X, Y = DGP$Y, Tm = Tm, N = N, G = G,
               p = c(3, 3, 3), listw = lw, type = "slm")
summary(SLM)

################################################
### MULTI-DIMENSIONAL PANEL DATA G>1 and Tm>1 ##
################################################
rm(list = ls()) # Clean memory
Tm <- 10 # Number of time periods
G <- 3 # Number of equations
N <- 100 # Number of spatial elements
p <- 3 # Number of independent variables
Sigma <- matrix(0.5, ncol = G, nrow = G)
diag(Sigma) <- 1
Betas <- rep(1:3, G)
rho <- c(0.5, 0.1, 0.8)
lambda <- 0.0 # spatial autocorrelation error term = 0
## random coordinates
co <- cbind(runif(N,0,1),runif(N,0,1))
lw <- spdep::nb2listw(spdep::knn2nb(spdep::knearneigh(co, k = 5,
                                                       longlat = FALSE)))
DGP4 <- dgp_spsur(Sigma = Sigma, Betas = Betas, rho = rho,
                  lambda = lambda, Tm = Tm, G = G, N = N,
                  p = p, listw = lw)
SLM4 <- spsurml(Y = DGP4$Y, X = DGP4$X, G = G, N = N, Tm = Tm,
                p = p, listw = lw, type = "slm")
summary(SLM4)