spprobitml: Maximum Likelihood Estimation of a Spatial Probit Model
In McSpatial: Nonparametric spatial data analysis

Description Usage Arguments Details Value References See Also Examples

Probit estimation for a model with an underlying latent variable of the form Y^* = ρ WY^* + X β +u

1 2	spprobitml(form,wmat=NULL,shpfile=NULL,blockid=NULL, minblock=NULL,maxblock=NULL,stdprobit=TRUE,data=NULL)

`form`	Model formula
`wmat`	Directly enter wmat rather than creating it from a shape file. Default: not specified. One of the wmat or shpfile options must be specified.
`shpfile`	Shape file to be used for creating the W matrix. Default: not specified. One of the wmat or shpfile options must be specified. The order of the observations in wmat must be the same as the order of observations in data.
`blockid`	A variable identifying groups used to specify a block diagonal structure for the W matrix, e.g., blockid=state or blockid=region. Calculates a separate W matrix for each block. The shpfile option must be specified; wmat is ignored.
`minblock`	Groups with fewer than minblock observations are omitted. Default is the number of explanatory variables, including the spatial lag term.
`maxblock`	Groups with more than maxblock observations are omitted. Unlimited by default. This option may be useful for very large data sets as full nblock x nblock matrices must be constructed for each block, where nblock is the number of observations in the block.
`stdprobit`	If TRUE, also prints standard probit model results. Default: stdprobit=TRUE.
`data`	A data frame containing the data. Default: use data in the current working directory

Estimation is based on the reduced form of the spatial AR model, Y^* = (I - ρ W)^{-1}(X β + u). The model structure typically implies heteroskedasticity: the variance of the reduced form error term, (I - ρ W)^{-1}u, is σ^2 diag \{ (I - ρ W)^{-1}(I - ρ W')^{-1} \} . For probit estimation, σ^2 is normalized to one. Let s_i^2 denote the variance for observation i, and define X^* = (I - ρ W)^{-1}X. Then the probability that Y_i^* > 0 is Φ (X_i^* β / s_i) , and the log-likelihood function is ∑_i \{ y_i ln (Φ_i ) + (1-y_i ) ln(1-Φ_i) \} . The spprobitml commands estimates the model by maximizing this log-likelihood function with respect to β and ρ.

Variants of this approach – maximizing the log-likelihood function implied by the reduced form of the model – were proposed by Case (1992) and McMillen (1992). Case's estimation procedure relies on a simple form of the spatial weight matrix in which each observation within a district is affected equally by the other observations in the district. McMillen's (1992) approach is equivalent to the one used here, but he suggested using an EM algorithm to estimate the model. Neither author suggested a covariance matrix: Case (1992) appears to have relied on the standard probit estimate which applies when the model is estimated conditional on ρ, while McMillen (1992) proposed a bootstrap approach.

A consistent covariance matrix can be calculated using the gradient terms:

V(\hat{θ})^{-1} = ≤ft( ∑_i \partial lnL_i / \partial \hat{θ} \right)≤ft( ∑_i \partial lnL_i / \partial \hat{θ}' \right)

The gradient term for \hat{ρ} is calculated using numeric derivatives. The covariance matrix, V(\hat{θ}), is not fully efficient because the estimation procedure only indirectly takes into account the autocorrelation structure. An analogous approach is used to calculate standard errors conditional on \hat{ρ}. In the conditional case, only the gradient terms for \hat{β} are used; they are evaluated using the estimated values of ρ.

Estimation can be very slow because each iteration requires the inversion of an nxn matrix. To speed up the estimation process and to reduce memory requirements, it may be desirable to impose a block diagonal structure on W. For example, it may be reasonable to impose that each state or region has its own error structure, with no correlation of errors across regions. The blockid option specifies a block diagonal structure such as blockid=region. If there are G groups, estimation requires G sub-matrices to be inverted rather than one nxn matrix, which greatly reduces memory requirements and significantly reduces the time required in estimation. The weight matrix must be calculated from shpfile if the blockid option is specified; the wmat option should be set to NULL.

`coef`	Coefficient estimates.
`logl`	The log-likelihood value.
`vmat1`	The covariance matrix for \hat{β}, conditional on \hat{ρ}.
`vmat2`	The unconditional covariance matrix for \hat{θ} = (\hat{β}, \hat{ρ}).

Case, Anne C., "Neighborhood Influence and Technological Change," Regional Science and Urban Economics 22 (1992), 491-508.

McMillen, Daniel P., "Probit With Spatial Autocorrelation," Journal of Regional Science 32 (1992), 335-348.

cparlogit

cparprobit

cparmlogit

gmmlogit

gmmprobit

makew

splogit

spprobit

set.seed(9947)
cmap <- readShapePoly(system.file("maps/CookCensusTracts.shp",
  package="McSpatial"))
cmap <- cmap[cmap$CHICAGO==1&cmap$CAREA!="O'Hare",]
lmat <- coordinates(cmap)
dnorth <- geodistance(lmat[,1],lmat[,2], lotarget=-87.627800, 
	latarget=41.881998,dcoor=TRUE)$dnorth
cmap <- cmap[dnorth>1,]
wmat <- makew(cmap)$wmat
n = nrow(wmat)
rho = .4
x <- runif(n,0,10)
ystar <- as.numeric(solve(diag(n) - rho*wmat)%*%(x + rnorm(n,0,2)))
y <- ystar>quantile(ystar,.4)
fit <- spprobitml(y~x,  wmat=wmat)