gmmprobit: GMM Spatial Probit
In McSpatial: Nonparametric spatial data analysis


Estimates a GMM probit model for a 0-1 dependent variable and an underlying latent variable of the form Y^* = ρ WY^* + X β + u.

gmmprobit(form,inst=NULL,winst=NULL,wmat=NULL,shpfile,
  startb=NULL,startrho=0,blockid=0,cvcrit=.0001,data=NULL,silent=FALSE)

`form`	Model formula
`inst`	List of instruments not to be pre-multiplied by W. Entered as inst=~w1+w2 ... Default: inst=NULL. See details for more information.
`winst`	List of instruments to be pre-multiplied by W before use. Entered as winst=~w1+w2 ... Default: winst=NULL. See details for more information.
`wmat`	Directly enter wmat rather than creating it from a shape file. Default: not specified. One of the wmat or shpfile options must be specified.
`shpfile`	Shape file to be used for creating the W matrix. Default: not specified. One of the wmat or shpfile options must be specified.
`startb`	Vector of starting values for β. Default: use estimates from spprobit, the linearized version of the model. Specified as startb=NULL.
`startrho`	Starting value for ρ. Default: use estimates from spprobit, the linearized version of the model. Specified as startrho=0.
`blockid`	A variable identifying groups used to specify a block diagonal structure for the W matrix, e.g., blockid=state or blockid=region. Imposes that all elements outside of the blocks equal zero and then re-standardizes W such that the rows sum to one. By default, blockid = 0, and a block diagonal structure is not imposed.
`cvcrit`	Convergence criterion. Default: cvcrit = 0.0001.
`data`	A data frame containing the data. Default: use data in the current working directory.
`silent`	If silent=TRUE, no output is printed. Default: silent=FALSE.

The underlying latent variable for the model is Y^* = ρ WY^* + X β + u, or equivalently Y^* = (I - ρ W)^{-1}(X β + u). The covariance matrix is E(uu') = σ^2 ((I - ρ W)(I - ρ W)')^{-1}, with σ^2 normalized to unity. Typical specifications imply heteroskedasticity, i.e., the diagonal elements of E(uu'), denoted by σ_i^2, vary across observations. Heteroskedasticity makes standard probit estimates inconsistent. Letting X_i^* = X_i/σ_i and H = (I - ρ W)^{-1} X^*, the probit probabilities implied by the latent variable are p = Φ(H β) and the generalized error term is e = (y - p)φ(H β)/(p(1-p)), where y = 1 if Y^* > 0 and y = 0 otherwise.
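The quantities defined in this paragraph can be traced numerically. The following is a minimal sketch in base R, not the package's internal code: the 4-observation chain-shaped W, the values of ρ and β, the outcome vector y, and all variable names are assumptions chosen for illustration. The covariance is computed exactly as the formula is written above.

```r
# Illustrative sketch only: toy example, not McSpatial internals.
n <- 4
A <- matrix(0, n, n)
A[cbind(1:(n - 1), 2:n)] <- 1      # unit i is adjacent to unit i + 1
A <- A + t(A)                      # symmetric adjacency (a chain 1-2-3-4)
W <- A / rowSums(A)                # row-standardize so each row sums to one

rho  <- 0.4                        # assumed spatial parameter
beta <- c(-1, 0.5)                 # assumed coefficients (intercept, slope)
X    <- cbind(1, c(0.5, 1.5, 2.5, 3.5))
y    <- c(0, 0, 1, 1)              # assumed observed 0-1 outcomes

M     <- diag(n) - rho * W                 # I - rho W
V     <- solve(M %*% t(M))                 # covariance with sigma^2 = 1
sigma <- sqrt(diag(V))                     # sigma_i varies across observations

Xstar <- X / sigma                         # row i of X divided by sigma_i
H     <- solve(M, Xstar)                   # (I - rho W)^{-1} X*
xb    <- as.numeric(H %*% beta)
p     <- pnorm(xb)                         # probit probabilities
e     <- (y - p) * dnorm(xb) / (p * (1 - p))  # generalized error term
```

In this toy chain the end units have a different σ_i from the interior units, which is the heteroskedasticity the text refers to.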

The GMM estimator chooses β and ρ to minimize e'Z(Z'Z)^{-1}Z'e, where Z is a matrix of instruments specified using the inst and winst options. Unless specified otherwise using the startb and startrho options, initial estimates are obtained using spprobit, which implements the simple (and fast) linearized version of the GMM probit model proposed by Klier and McMillen (2008). Convergence is defined by abs(change) < cvcrit, where change is the gradient vector implied by applying a standard Gauss-Newton algorithm to the objective function. The covariance matrix (equation 3 in Klier-McMillen, 2008) is estimated using the car package.
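The criterion e'Z(Z'Z)^{-1}Z'e can be written as a short function of the parameters. The sketch below, in base R, only evaluates the objective on simulated data; it does not implement the Gauss-Newton iterations or the covariance estimate, and it is not the package's code. The function name `gmm_obj`, the simulated data, and the choice of instruments (the constant, x, and Wx) are assumptions for illustration.

```r
# Illustrative sketch only: evaluates the GMM criterion on simulated toy data.
set.seed(42)
n <- 30
A <- matrix(0, n, n)
A[cbind(1:(n - 1), 2:n)] <- 1
A <- A + t(A)
W <- A / rowSums(A)                      # row-standardized chain weights

x <- runif(n, 0, 2)
X <- cbind(1, x)
ystar <- solve(diag(n) - 0.4 * W, X %*% c(-1, 1) + rnorm(n))
y <- as.numeric(ystar > 0)

# W times the constant is again a constant for row-standardized W,
# so only the non-constant column is pre-multiplied here.
Z <- cbind(X, W %*% x)

# Hypothetical helper: theta = (beta, rho)
gmm_obj <- function(theta, y, X, W, Z) {
  k     <- ncol(X)
  beta  <- theta[1:k]
  rho   <- theta[k + 1]
  M     <- diag(nrow(W)) - rho * W
  sigma <- sqrt(diag(solve(M %*% t(M))))
  H     <- solve(M, X / sigma)
  xb    <- as.numeric(H %*% beta)
  p     <- pnorm(xb)
  e     <- (y - p) * dnorm(xb) / (p * (1 - p))
  Ze    <- crossprod(Z, e)               # Z'e
  as.numeric(t(Ze) %*% solve(crossprod(Z), Ze))
}

q_true <- gmm_obj(c(-1, 1, 0.4), y, X, W, Z)  # at the simulating values
q_zero <- gmm_obj(c(0, 0, 0), y, X, W, Z)     # at an arbitrary starting point
```

A function of this form could be handed to a general-purpose optimizer as a stand-in for the Gauss-Newton steps described above, at the cost of re-inverting the n x n matrix on every evaluation.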

Estimation can be very slow because each iteration requires the inversion of an nxn matrix. To speed up the estimation process and to reduce memory requirements, it may be desirable to impose a block diagonal structure on W. For example, it may be reasonable to impose that each state or region has its own error structure, with no correlation of errors across regions. The blockid option specifies a block diagonal structure such as blockid=region. The option leads the program to re-calculate the W matrix, imposing the block diagonal structure and re-normalizing the matrix to again have each row sum to one. If there are G groups, estimation requires G sub-matrices to be inverted rather than one nxn matrix, which greatly reduces memory requirements and significantly reduces the time required in estimation.
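The effect of a block diagonal restriction can be illustrated in base R. This is a sketch under stated assumptions, not the code behind the blockid option: the 6-unit W, the grouping vector `block`, and the variable names are hypothetical.

```r
# Illustrative sketch only: hypothetical 6-unit W and group labels.
n <- 6
A <- matrix(1, n, n) - diag(n)           # every unit neighbours every other
W <- A / rowSums(A)                      # row-standardized weights

block <- c("a", "a", "a", "b", "b", "b") # assumed grouping variable
same  <- outer(block, block, "==")       # TRUE where i and j share a block

Wb <- W * same                           # zero out all cross-block weights
Wb <- Wb / rowSums(Wb)                   # re-standardize rows to sum to one
# Note: a unit with no neighbours inside its own block would give a zero
# row sum here and needs separate handling.

rho <- 0.4
# Invert one small matrix per block instead of one n x n matrix.
inv_bd <- matrix(0, n, n)
for (idx in split(seq_len(n), block)) {
  inv_bd[idx, idx] <- solve(diag(length(idx)) - rho * Wb[idx, idx])
}

# The block-by-block result matches the full inverse of I - rho * Wb.
inv_full <- solve(diag(n) - rho * Wb)
```

With G blocks of roughly equal size, each inversion works on a matrix of about n/G rows rather than n, which is where the memory and time savings come from.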

gmmprobit provides flexibility in specifying the list of instruments. By default, the instrument list includes X and WX, where X is the original explanatory variable list and W is the spatial weight matrix. It is also possible to directly specify the full instrument list or to include only a subset of the X variables in the list that is to be pre-multiplied by W.

Let list1 and list2 be user-provided lists of the form list=~z1+z2. The combinations of defaults (NULL) and lists for inst and winst alter the final list of instruments as follows:

inst = NULL, winst = NULL: Z = (X, WX)
inst = list1, winst = NULL: Z = list1
inst = NULL, winst = list2: Z = (X, W*list2)
inst = list1, winst = list2: Z = (list1, W*list2)

Note that when inst=list1 and winst=NULL it is up to the user to specify at least one variable in list1 that is not also included in X.
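The four combinations above can be mirrored with plain matrices. The helper `build_Z` below is hypothetical and is not part of McSpatial: it takes matrices rather than formulas, ignores intercept handling, and exists only to make the mapping from (inst, winst) to Z concrete.

```r
# Illustrative sketch only: build_Z is a hypothetical helper, not a package function.
build_Z <- function(X, W, inst = NULL, winst = NULL) {
  if (is.null(inst) && is.null(winst)) return(cbind(X, W %*% X))  # Z = (X, WX)
  if (is.null(winst)) return(inst)                                # Z = list1
  if (is.null(inst))  return(cbind(X, W %*% winst))               # Z = (X, W*list2)
  cbind(inst, W %*% winst)                                        # Z = (list1, W*list2)
}

set.seed(7)
n <- 5
A <- matrix(1, n, n) - diag(n)
W <- A / rowSums(A)                      # row-standardized toy weights

X     <- matrix(rnorm(n * 2), n, 2)      # two explanatory variables
list1 <- matrix(rnorm(n * 3), n, 3)      # three instruments used as given
list2 <- matrix(rnorm(n * 1), n, 1)      # one instrument to be pre-multiplied by W

Z_default <- build_Z(X, W)
Z_inst    <- build_Z(X, W, inst = list1)
Z_winst   <- build_Z(X, W, winst = list2)
Z_both    <- build_Z(X, W, inst = list1, winst = list2)
```

In the actual function the lists are supplied as one-sided formulas, for example inst=~z1+z2, as described in the arguments.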

`coef`	Coefficient estimates
`se`	Standard error estimates

Klier, Thomas and Daniel P. McMillen, "Clustering of Auto Supplier Plants in the United States: Generalized Method of Moments Spatial Logit for Large Samples," Journal of Business and Economic Statistics 26 (2008), 460-471.

Pinkse, J. and M. E. Slade, "Contracting in Space: An Application of Spatial Statistics to Discrete-Choice Models," Journal of Econometrics 85 (1998), 125-154.

cparlogit

cparprobit

cparmlogit

gmmlogit

splogit

spprobit

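library(McSpatial)  # provides makew(), geodistance(), and gmmprobit()

set.seed(9947)
# Read the Cook County census tract shape file shipped with the package
cmap <- readShapePoly(system.file("maps/CookCensusTracts.shp",
  package="McSpatial"))
# Keep Chicago tracts, excluding the O'Hare community area
cmap <- cmap[cmap$CHICAGO==1 & cmap$CAREA!="O'Hare",]
lmat <- coordinates(cmap)
# Keep only tracts with dnorth > 0 relative to the reference point
dnorth <- geodistance(lmat[,1], lmat[,2], -87.627800,
  41.881998, dcoor=TRUE)$dnorth
cmap <- cmap[dnorth>0,]
# Build the spatial weight matrix from the remaining tracts
wmat <- makew(cmap)$wmat
n <- nrow(wmat)
rho <- .4
# Simulate the latent variable Y* = (I - rho W)^{-1}(x + u)
x <- runif(n,0,10)
ystar <- as.numeric(solve(diag(n) - rho*wmat)%*%(x + rnorm(n,0,2)))
# Discretize at the 40th percentile of the latent variable
y <- ystar>quantile(ystar,.4)
fit <- gmmprobit(y~x, wmat=wmat)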