splogit: Linearized GMM spatial logit
In McSpatial: Nonparametric spatial data analysis


Implements the Klier-McMillen (2008) linearized GMM logit model for a 0-1 dependent variable and an underlying latent variable of the form Y* = ρ WY* + X β + u.

 
splogit(form,inst=NULL,winst=NULL,wmat=NULL,shpfile=NULL,blockid=NULL,
         minblock=NULL,maxblock=NULL,data=NULL,silent=FALSE,minp=NULL)

`form`	Model formula
`inst`	List of instruments not to be pre-multiplied by W. Entered as inst=~w1+w2 ... Default: inst=NULL. See details for more information.
`winst`	List of instruments to be pre-multiplied by W before use. Entered as winst=~w1+w2 ... Default: winst=NULL. See details for more information.
`wmat`	Directly enter wmat rather than creating it from a shape file. Default: not specified. One of the wmat or shpfile options must be specified.
`shpfile`	Shape file to be used for creating the W matrix. Default: not specified. One of the wmat or shpfile options must be specified. The order of the observations in wmat must be the same as the order of observations in data.
`blockid`	A variable identifying groups used to specify a block diagonal structure for the W matrix, e.g., blockid=state or blockid=region. Calculates a separate W matrix for each block. The shpfile option must be specified; wmat is ignored.
`minblock`	Groups with fewer than minblock observations are omitted. Default is the number of explanatory variables, including WX β. This option helps to avoid singularity since the instrumental variables are constructed by a separate regression for each block.
`maxblock`	Groups with more than maxblock observations are omitted. Unlimited by default. This option may be useful for very large data sets as full nblock x nblock matrices must be constructed for each block, where nblock is the number of observations in the block.
`data`	A data frame containing the data. Default: use data in the current working directory
`silent`	If silent=T, no output is printed
`minp`	Specifies a limit for the estimated probability. Any estimated probability lower than minp will be set to minp and any probability higher than 1-minp will be set to 1-minp. By default, the estimated probabilities are bounded by 0 and 1.
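
The bounding rule described for minp can be sketched in a few lines of base R. This is an illustration only: clamp_prob is a hypothetical helper, not a McSpatial function, and whether splogit implements the rule exactly this way is an assumption. Bounding matters because the gradient term p(1-p) shrinks toward zero as p approaches 0 or 1.

```r
# Sketch of the bounding rule described for minp.
# clamp_prob is a hypothetical helper, not code from McSpatial.
clamp_prob <- function(p, minp = NULL) {
  if (is.null(minp)) return(p)      # default: probabilities stay within [0, 1]
  pmin(pmax(p, minp), 1 - minp)     # pull values into [minp, 1 - minp]
}

p <- c(0.0001, 0.25, 0.9999)
clamped <- clamp_prob(p, minp = 0.001)
```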

The linearized model is a three-step estimation procedure. Let y be the indicator value: y = 1 when y* > 0 and y = 0 when y* < 0. The first stage is a standard logit of y on X. The probability estimates from this regression are p = exp(X β)/(1+exp(X β)). The second and third stages together amount to standard 2SLS estimation of u = y-p+gX β on gX and gWX β using Z as instruments, where g is the gradient vector, dp/dβ. The covariance matrix (equation 3 in Klier-McMillen, 2008) is estimated using the car package. The final estimates minimize (y-p)'Z(Z'Z)^{-1}Z'(y-p) with p linearized around the first-stage logit estimate of β and ρ = 0.
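
The steps above can be traced in base R without calling splogit. The following is a minimal sketch, not the package's implementation. It assumes a toy ring-shaped W (each observation has two neighbors with weight 0.5) rather than one built by makew, it uses Z = (X, Wx) without the spatially lagged intercept (which would be collinear with the intercept for a row-standardized W), and it skips the covariance matrix entirely.

```r
# Minimal sketch of the linearized GMM logit steps in base R.
# Assumptions: simulated data and a toy ring-contiguity W.
set.seed(1)
n <- 200
W <- matrix(0, n, n)
for (i in 1:n) {
  W[i, ifelse(i == 1, n, i - 1)] <- 0.5   # left neighbor
  W[i, ifelse(i == n, 1, i + 1)] <- 0.5   # right neighbor
}
x <- runif(n, 0, 10)
X <- cbind(1, x)
rho <- 0.4
ystar <- as.numeric(solve(diag(n) - rho * W) %*% (x - 5 + rlogis(n)))
y <- as.numeric(ystar > 0)

# Step 1: standard logit of y on X, then fitted probabilities
b0 <- coef(glm(y ~ x, family = binomial))
xb <- as.numeric(X %*% b0)
p  <- exp(xb) / (1 + exp(xb))

# Gradient terms evaluated at the logit estimates and rho = 0
g    <- p * (1 - p)                      # dp/d(xb) for the logit
gX   <- g * X                            # gradient with respect to beta
gWXb <- g * as.numeric(W %*% xb)         # gradient with respect to rho
u    <- y - p + as.numeric(gX %*% b0)    # generalized error term

# Steps 2 and 3: 2SLS of u on (gX, gWXb) using Z as instruments
Z    <- cbind(X, W %*% x)
G    <- cbind(gX, gWXb)
Ghat <- Z %*% solve(crossprod(Z), crossprod(Z, G))      # fitted gradient terms
est  <- solve(crossprod(Ghat), crossprod(Ghat, u))      # (beta, rho) estimates
```

The last element of est is the estimate of ρ and the first two are the intercept and slope. Because the linearization is taken around ρ = 0, the approximation is expected to be most accurate when the true ρ is small.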

splogit provides flexibility in specifying the list of instruments. By default, the instrument list includes X and WX, where X is the original explanatory variable list and W is the spatial weight matrix. Either wmat or shpfile must be specified if inst and winst are set to their default values.

It is also possible to directly specify the full instrument list or to include only a subset of the X variables in the list that is to be pre-multiplied by W. Let list1 and list2 be user-provided lists of the form list=~z1+z2. The combinations of defaults (NULL) and lists for inst and winst produce the following results for Z:

1. inst = NULL, winst = NULL, and either shpfile or wmat specified: Z = (X, WX)

2. inst = list1, winst = NULL, and either shpfile or wmat specified: Z = list1

3. inst = NULL, winst = list2, and either shpfile or wmat specified: Z = (X, W*list2)

4. inst = list1, winst = list2, and either shpfile or wmat specified: Z = (list1, W*list2)

5. inst = list1, winst = list2, and both shpfile and wmat NOT specified: Z = (list1, list2)

Note that when inst=list1 and winst=NULL it is up to the user to specify at least one variable in list1 that is not also included in X.

The difference between cases (4) and (5) is that the list2 variables are left unaltered in case (5) rather than being pre-multiplied by W. The case (5) option makes it possible to avoid manipulations of large matrices from within splogit. The idea is that W*list2 should be calculated prior to running splogit, with the variables implied by W*list2 being provided directly to splogit using the winst option.
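
The five cases can be mirrored with a small base R helper. This is a sketch for illustration: build_z is hypothetical and not part of McSpatial, it takes matrices instead of formulas, and it makes no claim about how splogit handles the intercept internally.

```r
# Hypothetical helper mirroring the five instrument cases described above.
# X, list1 and list2 are numeric matrices; W is the spatial weight matrix.
build_z <- function(X, list1 = NULL, list2 = NULL, W = NULL) {
  if (is.null(list2)) {
    if (!is.null(list1)) return(list1)                 # case 2: Z = list1
    if (is.null(W)) stop("W is required when inst and winst are both NULL")
    return(cbind(X, W %*% X))                          # case 1: Z = (X, WX)
  }
  left  <- if (is.null(list1)) X else list1            # case 3 uses X, cases 4-5 use list1
  right <- if (is.null(W)) list2 else W %*% list2      # case 5 leaves list2 unaltered
  cbind(left, right)
}

set.seed(42)
n <- 6
W <- matrix(1 / (n - 1), n, n)
diag(W) <- 0                                           # row-standardized, zero diagonal
X     <- cbind(x1 = rnorm(n), x2 = rnorm(n))
list1 <- cbind(z1 = rnorm(n))
list2 <- cbind(z2 = rnorm(n))

z_case1 <- build_z(X, W = W)                           # (X, WX)
z_case4 <- build_z(X, list1, list2, W)                 # (list1, W*list2)
z_case5 <- build_z(X, list1, W %*% list2)              # W*list2 computed beforehand
```

Here z_case4 and z_case5 contain the same columns, which is the point of case (5): the product W*list2 is formed once outside the estimation routine and passed in as ordinary variables.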

`coef`	Coefficient estimates.
`se`	Standard error estimates.
`u`	The generalized error term.
`gmat`	The matrix of gradient terms, G.

Klier, Thomas and Daniel P. McMillen, "Clustering of Auto Supplier Plants in the United States: Generalized Method of Moments Spatial Logit for Large Samples," Journal of Business and Economic Statistics 26 (2008), 460-471.

cparlogit

cparprobit

cparmlogit

gmmlogit

gmmprobit

spprobit

spprobitml

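library(McSpatial)
library(maptools)  # readShapePoly comes from maptools
set.seed(9947)
# Cook County census tracts shipped with McSpatial
cmap <- readShapePoly(system.file("maps/CookCensusTracts.shp",
  package="McSpatial"))
# Keep Chicago tracts, excluding the O'Hare community area
cmap <- cmap[cmap$CHICAGO==1&cmap$CAREA!="O'Hare",]
# Spatial weight matrix built from the tract polygons
wmat <- makew(cmap)$wmat
n <- nrow(wmat)
rho <- .4
x <- runif(n,0,10)
# Simulated latent variable: ystar = (I - rho*W)^{-1} (x + e)
ystar <- as.numeric(solve(diag(n) - rho*wmat)%*%(x + rnorm(n,0,2)))
# Binary outcome: TRUE when ystar exceeds its 40th percentile
y <- ystar>quantile(ystar,.4)
fit <- splogit(y~x,  wmat=wmat)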