Implements the Klier-McMillen (2008) linearized GMM probit model for a 0-1 dependent variable
and an underlying latent variable of the form *Y\* = ρWY\* + Xβ + u*.

`form` |
Model formula |

`inst` |
List of instruments |

`winst` |
List of instruments to be pre-multiplied by |

`wmat` |
Directly enter |

`shpfile` |
Shape file to be used for creating the |

`blockid` |
A variable identifying groups used to specify a block diagonal structure for the |

`minblock` |
Groups with fewer than |

`maxblock` |
Groups with more than |

`data` |
A data frame containing the data. Default: use data in the current working directory |

`silent` |
If |

`minp` |
Specifies a limit for the estimated probability. Any estimated probability lower than |

The linearized model is a three-step estimation procedure. Let *y* be the indicator value: *y* = 1 when *Y\** > 0 and *y* = 0
when *Y\** < 0.
The first stage is a standard probit of *y* on *X*. The probability estimates from this regression are *p = Φ(Xβ)*,
and the generalized error is *e = (y - p) φ(Xβ) / (p(1 - p))*.
The second and third stages are a standard 2SLS estimation of *u = e + gXβ* on
*gX* and *gWXβ* using *Z* as instruments, where *g* is the gradient vector, *-de/dβ*.
The covariance matrix (equation 3 in Klier-McMillen, 2008) is estimated using the *car* package.
The final estimates minimize *e'Z(Z'Z)^{-1}Z'e* with *e* linearized around *β-probit* and *ρ* = 0.
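
The steps above can be sketched in base R. This is an illustration under stated assumptions, not the *spprobit* source: the ring-shaped weight matrix `W` and the simulated data are hypothetical, the covariance matrix is not computed, and for simplicity the sketch uses the raw residual *e = y - p* (whose gradient with respect to *Xβ* is *φ(Xβ)*) rather than the generalized error.

```r
# Sketch only: a simplified linearized GMM probit, NOT the McSpatial implementation.
set.seed(123)
n <- 200

# Hypothetical weight matrix: each observation's neighbors are the two
# adjacent points on a ring, row-normalized so every row sums to 1.
idx <- seq_len(n)
W <- matrix(0, n, n)
W[cbind(idx, ifelse(idx == 1, n, idx - 1))] <- 0.5
W[cbind(idx, ifelse(idx == n, 1, idx + 1))] <- 0.5

# Simulated data from Y* = rho W Y* + X beta + u
rho <- 0.4
x <- runif(n, 0, 10)
ystar <- as.numeric(solve(diag(n) - rho * W, -2 + 0.5 * x + rnorm(n)))
y <- as.numeric(ystar > 0)
X <- cbind(Intercept = 1, x = x)

# Stage 1: standard probit of y on X
probit <- glm(y ~ x, family = binomial(link = "probit"))
xb <- as.numeric(X %*% coef(probit))
p <- pnorm(xb)

# Assumption: raw residual e = y - p, so g = -de/d(xb) = dnorm(xb)
e <- y - p
g <- dnorm(xb)

# Linearize around beta = probit estimates and rho = 0
u <- e + g * xb
G <- cbind(g * X, WXB = g * as.numeric(W %*% xb))

# Default instruments Z = (X, WX). W times the intercept column is left out
# because a row-normalized W maps the constant to itself.
Z <- cbind(X, Wx = as.numeric(W %*% x))

# Stages 2 and 3: 2SLS of u on G using Z as instruments
Ghat <- Z %*% solve(crossprod(Z), crossprod(Z, G))
est <- as.numeric(solve(crossprod(Ghat), crossprod(Ghat, u)))
names(est) <- colnames(G)
print(est)
```

The last element of `est`, labeled `WXB`, plays the role of the spatial coefficient on *gWXβ*.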

*spprobit* provides flexibility in specifying the list of instruments.
By default, the instrument list includes *X* and *WX*, where *X* is the original explanatory variable list and *W* is the spatial weight matrix.
Either *wmat* or *shpfile* must be specified if *inst* and *winst* are set to their default values.

It is also possible to directly specify the full instrument list or to include only a subset of the *X* variables
in the list that is to be pre-multiplied by *W*. Let *list1* and *list2* be user-provided lists of the form *list=~z1+z2*.
The combinations of defaults (*NULL*) and lists for *inst* and *winst* produce the following results for *Z*:

1. *inst = NULL*, *winst = NULL*, and either *shpfile* or *wmat* specified: *Z = (X, WX)*

2. *inst = list1*, *winst = NULL*, and either *shpfile* or *wmat* specified: *Z = list1*

3. *inst = NULL*, *winst = list2*, and either *shpfile* or *wmat* specified: *Z = (X, W\*list2)*

4. *inst = list1*, *winst = list2*, and either *shpfile* or *wmat* specified: *Z = (list1, W\*list2)*

5. *inst = list1*, *winst = list2*, and both *shpfile* and *wmat* NOT specified: *Z = (list1, list2)*

Note that when *inst=list1* and *winst=NULL* it is up to the user to specify at least one variable in *list1* that is not also included in *X*.

The difference between cases (4) and (5) is that the *list2* variables are left unaltered in case (5) rather than being pre-multiplied by *W*.
The case (5) option makes it possible to avoid manipulations of large matrices from within *spprobit*. The idea is that
*W\*list2* should be calculated before running *spprobit*, with the resulting variables passed directly to *spprobit* through
the *winst* option.
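
The five cases can be summarized with a small helper. This is a sketch of the rules as stated above, not code from the package: `build_z` and its arguments are hypothetical names, and the inputs are plain numeric matrices rather than formulas.

```r
# Hypothetical helper mirroring the five instrument cases; not part of McSpatial.
# X: explanatory variables, W: weight matrix (NULL if neither shpfile nor wmat
# is given), list1 / list2: matrices standing in for inst / winst.
build_z <- function(X, W = NULL, list1 = NULL, list2 = NULL) {
  if (is.null(W)) {
    # Case 5: no weight matrix, so list2 is used as supplied
    if (is.null(list1) || is.null(list2)) {
      stop("without W, both list1 and list2 must be supplied")
    }
    return(cbind(list1, list2))
  }
  if (is.null(list1) && is.null(list2)) return(cbind(X, W %*% X))  # case 1
  if (is.null(list2)) return(list1)                                # case 2
  left <- if (is.null(list1)) X else list1
  cbind(left, W %*% list2)                                         # cases 3 and 4
}

# Toy inputs (assumed for illustration only)
set.seed(1)
X <- matrix(rnorm(10), 5, 2)
W <- matrix(0.25, 5, 5)
diag(W) <- 0
z1 <- matrix(rnorm(15), 5, 3)
z2 <- matrix(rnorm(5), 5, 1)

z_case4 <- build_z(X, W, z1, z2)
# Case 5: pre-multiply by W outside the function, then pass the result in
z_case5 <- build_z(X, NULL, z1, W %*% z2)
print(all.equal(z_case4, z_case5))
```

Cases (4) and (5) give the same *Z* here because the product with *W* was formed before the call, which is the workflow case (5) is meant to support.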

`coef` |
Coefficient estimates. |

`se` |
Standard error estimates. |

`u` |
The generalized error term. |

`gmat` |
The matrix of gradient terms, G. |
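
As a usage sketch, z-values and two-sided p-values can be recomputed from `coef` and `se`. The list below is a hypothetical stand-in for the returned object, filled with the rounded estimates printed in the example output on this page; it assumes only that `coef` and `se` are numeric vectors in matching order.

```r
# Hypothetical stand-in for the object returned by spprobit(), using the
# rounded values shown in the example output.
fit_like <- list(
  coef = c("(Intercept)" = -2.27695, x = 0.50788, WXB = 0.44670),
  se   = c("(Intercept)" =  0.12206, x = 0.02434, WXB = 0.09812)
)

z <- fit_like$coef / fit_like$se   # z-values
pval <- 2 * pnorm(-abs(z))         # two-sided normal p-values
print(round(cbind(Estimate = fit_like$coef, `Std. Error` = fit_like$se,
                  `z-value` = z, `Pr(>|z|)` = pval), 5))
```

Because the inputs are rounded, the recomputed z-values can differ from the printed ones in the later decimal places.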

Klier, Thomas and Daniel P. McMillen, "Clustering of Auto Supplier Plants in the United States: Generalized Method of Moments Spatial Logit for Large Samples," *Journal of
Business and Economic Statistics* 26 (2008), 460-471.

cparlogit

cparprobit

cparmlogit

gmmlogit

gmmprobit

splogit

spprobitml

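```
library(McSpatial)  # makew() and spprobit(); also loads maptools for readShapePoly()
set.seed(9947)
# Census-tract map shipped with McSpatial, restricted to Chicago tracts outside O'Hare
cmap <- readShapePoly(system.file("maps/CookCensusTracts.shp",
  package="McSpatial"))
cmap <- cmap[cmap$CHICAGO==1 & cmap$CAREA!="O'Hare",]
wmat <- makew(cmap)$wmat   # spatial weight matrix
n <- nrow(wmat)
rho <- 0.4
x <- runif(n,0,10)
# Latent variable: ystar = (I - rho*W)^{-1} (x + u)
ystar <- as.numeric(solve(diag(n) - rho*wmat)%*%(x + rnorm(n,0,2)))
y <- ystar>quantile(ystar,.4)   # observed 0-1 outcome
fit <- spprobit(y~x, wmat=wmat)
```

```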
Loading required package: lattice
Loading required package: locfit
locfit 1.5-9.1 2013-03-22
Loading required package: maptools
Loading required package: sp
Checking rgeos availability: TRUE
Loading required package: quantreg
Loading required package: SparseM
Attaching package: 'SparseM'
The following object is masked from 'package:base':
backsolve
Loading required package: RANN
Warning message:
use rgdal::readOGR or sf::st_read
Loading required package: Matrix
Call:
glm(formula = form, family = binomial(link = "probit"), data = data)
Deviance Residuals:
    Min       1Q   Median       3Q      Max
-2.9120  -0.5265   0.1089   0.5291   2.6652
Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.99691    0.13396  -14.91   <2e-16 ***
x            0.49558    0.02884   17.19   <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 1159.41 on 860 degrees of freedom
Residual deviance: 628.63 on 859 degrees of freedom
AIC: 632.63
Number of Fisher Scoring iterations: 6
STANDARD PROBIT ESTIMATES
LINEARIZED GMM PROBIT ESTIMATES
            Estimate Std. Error   z-value Pr(>|z|)
(Intercept) -2.27695    0.12206 -18.65500    0e+00
x            0.50788    0.02434  20.86551    0e+00
WXB          0.44670    0.09812   4.55247    1e-05
Number of observations = 861
```

