# probit_linear_latent: Recursive Probit-Linear Model with Latent First Stage

In endogeneity: Recursive Two-Stage Models to Address Endogeneity


## Recursive Probit-Linear Model with Latent First Stage

### Description

Latent version of the Probit-Linear Model.

First stage (Probit, m_i^* is unobserved):

m_i^*=1(\boldsymbol{\alpha}'\mathbf{w_i}+u_i>0)

Second stage (Linear):

y_i = \boldsymbol{\beta}'\mathbf{x_i} + {\gamma}m_i^* + \sigma v_i

Endogeneity structure: u_i and v_i are bivariate normally distributed with a correlation of \rho.
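
Because m_i^* is never observed, each observation's likelihood contribution has to sum over its two possible values. The following is a sketch of that mixture density, derived directly from the two stage equations and the bivariate normality of (u_i, v_i). The shorthand v_i(m) is introduced here for illustration, \phi and \Phi denote the standard normal density and distribution function, and the package's internal parameterization may differ.

```latex
\begin{aligned}
v_i(m) &= \frac{y_i - \boldsymbol{\beta}'\mathbf{x_i} - \gamma m}{\sigma}, \qquad m \in \{0, 1\}, \\
f(y_i) &= \frac{1}{\sigma}\,\phi\big(v_i(1)\big)\,
          \Phi\!\left(\frac{\boldsymbol{\alpha}'\mathbf{w_i} + \rho\, v_i(1)}{\sqrt{1-\rho^2}}\right)
        + \frac{1}{\sigma}\,\phi\big(v_i(0)\big)\,
          \Phi\!\left(-\frac{\boldsymbol{\alpha}'\mathbf{w_i} + \rho\, v_i(0)}{\sqrt{1-\rho^2}}\right)
\end{aligned}
```

The first term corresponds to m_i^* = 1 and the second to m_i^* = 0; each uses the fact that u_i given v_i is normal with mean \rho v_i and variance 1-\rho^2.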

w and x can be the same set of variables. Identification of this model is generally weak, especially when the variables in w are poor predictors of m. \gamma is assumed to be positive so that the model estimates are unique.
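
One way to see why a sign restriction on \gamma is needed: relabeling m_i^* as 1 - m_i^* produces a second parameter vector with exactly the same likelihood. The sketch below checks this numerically. `loglik_latent` is a hypothetical helper written for this illustration (it is not a function of the endogeneity package), and it assumes the mixture form of the likelihood implied by the two stage equations.

```r
# Hypothetical helper (not part of the endogeneity package): log-likelihood of y
# when the binary first-stage outcome m* is unobserved.
loglik_latent <- function(alpha, beta, gamma, sigma, rho, W, X, y) {
  aw <- as.vector(W %*% alpha)      # first-stage index alpha'w_i
  xb <- as.vector(X %*% beta)       # second-stage index beta'x_i
  s  <- sqrt(1 - rho^2)
  v1 <- (y - xb - gamma) / sigma    # standardized residual if m* = 1
  v0 <- (y - xb) / sigma            # standardized residual if m* = 0
  lik <- dnorm(v1) / sigma * pnorm((aw + rho * v1) / s) +
         dnorm(v0) / sigma * pnorm(-(aw + rho * v0) / s)
  sum(log(lik))
}

set.seed(1)
n <- 200
W <- cbind(1, rnorm(n))             # intercept plus one covariate
X <- W                              # w and x may be the same variables
y <- rnorm(n)                       # arbitrary data; the identity holds for any y

# Arbitrary illustrative parameter values
alpha <- c(0.3, 0.8); beta <- c(1, 0.5); gamma <- 0.7; sigma <- 1.2; rho <- -0.4

ll_original <- loglik_latent(alpha, beta, gamma, sigma, rho, W, X, y)

# Relabel m* as 1 - m*: flip the signs of alpha, gamma and rho,
# and absorb gamma into the second-stage intercept.
ll_swapped <- loglik_latent(-alpha, beta + c(gamma, 0), -gamma, sigma, -rho, W, X, y)

all.equal(ll_original, ll_swapped)
```

Since the two parameter vectors differ only in the sign of \gamma (and the matching changes to the other parameters), fixing \gamma > 0 picks one of them.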

### Usage

probit_linear_latent(
  form_probit,
  form_linear,
  data = NULL,
  EM = TRUE,
  par = NULL,
  method = "BFGS",
  verbose = 0,
  maxIter = 500,
  tol = 1e-06,
  tol_LL = 1e-08
)


### Arguments

• form_probit: Formula for the first-stage probit model, in which the dependent variable is latent.

• form_linear: Formula for the second-stage linear model. The latent dependent variable of the first stage is automatically added as a regressor in this model.

• data: Input data, a data frame.

• EM: Whether to maximize the likelihood using the Expectation-Maximization (EM) algorithm, which is slower but more robust. Defaults to TRUE.

• par: Starting values for estimates.

• method: Optimization algorithm. Default is BFGS.

• verbose: An integer indicating how much output to display during the estimation process.

    <0 - No output
    0 - Basic output (model estimates)
    1 - Moderate output: basic output + parameter and likelihood in each iteration
    2 - Extensive output: moderate output + gradient values on each call

• maxIter: Maximum number of iterations for the EM algorithm.

• tol: Tolerance for convergence of the EM algorithm.

• tol_LL: Tolerance for convergence of the likelihood.
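
As an illustration of how these arguments fit together, the sketch below passes a data frame explicitly, switches off the EM algorithm, and suppresses output. The simulated data and the particular option values are assumptions chosen for this example, and the model is only fitted if the endogeneity package is installed.

```r
# Simulate a small data set (illustrative values only)
set.seed(1)
N <- 500
x <- rbinom(N, 1, 0.5)
z <- rnorm(N)
u <- rnorm(N)
v <- -0.5 * u + sqrt(1 - 0.5^2) * rnorm(N)  # corr(u, v) = -0.5
m <- as.numeric(1 + x + z + u > 0)          # used to build y, then treated as unobserved
y <- 1 + x + z + m + v
dat <- data.frame(y = y, x = x, z = z)      # m is deliberately left out

# Fit only when the package is available
if (requireNamespace("endogeneity", quietly = TRUE)) {
  fit <- endogeneity::probit_linear_latent(
    form_probit = ~ x + z,     # one-sided: the dependent variable is latent
    form_linear = y ~ x + z,   # the latent m* is added as a regressor automatically
    data    = dat,
    EM      = FALSE,           # direct likelihood maximization instead of EM
    method  = "BFGS",
    verbose = -1               # negative value: no output
  )
  print(fit$estimates, digits = 3)
}
```

Leaving EM at its default of TRUE is the more robust choice according to the description above; EM = FALSE is used here only to show the alternative.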

### Value

A list containing the results of the estimated model, some of which are inherited from the return value of maxLik.

• estimates: Model estimates with 95% confidence intervals

• estimate or par: Point estimates

• variance_type: covariance matrix used to calculate standard errors. Either BHHH or Hessian.

• var: covariance matrix

• se: standard errors

• hessian: Hessian matrix at maximum

• gtHg: g'H^-1g, where H^-1 is simply the covariance matrix. A value close to zero (e.g., <1e-3 or 1e-6) indicates good convergence.

• LL or maximum: Log-likelihood at the maximum

• AIC: Akaike information criterion

• BIC: Bayesian information criterion

• n_obs: Number of observations

• n_par: Number of parameters

• iter: number of iterations taken to converge

• message: Message regarding convergence status.

Note that the list inherits all the components in the output of maxLik. See the documentation of maxLik for more details.
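
A minimal sketch of how the returned components might be inspected after estimation. `check_fit` is a hypothetical helper, not part of the package, and the mock object holds made-up numbers that stand in for a real fit. The AIC and BIC recomputations assume the conventional definitions (2k - 2LL and k log(n) - 2LL); whether the package's AIC and BIC fields match them exactly is an assumption.

```r
# Hypothetical helper (not part of the package): basic checks on a result list
# that carries the documented components gtHg, LL, n_obs, n_par, estimate, se.
check_fit <- function(est, gtHg_tol = 1e-3) {
  list(
    converged_by_gtHg = est$gtHg < gtHg_tol,                  # g'H^-1 g close to zero
    aic_recomputed    = 2 * est$n_par - 2 * est$LL,           # conventional AIC
    bic_recomputed    = log(est$n_obs) * est$n_par - 2 * est$LL,  # conventional BIC
    z_values          = est$estimate / est$se                 # estimate / standard error
  )
}

# Mock object with made-up numbers, standing in for a fitted model
mock <- list(
  estimate = c(a = 1.0, b = -0.5),
  se       = c(a = 0.25, b = 0.25),
  gtHg     = 2e-7,
  LL       = -100,
  n_obs    = 200,
  n_par    = 2
)

res <- check_fit(mock)
str(res)
```

With a real fit, `check_fit(est_latent)` would be called in the same way, assuming the object exposes the components listed above.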

### References

Peng, Jing. (2023) Identification of Causal Mechanisms from Randomized Experiments: A Framework for Endogenous Mediation Analysis. Information Systems Research, 34(1):67-84. Available at https://doi.org/10.1287/isre.2022.1113

Other endogeneity: bilinear(), biprobit_latent(), biprobit_partial(), biprobit(), linear_probit(), pln_linear(), pln_probit(), probit_linearRE(), probit_linear_partial(), probit_linear()

### Examples


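library(MASS)         # for mvrnorm()
library(endogeneity)  # for probit_linear() and probit_linear_latent()
N = 2000
rho = -0.5
set.seed(1)

# Exogenous covariates
x = rbinom(N, 1, 0.5)
z = rnorm(N)

# Correlated errors: e1 (first stage) and e2 (second stage) with correlation rho
e = mvrnorm(N, mu=c(0,0), Sigma=matrix(c(1,rho,rho,1), nrow=2))
e1 = e[,1]
e2 = e[,2]

# First stage (probit) and second stage (linear)
m = as.numeric(1 + x + z + e1 > 0)
y = 1 + x + z + m + e2

# Benchmark: m is observed
est = probit_linear(m~x+z, y~x+z+m)
print(est$estimates, digits=3)

# Latent version: m is treated as unobserved, so the probit formula has no left-hand side
est_latent = probit_linear_latent(~x+z, y~x+z)
print(est_latent$estimates, digits=3)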



endogeneity documentation built on Aug. 21, 2023, 9:11 a.m.