liv: Fitting Linear Models with one Endogenous Regressor using...


Description

Fits linear models with one endogenous regressor and no additional explanatory variables using the latent instrumental variable (LIV) approach presented in Ebbes, P., Wedel, M., Böckenholt, U., and Steerneman, A. G. M. (2005). This statistical technique addresses the endogeneity problem without requiring external instrumental variables. The key assumptions of the model are that the latent variable is discrete with at least two groups with different means, and that the structural error is normally distributed.

Usage

liv(formula, param = NULL, data = NULL)

Arguments

formula

an object of type 'formula': a symbolic description of the model to be fitted. Example: var1 ~ var2, where var1 is a vector containing the dependent variable and var2 is a vector containing the endogenous variable.

param

a vector of initial values for the model parameters, supplied to the optimization algorithm. Every model has eight parameters. The first is the intercept, the second is the coefficient of the endogenous variable, and the next two are the means of the two groups of the latent IV (these must differ, otherwise the model is not identified). The following three parameters belong to the variance-covariance matrix, and the last parameter is the probability of being in group 1. When param is not provided, the initial values are set as follows: the intercept and the coefficient are set to the OLS coefficients, the two group means to mean(P) and mean(P) + sd(P), all elements of the variance-covariance matrix to 1, and probG1 to 0.5.
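The default starting values described above can be sketched in base R. This is an illustration under stated assumptions, not the package's internal code: the simulated data, the element names, and the order of the three variance-covariance entries (v1, v2, v3) are hypothetical, since the order is not specified here.

```r
# Simulated stand-in data (hypothetical; not the package's dataLIV)
set.seed(1)
n   <- 500
Z   <- rbinom(n, 1, 0.5)              # latent two-group indicator
nu  <- rnorm(n)
eps <- 0.5 * nu + rnorm(n)            # correlated with nu, so P is endogenous
P   <- ifelse(Z == 1, 2, 0) + nu
y   <- 3 - 1 * P + eps

# OLS intercept and slope, used as the first two starting values
ols <- coef(lm(y ~ P))

# Eight starting values following the defaults described in the text.
# The names and the order v1, v2, v3 are assumptions made for readability.
start <- c(
  b0     = unname(ols[1]),
  a      = unname(ols[2]),
  mu1    = mean(P),
  mu2    = mean(P) + sd(P),           # differs from mu1 because sd(P) > 0
  v1     = 1,
  v2     = 1,
  v3     = 1,
  probG1 = 0.5
)
print(start)
```

A vector built this way has the same length as the param argument expects; whether liv() accepts a named vector is not stated here, so unname(start) may be the safer thing to pass.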

data

optional data frame or list containing the variables of the model.

Details

Let's consider the model:

Y_t = b0 + a * P_t + eps_t

P_t = pi * Z_t + nu_t

where t = 1, ..., T indexes either time or cross-sectional units, Y_t is the dependent variable, P_t is a k x 1 continuous, endogenous regressor, eps_t is a structural error term with mean zero and E(eps^2) = sigma_eps^2, and a and b0 are model parameters. Z_t is an l x 1 vector of instruments, and nu_t is a random error with mean zero and E(nu^2) = sigma_nu^2. The endogeneity problem arises from the correlation of P and eps through E(eps * nu) = sigma_0^2.

LIV considers Z_t to be a latent, discrete, exogenous variable with an unknown number of groups m and pi is a vector of group means. It is assumed that Z is independent of the error terms eps and nu and that it has at least two groups with different means. The structural and random errors are considered normally distributed with mean zero and variance-covariance matrix Sigma:

Sigma = (sigma_eps^2, sigma_0^2 ; sigma_0^2, sigma_nu^2)

The identification of the model lies in the assumption of the non-normality of P, the discreteness of the unobserved instruments and the existence of at least two groups with different means.

The method has been programmed such that the latent variable has two groups. Ebbes et al. (2005) show in a Monte Carlo experiment that even if the true number of categories of the instrument is larger than two, LIV estimates are approximately consistent. Moreover, overfitting the number of groups/categories reduces the degrees of freedom and leads to efficiency loss. When provided by the user, the initial parameter values for the two group means have to be different, otherwise the model is not identified. For a model with additional explanatory variables a Bayesian approach is needed, since identification issues appear in a frequentist approach. The optimization algorithm used is BFGS.
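To make the two-group model concrete, the following base R sketch simulates data from it and maximizes a two-component bivariate normal mixture likelihood with optim(method = "BFGS"). It is a minimal illustration under stated assumptions, not the code used by liv(): the simulated values, the Cholesky parameterization of Sigma, and the logit parameterization of the group probability are choices made here to keep the optimizer inside the valid parameter space.

```r
# Simulate from Y = b0 + a * P + eps, P = group mean + nu (hypothetical values)
set.seed(42)
n   <- 2000
Z   <- rbinom(n, 1, 0.5)               # unobserved two-group instrument
nu  <- rnorm(n)
eps <- 0.6 * nu + rnorm(n, sd = 0.8)   # E(eps * nu) != 0 creates endogeneity
P   <- ifelse(Z == 1, 3, 0) + nu
y   <- 3 - 1 * P + eps

# Negative log-likelihood of the two-group mixture.
# th = (b0, a, m1, m2, three Cholesky entries of Sigma, logit of probG1)
negloglik <- function(th, y, P) {
  b0 <- th[1]; a <- th[2]; m1 <- th[3]; m2 <- th[4]
  L     <- matrix(c(th[5], th[6], 0, th[7]), 2, 2)  # lower-triangular factor
  Sigma <- L %*% t(L)                               # covariance of (eps, nu)
  A     <- matrix(c(1, 0, a, 1), 2, 2)              # maps (eps, nu) to (Y, P) deviations
  V     <- A %*% Sigma %*% t(A)                     # covariance of (Y, P) within a group
  dt    <- V[1, 1] * V[2, 2] - V[1, 2]^2
  if (!is.finite(dt) || dt <= 0) return(1e10)       # guard against a singular V
  p1 <- plogis(th[8])

  # Bivariate normal log-density for a group with mean m
  logdens <- function(m) {
    d1 <- y - b0 - a * m
    d2 <- P - m
    q  <- (V[2, 2] * d1^2 - 2 * V[1, 2] * d1 * d2 + V[1, 1] * d2^2) / dt
    -log(2 * pi) - 0.5 * log(dt) - 0.5 * q
  }

  l1  <- log(p1) + logdens(m1)
  l2  <- log1p(-p1) + logdens(m2)
  mx  <- pmax(l1, l2)                               # log-sum-exp for stability
  val <- -sum(mx + log(exp(l1 - mx) + exp(l2 - mx)))
  if (is.finite(val)) val else 1e10
}

# Starting values: OLS coefficients, two different group means, identity-like Sigma
ols   <- coef(lm(y ~ P))
start <- c(unname(ols), mean(P), mean(P) + sd(P), 1, 0, 1, 0)

fit <- optim(start, negloglik, y = y, P = P,
             method = "BFGS", control = list(maxit = 500))

# Compare the OLS slope with the slope from the mixture likelihood
print(c(ols_slope = unname(ols[2]), liv_sketch_slope = fit$par[2]))
```

Note that the two starting group means differ, as the text requires; with equal means the two mixture components coincide and the likelihood cannot separate them.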

Value

Returns the optimal values of the parameters as computed by maximum likelihood using the BFGS algorithm.

coefficients

returns the value of the parameters for the intercept and the endogenous regressor as computed with maximum likelihood.

means

returns the value of the parameters for the means of the two categories/groups of the latent instrumental variable.

sigma

returns the variance-covariance matrix Sigma. The main diagonal holds the variance of the structural error and the variance of the endogenous regressor, and the off-diagonal terms are equal to the covariance between the errors.

probG1

returns the probability of being in group one. Since the model assumes that the latent instrumental variable has two groups, 1-probG1 gives the probability of group 2.

value

the value of the log-likelihood function corresponding to the optimal parameters.

convcode

an integer code, the same as the output returned by optimx. 0 indicates successful completion. A possible error code is 1, which indicates that the iteration limit maxit was reached.

hessian

a symmetric matrix giving an estimate of the Hessian at the solution found.
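A Hessian from a maximum likelihood fit is commonly used to approximate standard errors. The sketch below shows the idea on a toy normal model with base R's optim(); it does not call liv(). It assumes the Hessian is that of the negative log-likelihood, so the inverse can be used directly. Whether the hessian returned by liv() follows this sign convention is not stated here and should be checked before applying the same recipe.

```r
# Toy data (hypothetical): estimate the mean and log-sd of a normal sample
set.seed(7)
x <- rnorm(200, mean = 2, sd = 1.5)

# Negative log-likelihood; the sd is parameterized on the log scale
nll <- function(th) -sum(dnorm(x, mean = th[1], sd = exp(th[2]), log = TRUE))

fit <- optim(c(0, 0), nll, method = "BFGS", hessian = TRUE)

# Inverse Hessian of the negative log-likelihood approximates the
# variance-covariance matrix of the estimates
vc <- solve(fit$hessian)
se <- sqrt(diag(vc))
print(cbind(estimate = fit$par, std_error = se))
```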

Author(s)

Implementation of the model by Raluca Gui, based on the paper by Ebbes et al. (2005).

References

Ebbes, P., Wedel, M., Böckenholt, U., and Steerneman, A. G. M. (2005). 'Solving and Testing for Regressor-Error (in)Dependence When no Instrumental Variables are Available: With New Evidence for the Effect of Education on Income'. Quantitative Marketing and Economics, 3:365–392.

Examples

# load the package and the example data
library(REndo)
data(dataLIV)
y <- dataLIV$y
P <- dataLIV$P
# function call without any initial parameter values
l <- liv(y ~ P)
summary(l)
# function call with initial parameter values given by the user
l1 <- liv(y ~ P, param = c(2.9, -0.85, 0, 0.1, 1, 1, 1, 0.5))
summary(l1)

Rgui/REndo_1.0 documentation built on May 9, 2019, 10:03 a.m.