Fits linear models with continuous or discrete endogenous regressors using Gaussian copulas, following the method presented in Park and Gupta (2012). This statistical technique addresses the endogeneity problem without requiring external instrumental variables. An important assumption of the model is that the endogenous variables are NOT normally distributed.
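Because the method relies on the endogenous regressor being non-normal, it can help to inspect that assumption before fitting. The following is a minimal sketch on simulated data (the variables `P_skewed` and `P_normal` are hypothetical and not part of the package); it uses the Shapiro-Wilk test from base R as one possible check, not as the check prescribed by the package.

```r
# Illustrative check of the non-normality assumption on simulated data.
# P_skewed and P_normal are hypothetical examples, not package datasets.
set.seed(42)
n <- 500
P_skewed <- rexp(n, rate = 1)  # right-skewed, clearly non-normal
P_normal <- rnorm(n)           # normal: the case the method is NOT suited for

# Shapiro-Wilk test: a small p-value is evidence against normality.
p_skewed <- shapiro.test(P_skewed)$p.value
p_normal <- shapiro.test(P_normal)$p.value

cat("skewed regressor, p-value:", format.pval(p_skewed), "\n")
cat("normal regressor, p-value:", format.pval(p_normal), "\n")
```

A histogram or a normal Q-Q plot (`qqnorm()`) of the regressor is a reasonable complement to a formal test.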
y
the vector or matrix containing the dependent variable.
X
the data frame or matrix containing the regressors of the model, both exogenous and endogenous. The last column/s should contain the endogenous variable/s.
P
the matrix or vector containing the endogenous variables.
param
the vector of initial values for the parameters of the model to be supplied to the optimization algorithm. The parameters to be estimated are
type
the type of the endogenous regressor/s. It can take two values, "continuous" or "discrete".
method
the method used for estimating the model. It can take two values,
intercept
optional parameter. The model is estimated by default with intercept. If no intercept is desired or the regressors matrix
Park and Gupta (2012) proposed a method that allows for the joint estimation of the endogenous regressor and the error term in the structural equation using copulas. As in the LIV approach, the model parameters are estimated using maximum likelihood. The underlying idea is that, using information contained in the observed data, one selects marginal distributions for the endogenous regressor and the structural error. The copula model then enables the construction of a flexible multivariate joint distribution that allows a wide range of correlations between the two marginals. For the error, epsilon_t, the marginal distribution is assumed to be normal, while the marginal distribution of the endogenous regressor, P_t, is obtained using the Epanechnikov kernel density estimator with the bandwidth equal to
b = 0.9 * T^(-1/5) * min(s,IQR/1.34)
, where T is the number of observations, IQR is the interquartile range and s is the sample standard deviation. Following Sklar's theorem (Sklar, 1959), if H(p) and G(epsilon) are the marginal distribution functions of the endogenous regressor and of the structural error, respectively, then there exists a copula function C such that for all p and epsilon, the joint distribution function is:
F(p, epsilon) = C(H(p), G(epsilon)) = C(U_p, U_epsilon)
where U_p = H(p), U_epsilon = G(epsilon) are uniform (0,1) random variables. Then the joint density function is given by:
f(p, epsilon) = c(U_p, U_epsilon) h(p) g(epsilon)
where
c(U_p, U_epsilon) = d^2 C(U_p, U_epsilon) / (d(U_p) d(U_epsilon))
. Using the Gaussian copula, the joint density function of P_t and epsilon_t is:
f(p_t, \varepsilon_t) = \frac{1}{(1-\rho^{2})^{1/2}} \exp\left[ \frac{-\rho^{2}\left(\Phi^{-1}(U_{p,t})^{2} + \Phi^{-1}(U_{\varepsilon,t})^{2}\right)}{2(1-\rho^{2})} + \frac{\rho\,\Phi^{-1}(U_{p,t})\,\Phi^{-1}(U_{\varepsilon,t})}{1-\rho^{2}} \right] \cdot h(p_t) \cdot g(\varepsilon_t)
Having the joint density function, the model's parameters Θ = {α, β, σ_ε, ρ} are obtained by maximising the log-likelihood function. The maximum likelihood estimation is performed by the "BFGS" algorithm. When there are two endogenous regressors, there is no need for initial parameters, since the method applied by default is the augmented OLS, which can also be specified explicitly with method = 2.
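The building blocks described above can be sketched in base R. This is an illustrative sketch on simulated data, not the package's internal code: the regressor `P` and the helper `gauss_copula_density()` are hypothetical names introduced here. It computes the bandwidth from the formula above, requests an Epanechnikov kernel density estimate, and evaluates the Gaussian copula density term of the joint density.

```r
# Illustrative sketch on simulated data; not the package implementation.
set.seed(1)
n <- 200        # plays the role of T in the bandwidth formula
P <- rexp(n)    # hypothetical non-normal endogenous regressor

# Bandwidth: b = 0.9 * T^(-1/5) * min(s, IQR / 1.34)
b <- 0.9 * n^(-1/5) * min(sd(P), IQR(P) / 1.34)

# Epanechnikov kernel density estimate of h(p).
# Note: stats::density() interprets `bw` as the standard deviation of the
# kernel, which may differ in scale from the convention used in the paper.
h_hat <- density(P, bw = b, kernel = "epanechnikov")

# Gaussian copula density c(U_p, U_epsilon) for correlation rho
# (hypothetical helper, written directly from the formula above).
gauss_copula_density <- function(u_p, u_eps, rho) {
  x <- qnorm(u_p)     # Phi^(-1)(U_p)
  y <- qnorm(u_eps)   # Phi^(-1)(U_epsilon)
  exp(-rho^2 * (x^2 + y^2) / (2 * (1 - rho^2)) +
        rho * x * y / (1 - rho^2)) / sqrt(1 - rho^2)
}

# With rho = 0 the two margins are independent and the copula density is 1.
gauss_copula_density(0.3, 0.7, rho = 0)
```

The bandwidth expression coincides with the rule of thumb that base R exposes as `bw.nrd0()`, so `bw.nrd0(P)` can serve as a quick cross-check of `b`.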
Depending on the method and the type of the variables, the function returns the optimal values of the parameters and, in the case of the second method, their standard errors.
With one endogenous variable, if the maximum likelihood approach is chosen, the standard errors can be computed by bootstrapping using the boots function from the same package.
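For readers unfamiliar with bootstrapped standard errors, the general idea can be sketched as follows. This is not the implementation of boots: it is a generic nonparametric bootstrap on simulated data, with ordinary least squares (`lm()`) standing in for the copula estimator, and the names `dat`, `estimate` and `B` are assumptions made for this illustration.

```r
# Generic nonparametric bootstrap of standard errors (illustrative only).
# lm() is a stand-in estimator; it is NOT the copula estimator.
set.seed(123)
n <- 200
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n)
dat <- data.frame(y = y, x = x)

# Hypothetical estimator: returns the coefficient vector for a data set.
estimate <- function(d) coef(lm(y ~ x, data = d))

B <- 200  # number of bootstrap replications
boot_est <- t(replicate(B, {
  idx <- sample(nrow(dat), replace = TRUE)  # resample rows with replacement
  estimate(dat[idx, ])
}))

# Bootstrap standard error: standard deviation of each coefficient
# across the B replications.
boot_se <- apply(boot_est, 2, sd)
boot_se
```

The same pattern applies to any estimator that maps a data set to a parameter vector; more replications give more stable standard errors at a higher computational cost.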
The model was implemented by Raluca Gui, based on the paper by Park and Gupta (2012).
Park, S. and Gupta, S. (2012), 'Handling Endogenous Regressors by Joint Estimation Using Copulas', Marketing Science, 31(4), 567-586.
# load dataset dataCopC1, where P is endogenous, continuous and not normally distributed
data(dataCopC1)
y <- dataCopC1[,1]
X <- dataCopC1[,2:5]
P <- dataCopC1[,5]
c1 <- copulaEndo(y, X, P, type = "continuous", method = 1, intercept="no")
c1
# to obtain the standard errors use the boots() function
# se.c1 <- boots(10, y, X, P, param = c(1,1,-2,-0.5,0.2,1), intercept= "no")
# an alternative model can be obtained using "method = 2".
c12 <- copulaEndo(y, X, P, type = "continuous", method = 2)
c12
# load dataset with 2 continuous, non-normally distributed endogenous regressors.
# with 2 endogenous regressors the default method is the augmented OLS.
#data(dataCopC2)
#y <- dataCopC2[,1]
#X <- dataCopC2[,2:6]
#P <- dataCopC2[,5:6]
#c2 <- copulaEndo(y, X, P, type = "continuous")
#summary(c2)
# load dataset with 1 discrete endogenous variable.
# having more than 1 discrete endogenous regressor is also possible
data(dataCopDis)
y <- dataCopDis[,1]
X <- dataCopDis[,2:5]
P <- dataCopDis[,5]
c3 <- copulaEndo(y, X, P, type = "discrete", intercept=FALSE)
c3