In ChenyueLiu/ivqr: Perform inverse quantile regression by solving mixed integer linear programming

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width=9,
  fig.height=3.6
)
require(ivqr)

Usage Demonstration

Data and Background

We will use a subsample of r nrow(angrist1993) observations from the data used in Angrist(1993). The original data soviii_ang93b.tab can be found here. We adopt the same treatment as did Angrist1993_Table2.do, which can be found here. And we further drop the observations containing na.

The data is included with ivqr and has the following structure, lnearn is the logarithm of the respondent's earning in 1986, vietnam, age, yrsrv, drafted, officer, curmar, nonwhite are control variables capturing the personal characteristics of the veteran; eddif is the grade increment since entering the military; anyva stands for whether VA education benefits were used by the respondent, which serves as an instrument for grade increment in Angrist(1993)

library(ivqr)
knitr::kable(head(angrist1993, n = 10))

We generate the subsample using the command angrist by interacting the instrument variable anyva, and endogenous variable eddif with control variables.

n=100
pD=3
angrist_data=angrist(n,pD)
sim_data<-data.frame(angrist_data$Y,angrist_data$D,angrist_data$Z)
colnames(sim_data)<-c("Y",paste0("D",1:pD),paste0("Z",1:pD))
knitr::kable(head(sim_data))

Alternatively, we can also use chen_lee to generated sample data following the simulation design by Chen and Lee(2018).

n=100
pD=3
chen_lee_data=chen_lee(n,pD)
sim_data<-data.frame(chen_lee_data$Y,chen_lee_data$D,chen_lee_data$Z)
colnames(sim_data)<-c("Y",paste0("D",1:pD),paste0("Z",1:pD))
knitr::kable(head(sim_data))

Syntax and Output Overview

Plotting $\beta(\tau$) vs $\tau$.

gen_result in the ivqr package will generate the graph for every B_D, with each graph plotting $\beta(\tau$) vs $\tau$.

angrist_data=angrist(n=100,pD=3)
#send a list containing Y, X, Z, D to gen_result()
data=list()
data$Y=angrist_data$Y
data$Z=angrist_data$Z
data$D=angrist_data$D
start_clock = Sys.time()  
#change this line to change the number of controls
#1:1 the first col; 1:7 all the 7 cols
#col 1: vietnam 0-1, col 2: age continuous, col 3: yrsrv continuous, col 4: drafted 0-1, 
#col 5: officer 0-1, col 6: curmar 0-1, col 7: nonwhite 0-1
data$X=angrist_data$X[,1:3,drop=FALSE]
reg_out=gen_result(data,"gurobi")
knitr::kable(head(reg_out))
print(paste0("time it takes ",difftime(Sys.time(),start_clock,units="mins")))

Here's what these requried parameters refer to:

n=50 is the number of observations in the dataset
pd=3 is the number of endogenous variables in the dataset

Confidence Intervals

ivqr_fit in the ivqr package will return the ivqr point estimation. Note: when homoscedastic=TRUE, the code fails to converge to a valid point estimation for a constant exogenous variable

n = 50
Z1 = rnorm(n)
Z2 = rnorm(n)
Z3 = rnorm(n)
Sigma = matrix(c(1, 0.4, 0.6, -0.2, 0.4, 1, 0, 0, 0.6, 0, 1, 0, -0.2, 0, 0, 1),
               4,
               4,
               byrow = TRUE)
EpsV1V2V3 = MASS::mvrnorm(n, mu = c(0, 0, 0, 0), Sigma = 0.25 * Sigma)
D1D2D3 = matrix(NA, n, 3)
for (i in 1:n) {
  D1D2D3[i, 1] = pnorm(Z1[i] + EpsV1V2V3[i, 2])
  D1D2D3[i, 2] = 2 * pnorm(Z2[i] + EpsV1V2V3[i, 3])
  D1D2D3[i, 3] = 1.5 * pnorm(Z3[i] + EpsV1V2V3[i, 4])
}
Y = 1 + D1D2D3[, 1] + D1D2D3[, 2] +  D1D2D3[, 3] + (0.5 + D1D2D3[, 1] + 0.25 *
                                                      D1D2D3[, 2] +  0.15 * D1D2D3[, 3]) * EpsV1V2V3[, 1]
#X = cbind(matrix(rnorm(3*n), n, 3),rep(1,n)) #matrix(1, n, 1) #
X = matrix(rnorm(3*n), n, 3)
Z = cbind(Z1, Z2, Z3)
D = D1D2D3
#Z = cbind(Z1, Z2)
#D = D1D2D3[,1:2]

#summary(rq(Y ~ D + X - 1))$coef[j, 2:3, drop = FALSE]
#save(list=c("Y","D","X","Z"),file=("quick_converge.Rdata"))
# fit_FALSE_G<-ivqr_fit (
#   Y,
#   D,
#   X,
#   Z,
#   tau = 0.5,
#   M = 10,
#   homoskedastic = FALSE,
#   kernel = "Gaussian",#"Powell"
#   bdwt = NULL,
#   scale_bdwt = 1
# )
# #
# fit_FALSE_P<-ivqr_fit (
#   Y,
#   D,
#   X,
#   Z,
#   tau = 0.5,
#   M = 10,
#   homoskedastic = FALSE,
#   kernel = "Powell",
#   bdwt = NULL,
#   scale_bdwt = 1
# )

fit_TRUE<-ivqr::ivqr_fit (
  Y,
  D,
  X,
  Z,
  tau = 0.5,
  M = 10,
  homoskedastic = TRUE,
  kernel = "Gaussian",#"Powell"
  bdwt = NULL,
  scale_bdwt = 1,
  lpsolver="gurobi"
)
print(fit_TRUE)

summary(fit_TRUE)

Here's what these requried parameters refer to:

TBA

ChenyueLiu/ivqr documentation built on Aug. 9, 2020, 7:49 p.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

ChenyueLiu/ivqr
Perform inverse quantile regression by solving mixed integer linear programming

In ChenyueLiu/ivqr: Perform inverse quantile regression by solving mixed integer linear programming

Usage Demonstration

Data and Background

Syntax and Output Overview

Plotting $\beta(\tau$) vs $\tau$.

Confidence Intervals

R Package Documentation

Browse R Packages

We want your feedback!

ChenyueLiu/ivqr Perform inverse quantile regression by solving mixed integer linear programming

In ChenyueLiu/ivqr: Perform inverse quantile regression by solving mixed integer linear programming

Usage Demonstration

Data and Background

Syntax and Output Overview

Plotting $\beta(\tau$) vs $\tau$.

Confidence Intervals

R Package Documentation

Browse R Packages

We want your feedback!

ChenyueLiu/ivqr
Perform inverse quantile regression by solving mixed integer linear programming