optrdd: Optimized regression discontinuity design
In optrdd: Optimized Regression Discontinuity Designs



Optimized estimation and inference of treatment effects identified via regression discontinuities.

optrdd(X, Y = NULL, W, max.second.derivative, estimation.point = NULL,
  sigma.sq = NULL, alpha = 0.95, lambda.mult = 1, bin.width = NULL,
  num.bucket = NULL, use.homoskedatic.variance = FALSE, use.spline = TRUE,
  spline.df = NULL, try.elnet.for.sigma.sq = FALSE, optimizer = c("auto",
  "mosek", "ECOS", "quadprog", "SCS"), verbose = TRUE)

`X`	The running variables.
`Y`	The outcomes. If null, only optimal weights are computed.
`W`	Treatment assignments, typically of the form 1(X >= c).
`max.second.derivative`	A bound on the second derivative of mu_w(x) = E[Y(w) | X = x].
`estimation.point`	Point "c" at which CATE is to be estimated. If estimation.point = NULL, we estimate a weighted CATE, with weights chosen to minimize MSE, as in Section 4.1 of Imbens and Wager (2017).
`sigma.sq`	The irreducible noise level. If NULL, it is estimated from the data.
`alpha`	Coverage probability of confidence intervals.
`lambda.mult`	Optional multiplier that can be used to over- or under-penalize variance.
`bin.width`	Bin width for discrete approximation.
`num.bucket`	Number of bins for discrete approximation. Can only be used if bin.width = NULL.
`use.homoskedatic.variance`	Whether confidence intervals should be built assuming homoskedasticity.
`use.spline`	Whether non-parametric components should be modeled as quadratic splines. This reduces the number of optimization parameters and may improve computational performance.
`spline.df`	Number of degrees of freedom (per running variable) used for spline computation.
`try.elnet.for.sigma.sq`	Whether an elastic net on a spline basis should be used for estimating sigma^2.
`optimizer`	Which optimizer to use. Mosek is a commercial solver that must be installed separately; free academic licenses are available. ECOS is an open-source interior-point solver for conic problems, made available via the CVXR wrapper. Quadprog is the default R solver; it may be slow on large problems, but is very accurate on small problems. SCS is an open-source "operator splitting" solver that implements a first-order method for solving very large cone programs to modest accuracy. The speed of SCS may be helpful for prototyping; however, the results may be noticeably less accurate. SCS is also accessed via the CVXR wrapper. The option "auto" uses a heuristic to choose among these solvers.
`verbose`	Whether the optimizer should print progress information.
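
As a reading aid, the quantities referred to by `max.second.derivative` and `estimation.point` can be written out for a one-dimensional running variable. The symbols B (the value passed as `max.second.derivative`) and tau(c) are shorthand introduced here for illustration; they do not appear in the argument list above.

```latex
\mu_w(x) = \mathbb{E}\left[ Y(w) \mid X = x \right], \qquad w \in \{0, 1\},
\qquad
\left| \mu_w''(x) \right| \le B \quad \text{for all } x,
\qquad
\tau(c) = \mu_1(c) - \mu_0(c).
```

With `estimation.point = c`, the target is tau(c); with `estimation.point = NULL`, the target is instead a weighted average of such effects, as described in the `estimation.point` entry.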

A trained optrdd object.

Domahidi, A., Chu, E., & Boyd, S. (2013, July). ECOS: An SOCP solver for embedded systems. In Control Conference (ECC), 2013 European (pp. 3071-3076). IEEE.

Imbens, G., & Wager, S. (2017). Optimized Regression Discontinuity Designs. arXiv preprint arXiv:1705.01677.

O’Donoghue, B., Chu, E., Parikh, N., & Boyd, S. (2016). Conic optimization via operator splitting and homogeneous self-dual embedding. Journal of Optimization Theory and Applications, 169(3), 1042-1068.

library(optrdd)

# Simple regression discontinuity with discrete X
n = 4000; threshold = 0
X = sample(seq(-4, 4, by = 8/41.5), n, replace = TRUE)
W = as.numeric(X >= threshold)
Y = 0.4 * W + 1 / (1 + exp(2 * X)) + 0.2 * rnorm(n)
# using 0.4 for max.second.derivative would have been enough
out.1 = optrdd(X=X, Y=Y, W=W, max.second.derivative = 0.5, estimation.point = threshold)
print(out.1); plot(out.1, xlim = c(-1.5, 1.5))
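
The comment in the example above says that 0.4 would have been a sufficient value for `max.second.derivative`. The base-R sketch below checks that claim numerically for the regression function used in the simulation, 1 / (1 + exp(2 * x)). The helper names `mu`, `mu.second`, `x.grid`, `max.curvature` and `fd.error` are introduced here for illustration and are not part of the package.

```r
# Regression function used in the simulated outcome model
mu = function(x) 1 / (1 + exp(2 * x))

# Closed-form second derivative: with s = mu(x),
# mu'(x) = -2 s (1 - s) and mu''(x) = 4 s (1 - s) (1 - 2 s)
mu.second = function(x) {
  s = mu(x)
  4 * s * (1 - s) * (1 - 2 * s)
}

# Largest absolute curvature over the range of the running variable
x.grid = seq(-4, 4, length.out = 8001)
max.curvature = max(abs(mu.second(x.grid)))

# Cross-check the closed form against a central finite difference
h = 1e-3
fd = (mu(x.grid + h) - 2 * mu(x.grid) + mu(x.grid - h)) / h^2
fd.error = max(abs(fd - mu.second(x.grid)))

print(max.curvature)
print(fd.error)
```

The maximum comes out at roughly 0.385, which is below 0.4, so the value 0.5 used in the call leaves some slack.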

# Now, treatment is instead allocated in a neighborhood of 0
thresh.low = -1; thresh.high = 1
W = as.numeric(thresh.low <= X & X <= thresh.high)
Y = 0.2 * (1 + X) * W + 1 / (1 + exp(2 * X)) + rnorm(n)
# This estimates CATE at specifically chosen points
out.2 = optrdd(X=X, Y=Y, W=W, max.second.derivative = 0.5, estimation.point = thresh.low)
print(out.2); plot(out.2, xlim = c(-2.5, 2.5))
out.3 = optrdd(X=X, Y=Y, W=W, max.second.derivative = 0.5, estimation.point = thresh.high)
print(out.3); plot(out.3, xlim = c(-2.5, 2.5))
# This estimates a weighted CATE, with lower variance
out.4 = optrdd(X=X, Y=Y, W=W, max.second.derivative = 0.5)
print(out.4); plot(out.4, xlim = c(-2.5, 2.5))
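
For reference when reading the output of `out.2` and `out.3`, the true effects implied by the simulated outcome model can be computed directly: the treatment term is 0.2 * (1 + X) * W, so the effect at a point x is 0.2 * (1 + x). The helper `tau.true` below is a hypothetical name used only for this check.

```r
# True treatment effect implied by Y = 0.2 * (1 + X) * W + ...
tau.true = function(x) 0.2 * (1 + x)

thresh.low = -1; thresh.high = 1
tau.low = tau.true(thresh.low)    # target of out.2
tau.high = tau.true(thresh.high)  # target of out.3

print(c(low = tau.low, high = tau.high))
```

The true effect is 0 at the lower threshold and 0.4 at the upper one, so the two point estimates should differ, and the weighted estimate from `out.4` targets a weighted combination of effects at the boundaries rather than either value alone.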

## Not run: 
# RDD with multivariate running variable.
X = matrix(runif(n*2, -1, 1), n, 2)
W = as.numeric(X[,1] < 0 | X[,2] < 0)
Y = X[,1]^2/3 + W * (1 + X[,2]) + rnorm(n)
out.5 = optrdd(X=X, Y=Y, W=W, max.second.derivative = 1)
print(out.5); plot(out.5)
out.6 = optrdd(X=X, Y=Y, W=W, max.second.derivative = 1, estimation.point = c(0, 0.5))
print(out.6); plot(out.6)
## End(Not run)
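
In the multivariate example, the treatment term is W * (1 + X[,2]), so the true effect at a point depends only on its second coordinate. The sketch below, using the hypothetical helpers `assign.w` and `tau.true.2d`, confirms that the chosen `estimation.point = c(0, 0.5)` lies on the treatment boundary and computes the true effect there.

```r
# Treatment rule from the multivariate example, applied to a single point
assign.w = function(x) as.numeric(x[1] < 0 | x[2] < 0)

# True effect implied by Y = ... + W * (1 + X[,2]) + ...
tau.true.2d = function(x) 1 + x[2]

point = c(0, 0.5)
eps = 1e-6

# Points just to the left and right of the estimation point in the first coordinate
w.left = assign.w(point - c(eps, 0))   # treated side
w.right = assign.w(point + c(eps, 0))  # control side
tau.point = tau.true.2d(point)

print(c(w.left = w.left, w.right = w.right, tau = tau.point))
```

Treatment status flips across the point, and the true effect there is 1.5, which is the quantity `out.6` is trying to recover.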