optrdd: Optimized regression discontinuity design


optrdd R Documentation

Optimized regression discontinuity design

Description

Optimized estimation and inference of treatment effects identified via regression discontinuities.

Usage

optrdd(X, Y = NULL, W, max.second.derivative, estimation.point = NULL,
  sigma.sq = NULL, alpha = 0.95, lambda.mult = 1, bin.width = NULL,
  num.bucket = NULL, use.homoskedatic.variance = FALSE, use.spline = TRUE,
  spline.df = NULL, try.elnet.for.sigma.sq = FALSE, optimizer = c("auto",
  "mosek", "ECOS", "quadprog", "SCS"), verbose = TRUE)

Arguments

X

The running variables.

Y

The outcomes. If NULL, only the optimal weights are computed.

W

Treatment assignments, typically of the form 1(X >= c).

max.second.derivative

A bound on the second derivative of mu_w(x) = E[Y(w) | X = x].
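
When the conditional mean is known, as in a simulation, a candidate bound can be checked numerically. The sketch below is illustrative only: it uses the regression function 1 / (1 + exp(2 * x)) from the examples on this page, and second.deriv is a hypothetical helper, not part of optrdd.

```r
# Conditional mean used in the simulated examples on this page
mu = function(x) 1 / (1 + exp(2 * x))

# Hypothetical helper: central finite-difference approximation to f''(x)
second.deriv = function(f, x, h = 1e-3) {
  (f(x + h) - 2 * f(x) + f(x - h)) / h^2
}

# Largest absolute curvature over the range of X used in the examples
grid = seq(-4, 4, length.out = 8001)
max.curvature = max(abs(second.deriv(mu, grid)))
print(max.curvature)
```

The result is below 0.4, consistent with the comment in the examples that 0.4 would have been enough there. In applications the true curvature is not observed, so the bound remains a modeling choice.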

estimation.point

Point "c" at which CATE is to be estimated. If estimation.point = NULL, we estimate a weighted CATE, with weights chosen to minimize MSE, as in Section 4.1 of Imbens and Wager (2017).

sigma.sq

The irreducible noise level. If NULL, it is estimated from the data.
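
One simple way to obtain a rough noise estimate is the residual variance from a linear regression of Y on X interacted with W. The sketch below assumes that approach purely for illustration; whether optrdd uses exactly this estimator internally is not stated on this page. The simulated data loosely mirror the first example below, with a continuous X for simplicity.

```r
set.seed(1)
n = 4000
X = runif(n, -4, 4)
W = as.numeric(X >= 0)
Y = 0.4 * W + 1 / (1 + exp(2 * X)) + 0.2 * rnorm(n)

# Residual variance from a linear fit with a separate slope on each side
fit = lm(Y ~ X * W)
sigma.sq.hat = summary(fit)$sigma^2
print(sigma.sq.hat)
```

Because a linear fit cannot capture the curvature of the true mean, an estimate of this kind tends to sit somewhat above the true noise variance, which is 0.2^2 = 0.04 in this simulation.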

alpha

Coverage probability of confidence intervals.
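
With a worst-case bias bound in hand, a confidence interval is often taken to be the point estimate plus or minus a half-width that accounts for both bias and sampling noise. The following is a minimal sketch of that idea, not a copy of the package internals; bias.aware.halfwidth, max.bias, and se are names assumed here for illustration.

```r
# Hypothetical helper: half-width c such that an estimator with standard
# error se and absolute bias at most max.bias lies within c of the truth
# with probability alpha, under a normal approximation.
bias.aware.halfwidth = function(max.bias, se, alpha = 0.95) {
  rel.bias = max.bias / se
  # Coverage of [-z, z] for a unit-variance normal centered at rel.bias,
  # minus the target coverage
  coverage.gap = function(z) pnorm(z - rel.bias) - pnorm(-z - rel.bias) - alpha
  z = uniroot(coverage.gap, lower = 0, upper = rel.bias + 10, tol = 1e-10)$root
  se * z
}

print(bias.aware.halfwidth(max.bias = 0, se = 1))    # no bias: usual two-sided normal quantile
print(bias.aware.halfwidth(max.bias = 0.5, se = 1))  # a nonzero bias bound widens the interval
```

With alpha = 0.95 and no bias, this reduces to the familiar qnorm(0.975) multiple of the standard error.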

lambda.mult

Optional multiplier that can be used to over- or under-penalize variance.

bin.width

Bin width for discrete approximation.

num.bucket

Number of bins for discrete approximation. Can only be used if bin.width = NULL.

use.homoskedatic.variance

Whether confidence intervals should be built assuming homoskedasticity.

use.spline

Whether non-parametric components should be modeled as quadratic splines in order to reduce the number of optimization parameters and potentially improve computational performance.

spline.df

Number of degrees of freedom (per running variable) used for spline computation.
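
To give a sense of what a quadratic spline basis with a given number of degrees of freedom looks like, the sketch below builds one with splines::bs and degree = 2. This is an assumption made for illustration; the basis that optrdd constructs internally may differ. The names x.grid and basis are chosen here and do not come from the package.

```r
library(splines)

# Grid over a single running variable
x.grid = seq(-1, 1, length.out = 200)
spline.df = 10

# Quadratic B-spline basis: one column per degree of freedom
basis = bs(x.grid, df = spline.df, degree = 2)
print(dim(basis))

# A smooth function on the grid is then described by spline.df
# coefficients (plus an intercept) instead of one value per grid point
target = 1 / (1 + exp(2 * x.grid))
fit = lm(target ~ basis)
print(max(abs(residuals(fit))))
```

The appeal of such a basis is that the optimization works with a handful of coefficients per running variable rather than one parameter per grid cell.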

try.elnet.for.sigma.sq

Whether an elastic net on a spline basis should be used for estimating sigma^2.

optimizer

Which optimizer to use. Mosek is a commercial solver that must be installed separately; free academic licenses are available. ECOS is an open-source interior-point solver for conic problems, made available via the CVXR wrapper. Quadprog is the default R solver; it may be slow on large problems, but is very accurate on small problems. SCS is an open-source "operator splitting" solver that implements a first-order method for solving very large cone programs to modest accuracy. The speed of SCS may be helpful for prototyping; however, the results may be noticeably less accurate. SCS is also accessed via the CVXR wrapper. The option "auto" uses a heuristic to choose among these solvers.
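
Before fixing a solver by hand, it can help to check which solver packages are installed. The sketch below does this with requireNamespace; pick.optimizer is a hypothetical helper, and its preference order is an assumption made for illustration, not necessarily the heuristic that optimizer = "auto" applies.

```r
# Hypothetical helper: suggest a value for the optimizer argument based on
# which solver packages can be loaded. Returns NA if none is available.
pick.optimizer = function(univariate = TRUE) {
  has = function(pkg) requireNamespace(pkg, quietly = TRUE)
  if (has("Rmosek")) return("mosek")
  if (univariate && has("quadprog")) return("quadprog")
  if (has("CVXR")) return("ECOS")
  NA_character_
}

choice = pick.optimizer(univariate = TRUE)
print(choice)
```

A non-missing result could then be passed on, for example as optrdd(..., optimizer = choice).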

verbose

Whether the optimizer should print progress information.

Value

A trained optrdd object.

References

Domahidi, A., Chu, E., & Boyd, S. (2013, July). ECOS: An SOCP solver for embedded systems. In Control Conference (ECC), 2013 European (pp. 3071-3076). IEEE.

Imbens, G., & Wager, S. (2017). Optimized Regression Discontinuity Designs. arXiv preprint arXiv:1705.01677.

O’Donoghue, B., Chu, E., Parikh, N., & Boyd, S. (2016). Conic optimization via operator splitting and homogeneous self-dual embedding. Journal of Optimization Theory and Applications, 169(3), 1042-1068.

Examples

# Simple regression discontinuity with discrete X
n = 4000; threshold = 0
X = sample(seq(-4, 4, by = 8/41.5), n, replace = TRUE)
W = as.numeric(X >= threshold)
Y = 0.4 * W + 1 / (1 + exp(2 * X)) + 0.2 * rnorm(n)
# Using 0.4 for max.second.derivative would have been enough
out.1 = optrdd(X=X, Y=Y, W=W, max.second.derivative = 0.5, estimation.point = threshold)
print(out.1); plot(out.1, xlim = c(-1.5, 1.5))

# Now, treatment is instead allocated in a neighborhood of 0
thresh.low = -1; thresh.high = 1
W = as.numeric(thresh.low <= X & X <= thresh.high)
Y = 0.2 * (1 + X) * W + 1 / (1 + exp(2 * X)) + rnorm(n)
# This estimates CATE at specifically chosen points
out.2 = optrdd(X=X, Y=Y, W=W, max.second.derivative = 0.5, estimation.point = thresh.low)
print(out.2); plot(out.2, xlim = c(-2.5, 2.5))
out.3 = optrdd(X=X, Y=Y, W=W, max.second.derivative = 0.5, estimation.point = thresh.high)
print(out.3); plot(out.3, xlim = c(-2.5, 2.5))
# This estimates a weighted CATE, with lower variance
out.4 = optrdd(X=X, Y=Y, W=W, max.second.derivative = 0.5)
print(out.4); plot(out.4, xlim = c(-2.5, 2.5))

## Not run: 
# RDD with multivariate running variable.
X = matrix(runif(n*2, -1, 1), n, 2)
W = as.numeric(X[,1] < 0 | X[,2] < 0)
Y = X[,1]^2/3 + W * (1 + X[,2]) + rnorm(n)
out.5 = optrdd(X=X, Y=Y, W=W, max.second.derivative = 1)
print(out.5); plot(out.5)
out.6 = optrdd(X=X, Y=Y, W=W, max.second.derivative = 1, estimation.point = c(0, 0.5))
print(out.6); plot(out.6)
## End(Not run)


swager/optrdd documentation built on Dec. 15, 2022, 4:34 a.m.