RDHonest: Honest inference in RD


View source: R/RD_lp.R


Description

Calculate estimators and one- and two-sided confidence intervals (CIs) based on a local polynomial estimator in a regression discontinuity (RD) design, under a second-order Taylor or Hölder smoothness class. If kern="optimal", calculate optimal estimators under the second-order Taylor smoothness class.


Usage

RDHonest(formula, data, subset, cutoff = 0, M, kern = "triangular",
  na.action, opt.criterion, bw.equal = TRUE, hp, hm = hp,
  se.method = "nn", alpha = 0.05, beta = 0.8, J = 3, sclass = "H",
  order = 1, se.initial = "IKEHW")



Arguments

formula

Object of class "formula" (or one that can be coerced to that class) of the form outcome ~ running_variable.

data

Optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the outcome and running variables in the model. If not found in data, the variables are taken from environment(formula), typically the environment from which the function is called.

subset

Optional vector specifying a subset of observations to be used in the fitting process.

cutoff

Specifies the RD cutoff in the running variable.

M

Bound on the second derivative of the conditional mean function.

kern

Specifies the kernel function used in the local regression. It can be either a string equal to "triangular" (k(u)=(1-|u|)_{+}), "epanechnikov" (k(u)=(3/4)(1-u^2)_{+}), or "uniform" (k(u)=(|u|<1)/2), or else a kernel function.

na.action

Function which indicates what should happen when the data contain NAs. The default is set by the na.action setting of options (usually na.omit).


opt.criterion

Optimality criterion that the bandwidth is designed to optimize. It can be based either on the exact finite-sample maximum bias and a finite-sample estimate of the variance, or on asymptotic approximations to the bias and variance. The options are:

"MSE"

Finite-sample maximum MSE.

"FLCI"

Length of (fixed-length) two-sided confidence intervals.

"OCI"

Given quantile of excess length of one-sided confidence intervals.

The finite-sample methods use the conditional variance given by sigma2, if supplied. Otherwise, for the purpose of estimating the optimal bandwidth, the conditional variance is assumed to be homoscedastic and is estimated using a nearest neighbor estimator.


bw.equal

Logical specifying whether the bandwidths on either side of the cutoff should be constrained to equal each other.

hp, hm

Bandwidths for treated units (those with running variable above the cutoff) and control units (those with running variable below the cutoff). If hm is not supplied, it is assumed to equal hp. If neither bandwidth is supplied, the optimal bandwidth is computed according to the criterion given by opt.criterion.


se.method

Vector with methods for estimating the standard error of the estimate. If NULL, standard errors are not computed. The elements of the vector can consist of the following methods:


Nearest neighbor method


Eicker-Huber-White, with residuals from local regression (local polynomial estimators only).


Use EHW, but instead of using residuals, estimate sigma^2_i by subtracting the estimated intercept from the outcome (and not subtracting the estimated slope). Local polynomial estimators only.


Plug-in estimate based on asymptotic variance. Local polynomial estimators in RD only.


Use conditional variance supplied by sigma2 / d instead of computing residuals


alpha

Determines the confidence level, 1-alpha, for constructing/optimizing confidence intervals.

beta

Determines the quantile of excess length to optimize, if the bandwidth optimizes a given quantile of excess length of one-sided confidence intervals.

J

Number of nearest neighbors, if "nn" is specified in se.method.

sclass

Smoothness class, either "T" for Taylor or "H" for Hölder class.

order

Order of local regression: 1 for linear, 2 for quadratic.

se.initial

Method for estimating the initial variance for computing the optimal bandwidth. Ignored if data already contains an estimate of the variance.


Based on residuals from a local linear regression using a triangular kernel and IK bandwidth


Based on sum of squared deviations of outcome from estimate of intercept in local linear regression with triangular kernel and IK bandwidth


Use residuals from local constant regression with uniform kernel and bandwidth selected using Silverman's rule of thumb, as in Equation (14) in IK


Use nearest neighbor estimates, rather than residuals


Use nearest neighbor estimates, without assuming homoscedasticity
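For reference, the three named kernels listed under kern can be written as plain R functions of the scaled distance u. This is a minimal sketch for illustration only: the names k_triangular, k_epanechnikov and k_uniform are hypothetical and are not part of the package.

```r
# Kernels as functions of the scaled distance u = (x - cutoff) / bandwidth.
# Function names are illustrative, not package API.
k_triangular   <- function(u) pmax(1 - abs(u), 0)        # (1 - |u|)_+
k_epanechnikov <- function(u) 0.75 * pmax(1 - u^2, 0)    # (3/4)(1 - u^2)_+
k_uniform      <- function(u) (abs(u) < 1) / 2           # (|u| < 1) / 2

# Each kernel is zero outside (-1, 1), so only observations within one
# bandwidth of the cutoff receive positive weight.
u <- c(-1.5, -0.5, 0, 0.5, 1.5)
cbind(u,
      triangular   = k_triangular(u),
      epanechnikov = k_epanechnikov(u),
      uniform      = k_uniform(u))
```

Since kern is documented as also accepting a kernel function, a function of this shape is presumably what a user-supplied kernel would look like, although the exact requirements are not stated here.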


Details

The bandwidth is calculated to be optimal for a given performance criterion, as specified by opt.criterion. For local polynomial estimators, this optimal bandwidth is calculated using the function RDOptBW. Alternatively, for local polynomial estimators, the bandwidths above and below the cutoff can be specified by hp and hm.
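The nearest neighbor variance estimate referred to by se.method = "nn" and J can be sketched in base R as follows. This is a rough illustration of the usual nearest-neighbor construction on one side of the cutoff, not the package's own code: the helper name nn_variance is hypothetical, and the package may handle ties and the two sides of the cutoff differently.

```r
# Nearest-neighbor estimate of the conditional variance at each observation.
# For observation i, take the J closest other points in x and use
#   sigma2_i = J / (J + 1) * (y_i - mean of the neighbors' outcomes)^2.
# nn_variance is an illustrative helper, not a package function.
nn_variance <- function(x, y, J = 3) {
  stopifnot(length(x) == length(y), length(x) > J)
  vapply(seq_along(x), function(i) {
    d <- abs(x - x[i])
    d[i] <- Inf                       # exclude the observation itself
    nbrs <- order(d)[seq_len(J)]      # indices of the J closest points
    J / (J + 1) * (y[i] - mean(y[nbrs]))^2
  }, numeric(1))
}

# Simulated data on one side of a cutoff (not the lee08 data)
set.seed(1)
x <- sort(runif(50))
y <- 1 + x + rnorm(50, sd = 0.5)
mean(nn_variance(x, y, J = 3))   # rough estimate of the error variance
```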


Value

Returns an object of class "RDResults". The function print can be used to obtain and print a summary of the results. An object of class "RDResults" is a list containing the following components:

estimate

Point estimate. This estimate is MSE-optimal if opt.criterion="MSE".


Least favorable function, only relevant for optimal estimator under Taylor class.


Maximum bias of estimate


Standard deviation of estimate

lower, upper

Lower (upper) end-point of a one-sided CI based on estimate. This CI is optimal if opt.criterion="OCI".

hl

Half-length of a two-sided CI based on estimate, so that the CI is given by c(estimate-hl, estimate+hl). The CI is optimal if opt.criterion="FLCI".


Effective number of observations used by estimate

hp, hm

Bandwidths used


Coverage of CI that ignores bias and uses qnorm(1-alpha/2) as critical value


the matched call
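To see how the maximum bias and standard deviation listed above relate to the reported intervals, the sketch below builds bias-aware intervals from a point estimate in base R and compares them with a CI that ignores bias. It assumes the estimator is approximately normal with bias bounded by the maximum bias; the input numbers are hypothetical placeholders rather than output from RDHonest, the names est, max_bias, std_err, cv and naive_coverage are illustrative, and the package's internal computation may differ.

```r
# Hypothetical inputs for illustration only (not results from the package)
est      <- 5.0   # point estimate
max_bias <- 0.8   # bound on the absolute bias
std_err  <- 1.2   # standard deviation of the estimate
alpha    <- 0.05

# Critical value: the 1 - alpha quantile of |N(t, 1)|, where t = bias / sd.
# |N(t, 1)|^2 is noncentral chi-squared with 1 df and noncentrality t^2.
cv <- function(t, alpha) sqrt(qchisq(1 - alpha, df = 1, ncp = t^2))

# Two-sided fixed-length CI: estimate +/- half-length
hl <- cv(max_bias / std_err, alpha) * std_err
ci_two_sided <- c(est - hl, est + hl)

# One-sided CI end-points: shift by the maximum bias, then add the
# usual one-sided normal critical value times the standard deviation.
lower <- est - max_bias - qnorm(1 - alpha) * std_err
upper <- est + max_bias + qnorm(1 - alpha) * std_err

# Worst-case coverage of the CI that ignores bias and uses
# qnorm(1 - alpha / 2) as the critical value.
naive_coverage <- function(t, alpha) {
  z <- qnorm(1 - alpha / 2)
  pnorm(z - t) - pnorm(-z - t)
}
naive_coverage(max_bias / std_err, alpha)
```

With zero bias, cv reduces to the usual two-sided normal critical value and the coverage of the bias-ignoring CI equals 1 - alpha; as the ratio of bias to standard deviation grows, that coverage falls below the nominal level.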


Note

subset is evaluated in the same way as variables in formula, that is, first in data and then in the environment of formula.

See Also



Examples

# Lee dataset
RDHonest(voteshare ~ margin, data = lee08, kern = "uniform", M = 0.1,
         hp = 10, sclass = "T")
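For intuition about what a local linear estimator (order = 1) with bandwidths hp and hm computes, here is a rough base-R sketch on simulated data. It is not the package's implementation: rd_local_linear is a hypothetical helper, treating observations at or above the cutoff as treated is an assumption, and no bias bound, standard error or CI is computed.

```r
# Local linear RD point estimate: kernel-weighted least squares on each side
# of the cutoff, then the difference of the two intercepts at the cutoff.
# rd_local_linear is an illustrative helper, not a package function.
rd_local_linear <- function(y, x, cutoff = 0, hp, hm = hp,
                            kern = function(u) pmax(1 - abs(u), 0)) {
  xc <- x - cutoff                     # recenter the running variable
  fit_side <- function(keep, h) {
    d <- data.frame(y = y[keep], xc = xc[keep], w = kern(xc[keep] / h))
    d <- d[d$w > 0, ]                  # keep points inside the bandwidth
    coef(lm(y ~ xc, data = d, weights = w))[["(Intercept)"]]
  }
  # Assumption: units at or above the cutoff are treated
  fit_side(xc >= 0, hp) - fit_side(xc < 0, hm)
}

# Simulated data (not the lee08 data): linear on each side with a jump of 2
x <- (-100:100) / 100
y <- 1 + 0.5 * x + 2 * (x >= 0)
rd_local_linear(y, x, hp = 0.5)
```

Because the simulated conditional mean is exactly linear on each side, the sketch recovers the jump up to floating-point error. With curvature in the conditional mean the estimate would be biased, which is the situation the bound M is meant to account for.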

kolesarm/RDHonest documentation built on April 3, 2018, 11:08 a.m.