# rdplotdensity: Density Plotting for Manipulation Testing

In rddensity: Manipulation Testing Based on Density Discontinuity

 rdplotdensity R Documentation

## Density Plotting for Manipulation Testing

### Description

`rdplotdensity` constructs density plots. It is based on the local polynomial density estimator proposed in Cattaneo, Jansson and Ma (2020, 2023). A companion `Stata` package is described in Cattaneo, Jansson and Ma (2018).

Companion command: `rddensity` for manipulation (density discontinuity) testing.

Related Stata and R packages useful for inference in regression discontinuity (RD) designs are described in the website: https://rdpackages.github.io/.

### Usage

```
rdplotdensity(
rdd,
X,
plotRange = NULL,
plotN = 10,
plotGrid = c("es", "qs"),
alpha = 0.05,
type = NULL,
lty = NULL,
lwd = NULL,
lcol = NULL,
pty = NULL,
pwd = NULL,
pcol = NULL,
CItype = NULL,
CIuniform = FALSE,
CIsimul = 2000,
CIshade = NULL,
CIcol = NULL,
bwselect = NULL,
hist = TRUE,
histBreaks = NULL,
histFillCol = 3,
histFillShade = 0.2,
histLineCol = "white",
title = "",
xlabel = "",
ylabel = "",
legendTitle = NULL,
legendGroups = NULL,
noPlot = FALSE
)
```

### Arguments

| Argument | Description |
|---|---|
| `rdd` | Object returned by `rddensity`. |
| `X` | Numeric vector or one dimensional matrix/data frame, the running variable. |
| `plotRange` | Numeric, specifies the lower and upper bound of the plotting region. Default is `[c-3*hl,c+3*hr]` (three bandwidths around the cutoff). |
| `plotN` | Numeric, specifies the number of grid points used for plotting on the two sides of the cutoff. Default is `c(10,10)` (i.e., 10 points are used on each side). |
| `plotGrid` | String, specifies how the grid points are positioned. Options are `es` (evenly spaced) and `qs` (quantile spaced). |
| `alpha` | Numeric scalar between 0 and 1, the significance level for plotting confidence regions. If more than one is provided, they will be applied to the two sides accordingly. |
| `type` | String, one of `"line"` (default), `"points"` or `"both"`, how the point estimates are plotted. If more than one is provided, they will be applied to the two sides accordingly. |
| `lty` | Line type for point estimates, only effective if `type` is `"line"` or `"both"`. `1` for solid line, `2` for dashed line, `3` for dotted line. For other options, see the instructions for `ggplot2` or `par`. If more than one is provided, they will be applied to the two sides accordingly. |
| `lwd` | Line width for point estimates, only effective if `type` is `"line"` or `"both"`. Should be strictly positive. For other options, see the instructions for `ggplot2` or `par`. If more than one is provided, they will be applied to the two sides accordingly. |
| `lcol` | Line color for point estimates, only effective if `type` is `"line"` or `"both"`. `1` for black, `2` for red, `3` for green, `4` for blue. For other options, see the instructions for `ggplot2` or `par`. If more than one is provided, they will be applied to the two sides accordingly. |
| `pty` | Scatter plot type for point estimates, only effective if `type` is `"points"` or `"both"`. For options, see the instructions for `ggplot2` or `par`. If more than one is provided, they will be applied to the two sides accordingly. |
| `pwd` | Scatter plot size for point estimates, only effective if `type` is `"points"` or `"both"`. Should be strictly positive. If more than one is provided, they will be applied to the two sides accordingly. |
| `pcol` | Scatter plot color for point estimates, only effective if `type` is `"points"` or `"both"`. `1` for black, `2` for red, `3` for green, `4` for blue. For other options, see the instructions for `ggplot2` or `par`. If more than one is provided, they will be applied to the two sides accordingly. |
| `CItype` | String, one of `"region"` (shaded region, default), `"line"` (dashed lines), `"ebar"` (error bars), `"all"` (all of the previous) or `"none"` (no confidence region), how the confidence region should be plotted. If more than one is provided, they will be applied to the two sides accordingly. |
| `CIuniform` | `TRUE` or `FALSE` (default), plotting either pointwise confidence intervals (`FALSE`) or uniform confidence bands (`TRUE`). |
| `CIsimul` | Positive integer, the number of simulations used to construct critical values (default is 2000). This option is ignored if `CIuniform=FALSE`. |
| `CIshade` | Numeric, opaqueness of the confidence region, should be between 0 (transparent) and 1. Default is 0.2. If more than one is provided, they will be applied to the two sides accordingly. |
| `CIcol` | Color of the confidence region. `1` for black, `2` for red, `3` for green, `4` for blue. For other options, see the instructions for `ggplot2` or `par`. If more than one is provided, they will be applied to the two sides accordingly. |
| `bwselect` | String, the method for data-driven bandwidth selection. Available options are (1) `"mse-dpi"` (mean squared error-optimal bandwidth selected for each grid point); (2) `"imse-dpi"` (integrated MSE-optimal bandwidth, common for all grid points); (3) `"mse-rot"` (rule-of-thumb bandwidth with Gaussian reference model); and (4) `"imse-rot"` (integrated rule-of-thumb bandwidth with Gaussian reference model). If omitted, bandwidths returned by `rddensity` will be used. |
| `hist` | `TRUE` (default) or `FALSE`, whether to add a histogram to the background. |
| `histBreaks` | Numeric vector, giving the breakpoints between histogram cells. |
| `histFillCol` | Color of the histogram cells. |
| `histFillShade` | Opaqueness of the histogram cells, should be between 0 (transparent) and 1. Default is 0.2. |
| `histLineCol` | Color of the histogram lines. |
| `title, xlabel, ylabel` | Strings, title of the plot and labels for x- and y-axis. |
| `legendTitle` | String, title of legend. |
| `legendGroups` | String vector, group names used in legend. |
| `noPlot` | No density plot will be generated if set to `TRUE`. |
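Several arguments accept one value per side of the cutoff. The following is a minimal sketch of that usage, assuming the `rddensity` package is installed. The simulated data mirror the Examples section, and the specific colors, grid sizes and significance level are arbitrary illustrative choices, not recommended settings.

```r
library(rddensity)

# Simulated running variable with a density discontinuity at 0
set.seed(42)
x <- rnorm(2000, mean = -0.5)
x[x > 0] <- x[x > 0] * 2

rdd <- rddensity(X = x)

# Two-element vectors are applied to the left and right side, respectively
res <- rdplotdensity(
  rdd, x,
  plotRange = c(-2, 2),
  plotN     = c(20, 30),    # 20 grid points left of the cutoff, 30 right
  plotGrid  = "qs",         # quantile-spaced grid
  alpha     = 0.1,          # 90% confidence regions
  type      = "both",       # lines and points
  lcol      = c(4, 2),      # blue on the left, red on the right
  pcol      = c(4, 2),
  CIcol     = c(4, 2),
  CItype    = "region",
  noPlot    = TRUE          # build the result without drawing it
)

# Draw the plot on demand
print(res$Estplot)
```

Setting `noPlot = TRUE` is convenient in scripts where the plot is saved or modified later instead of being displayed immediately.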

### Details

Bias correction is used only for the construction of confidence intervals/bands, not for point estimation. The point estimates, denoted by `f_p`, are constructed using local polynomial estimates of order `p`, while the centering of the confidence intervals/bands, denoted by `f_q`, is constructed using local polynomial estimates of order `q`. The confidence intervals/bands take the form: `[f_q - cv * SE(f_q) , f_q + cv * SE(f_q)]`, where `cv` denotes the appropriate critical value and `SE(f_q)` denotes a standard error estimate for the centering of the confidence interval/band. As a result, the confidence intervals/bands may not be centered at the point estimates because they have been bias-corrected. Setting `q` and `p` to be equal results in confidence intervals/bands centered at the point estimates, but requires undersmoothing for valid inference (i.e., the (I)MSE-optimal bandwidth for the density point estimator cannot be used). Hence the bandwidth would need to be specified manually when `q=p`, and the point estimates will not be (I)MSE optimal. See Cattaneo, Jansson and Ma (2022, 2023) for details, and also Calonico, Cattaneo, and Farrell (2018, 2022) for robust bias correction methods.

Sometimes the density point estimates may lie outside of the confidence intervals/bands, which can happen if the underlying distribution exhibits high curvature at some evaluation point(s). One possible solution in this case is to increase the polynomial order `p` or to employ a smaller bandwidth.
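The interval arithmetic described above can be illustrated with a small base R sketch. All numbers here are hypothetical and are not output from `rddensity`. The sketch also assumes a pointwise interval whose critical value is the standard normal quantile; uniform bands use simulated critical values instead.

```r
# Hypothetical values at a single evaluation point (illustration only)
f_p   <- 0.35   # point estimate from the order-p fit
f_q   <- 0.29   # bias-corrected centering from the order-q fit
se_q  <- 0.02   # standard error of f_q
alpha <- 0.05

# Assumed pointwise critical value: standard normal quantile
cv <- qnorm(1 - alpha / 2)

# Interval centered at f_q, not at f_p
ci <- c(lower = f_q - cv * se_q, upper = f_q + cv * se_q)
print(round(ci, 4))

# With these numbers the point estimate falls outside the interval
outside <- f_p < ci["lower"] || f_p > ci["upper"]
print(unname(outside))
```

Because the interval is centered at `f_q`, a large gap between `f_p` and `f_q` relative to `se_q` is enough to push the point estimate outside it, which is the situation the paragraph above describes.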

### Value

| Value | Description |
|---|---|
| `Estl, Estr` | Matrices containing estimation results: (1) `grid` (grid points), (2) `bw` (bandwidths), (3) `nh` (number of observations in each local neighborhood), (4) `nhu` (number of unique observations in each local neighborhood), (5) `f_p` (point estimates with p-th order local polynomial), (6) `f_q` (point estimates with q-th order local polynomial, only if option `q` is nonzero), (7) `se_p` (standard error corresponding to `f_p`), and (8) `se_q` (standard error corresponding to `f_q`). Also included: the variance-covariance matrix corresponding to `f_p`, the variance-covariance matrix corresponding to `f_q`, and a list containing options passed to the function. |
| `Estplot` | A standard `ggplot` object, which can be used for further customization. |
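Since `Estplot` is a `ggplot` object, additional layers and themes can be added with the usual `+` operator. The sketch below assumes that both `rddensity` and `ggplot2` are installed. The title, axis labels and dashed cutoff line are illustrative choices only.

```r
library(rddensity)
library(ggplot2)

# Simulated running variable with a density discontinuity at 0
set.seed(42)
x <- rnorm(2000, mean = -0.5)
x[x > 0] <- x[x > 0] * 2

rdd <- rddensity(X = x)
res <- rdplotdensity(rdd, x, plotRange = c(-2, 2), plotN = 25, noPlot = TRUE)

# Extend the returned ggplot object with standard ggplot2 components
custom <- res$Estplot +
  labs(title = "Estimated density around the cutoff",
       x = "Running variable", y = "Density") +
  geom_vline(xintercept = 0, linetype = "dashed") +  # mark the cutoff
  theme_minimal()

print(custom)
```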

### Author(s)

Matias D. Cattaneo, Princeton University. cattaneo@princeton.edu.

Michael Jansson, University of California Berkeley. mjansson@econ.berkeley.edu.

Xinwei Ma (maintainer), University of California San Diego. x1ma@ucsd.edu.

### References

Calonico, S., M. D. Cattaneo, and M. H. Farrell. 2018. On the Effect of Bias Estimation on Coverage Accuracy in Nonparametric Inference. Journal of the American Statistical Association 113(522): 767-779. doi: 10.1080/01621459.2017.1285776

Calonico, S., M. D. Cattaneo, and M. H. Farrell. 2022. Coverage Error Optimal Confidence Intervals for Local Polynomial Regression. Bernoulli, 28(4): 2998-3022. doi: 10.3150/21-BEJ1445

Cattaneo, M. D., M. Jansson, and X. Ma. 2018. Manipulation Testing based on Density Discontinuity. Stata Journal 18(1): 234-261. doi: 10.1177/1536867X1801800115

Cattaneo, M. D., M. Jansson, and X. Ma. 2020. Simple Local Polynomial Density Estimators. Journal of the American Statistical Association, 115(531): 1449-1455. doi: 10.1080/01621459.2019.1635480

Cattaneo, M. D., M. Jansson, and X. Ma. 2022. lpdensity: Local Polynomial Density Estimation and Inference. Journal of Statistical Software, 101(2), 1–25. doi: 10.18637/jss.v101.i02

Cattaneo, M. D., M. Jansson, and X. Ma. 2023. Local Regression Distribution Estimators. Journal of Econometrics, forthcoming. doi: 10.1016/j.jeconom.2021.01.006

### See Also

`rddensity`

### Examples

```
library(rddensity)

# Generate a random sample with a density discontinuity at 0
set.seed(42)
x <- rnorm(2000, mean = -0.5)
x[x > 0] <- x[x > 0] * 2

# Estimation
rdd <- rddensity(X = x)
summary(rdd)

# Density plot (from -2 to 2 with 25 evaluation points at each side)
plot1 <- rdplotdensity(rdd, x, plotRange = c(-2, 2), plotN = 25)

# Plotting a uniform confidence band
set.seed(42) # fix the seed for simulating critical values
plot3 <- rdplotdensity(rdd, x, plotRange = c(-2, 2), plotN = 25, CIuniform = TRUE)

```

rddensity documentation built on Jan. 22, 2023, 1:26 a.m.