lpbwdensity: Data-driven Bandwidth Selection for Local Polynomial Density...

View source: R/lpbwdensity.R

lpbwdensity    R Documentation

Data-driven Bandwidth Selection for Local Polynomial Density Estimators

Description

lpbwdensity implements the bandwidth selection methods for local polynomial-based estimation of densities (and their derivatives), proposed and studied in Cattaneo, Jansson and Ma (2020, 2023). See Cattaneo, Jansson and Ma (2022) for more implementation details and illustrations.

Companion command: lpdensity for estimation and robust bias-corrected inference.

Related Stata and R packages useful for nonparametric estimation and inference are available at https://nppackages.github.io/.

Usage

lpbwdensity(
  data,
  grid = NULL,
  p = NULL,
  v = NULL,
  kernel = c("triangular", "uniform", "epanechnikov"),
  bwselect = c("mse-dpi", "imse-dpi", "mse-rot", "imse-rot"),
  massPoints = TRUE,
  stdVar = TRUE,
  regularize = TRUE,
  nLocalMin = NULL,
  nUniqueMin = NULL,
  Cweights = NULL,
  Pweights = NULL
)

Arguments

data

Numeric vector or one dimensional matrix/data frame, the raw data.

grid

Numeric, specifies the grid of evaluation points. When left at the default (NULL), the grid points are chosen as the 0.05 to 0.95 quantiles of the data, in steps of 0.05.
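
As a rough illustration of the default grid described above, the following base R sketch computes the 0.05 to 0.95 sample quantiles in steps of 0.05. It assumes the default interpolation rule of stats::quantile, which may differ from what the package does internally; default_grid is a name introduced here for illustration only.

```r
# Hypothetical sketch of the default evaluation grid (not the package's internal code).
set.seed(42)
X <- rnorm(2000)

# Probabilities 0.05, 0.10, ..., 0.95
probs <- seq(0.05, 0.95, by = 0.05)

# Sample quantiles of the data at those probabilities
default_grid <- quantile(X, probs = probs, names = FALSE)

length(default_grid)
```

A grid built this way can also be passed explicitly through the grid argument, which makes the evaluation points visible in the calling script.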

p

Nonnegative integer, specifies the order of the local polynomial used to construct point estimates. (Default is 2.)

v

Nonnegative integer, specifies the derivative of the distribution function to be estimated. 0 for the distribution function, 1 (default) for the density function, etc.

kernel

String, specifies the kernel function, should be one of "triangular", "uniform" or "epanechnikov".
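
For orientation, the three kernel names correspond to standard compactly supported kernels. The sketch below writes out their textbook forms on [-1, 1] in base R. The function kernel_weight is hypothetical, and the exact scaling used inside the package is an assumption here, not something this page confirms.

```r
# Textbook kernels supported on [-1, 1]; kernel_weight is an illustrative helper,
# not a function exported by the lpdensity package.
kernel_weight <- function(u, kernel = c("triangular", "uniform", "epanechnikov")) {
  kernel <- match.arg(kernel)
  inside <- abs(u) <= 1                      # zero weight outside the support
  w <- switch(kernel,
    triangular   = 1 - abs(u),
    uniform      = rep(0.5, length(u)),
    epanechnikov = 0.75 * (1 - u^2)
  )
  w * inside
}

# Weights for a few scaled distances u = (X - x) / h
kernel_weight(c(-1.5, -0.5, 0, 0.5), kernel = "triangular")
```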

bwselect

String, specifies the method for data-driven bandwidth selection. This option will be ignored if bw is provided. Can be (1) "mse-dpi" (default, mean squared error-optimal bandwidth selected for each grid point); (2) "imse-dpi" (integrated MSE-optimal bandwidth, common for all grid points); (3) "mse-rot" (rule-of-thumb bandwidth with Gaussian reference model); or (4) "imse-rot" (integrated rule-of-thumb bandwidth with Gaussian reference model).

massPoints

TRUE (default) or FALSE, specifies whether point estimates and standard errors should be adjusted if there are mass points in the data.

stdVar

TRUE (default) or FALSE, specifies whether the data should be standardized for bandwidth selection.

regularize

TRUE (default) or FALSE, specifies whether the bandwidth should be regularized. When set to TRUE, the bandwidth is chosen such that the local region includes at least nLocalMin observations and at least nUniqueMin unique observations.

nLocalMin

Nonnegative integer, specifies the minimum number of observations in each local neighborhood. This option will be ignored if regularize=FALSE. Default is 20+p+1.

nUniqueMin

Nonnegative integer, specifies the minimum number of unique observations in each local neighborhood. This option will be ignored if regularize=FALSE. Default is 20+p+1.
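
The regularization described by regularize, nLocalMin and nUniqueMin can be pictured with a small base R sketch: a candidate bandwidth is enlarged, if necessary, until the window around the evaluation point holds enough observations and enough distinct values. The helper regularize_bw and its argument names are hypothetical, and this is one plausible reading of the rule rather than the package's internal code.

```r
# Hypothetical helper: enlarge a candidate bandwidth h at evaluation point x so that
# the window [x - h, x + h] contains at least n_local_min observations and at least
# n_unique_min distinct values (or all of them, if the sample is smaller).
regularize_bw <- function(data, x, h, n_local_min, n_unique_min) {
  dist_all    <- sort(abs(data - x))           # distances of all observations
  dist_unique <- sort(abs(unique(data) - x))   # distances of distinct values
  h_local  <- dist_all[min(n_local_min, length(dist_all))]
  h_unique <- dist_unique[min(n_unique_min, length(dist_unique))]
  max(h, h_local, h_unique)
}

p <- 2
n_min <- 20 + p + 1                 # documented default for both minimums

data  <- c(1:100, 1:100)            # every value appears twice
h_reg <- regularize_bw(data, x = 50, h = 0.5,
                       n_local_min = n_min, n_unique_min = n_min)
h_reg
```

In this toy sample the unique-value requirement is the binding one, because duplicated values fill the observation count faster than the count of distinct values.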

Cweights

Numeric vector, specifies the weights used for counterfactual distribution construction. Should have the same length as the data. This option will be ignored if bwselect is "mse-rot" or "imse-rot".

Pweights

Numeric vector, specifies the weights used in sampling. Should have the same length as the data. This option will be ignored if bwselect is "mse-rot" or "imse-rot".

Value

BW

A matrix containing (1) grid (grid point), (2) bw (bandwidth), (3) nh (number of observations in each local neighborhood), and (4) nhu (number of unique observations in each local neighborhood).
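
As a hedged usage sketch, the selected bandwidths can be read from the BW matrix and handed to the companion command lpdensity. The column name "bw" follows the description above, and the grid, bw and bwselect arguments are taken from the documented interfaces of the two commands. The evaluation grid eval_grid is an arbitrary choice made for illustration, and the code runs only if the lpdensity package is installed.

```r
set.seed(42)
X <- rnorm(2000)

# Illustrative evaluation points (an arbitrary choice, not a package default)
eval_grid <- seq(-1, 1, by = 0.5)

if (requireNamespace("lpdensity", quietly = TRUE)) {
  # Select one IMSE-optimal bandwidth shared by all grid points
  bw_sel <- lpdensity::lpbwdensity(X, grid = eval_grid, bwselect = "imse-dpi")

  # Extract the bandwidth column of the BW matrix
  h <- bw_sel$BW[, "bw"]

  # Reuse the selected bandwidths for estimation and inference
  est <- lpdensity::lpdensity(X, grid = eval_grid, bw = h)
  summary(est)
}
```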

opt

A list containing options passed to the function.

Author(s)

Matias D. Cattaneo, Princeton University. cattaneo@princeton.edu.

Michael Jansson, University of California Berkeley. mjansson@econ.berkeley.edu.

Xinwei Ma (maintainer), University of California San Diego. x1ma@ucsd.edu.

References

Cattaneo, M. D., M. Jansson, and X. Ma. 2020. Simple Local Polynomial Density Estimators. Journal of the American Statistical Association, 115(531): 1449-1455. doi: 10.1080/01621459.2019.1635480

Cattaneo, M. D., M. Jansson, and X. Ma. 2022. lpdensity: Local Polynomial Density Estimation and Inference. Journal of Statistical Software, 101(2): 1-25. doi: 10.18637/jss.v101.i02

Cattaneo, M. D., M. Jansson, and X. Ma. 2023. Local Regression Distribution Estimators. Journal of Econometrics, forthcoming. doi: 10.1016/j.jeconom.2021.01.006

See Also

Supported methods: coef.lpbwdensity, print.lpbwdensity, summary.lpbwdensity.

Examples

# Load the package that provides lpbwdensity
library(lpdensity)

# Generate a random sample
set.seed(42)
X <- rnorm(2000)

# Construct bandwidth
bw1 <- lpbwdensity(X)
summary(bw1)

# Display bandwidths for a subset of grid points
summary(bw1, grid=bw1$BW[4:10, "grid"])
summary(bw1, gridIndex=4:10)


lpdensity documentation built on Jan. 22, 2023, 1:39 a.m.