dfldens: Counterfactual Kernel Density Functions
In McSpatial: Nonparametric spatial data analysis


Uses the DiNardo, Fortin, and Lemieux approach to re-weight kernel density functions based on values of an explanatory variable from an earlier period.

dfldens(y,lgtform,window=0,bandwidth=0,kern="tcub",probit=FALSE,
  graph=TRUE,yname="y",alldata=FALSE,data=NULL)

`y`	The dependent variable for which the counterfactual density is estimated. The data frame must be specified if it has not been attached, e.g., y=mydata$depvar.
`lgtform`	The formula for the logit or probit model for the time variable. The dependent variable should be a 0-1 variable with 1's representing the later time period. Example: lgtform=timevar~x1+x2.
`window`	The window size for the kernel density function. Default: not used.
`bandwidth`	The bandwidth. Default: bandwidth = (.9(quantile(y1,.75)-quantile(y1,.25))/1.34)(n1^(-.20)), specified by setting bandwidth = 0 and window = 0.
`kern`	Kernel weighting function. Default is the tri-cube. Options include "rect", "tria", "epan", "bisq", "tcub", "trwt", and "gauss".
`probit`	If TRUE, a probit model is used for the time variable rather than logit. Default: probit = FALSE.
`graph`	If TRUE, produces a graph showing the density function for time 1 and the counterfactual density. Default: graph=TRUE.
`yname`	The name to be used for the variable whose density functions are drawn when graph=T. Default: yname = "y".
`alldata`	If TRUE, the density functions are calculated using each observation in turn as a target value. When alldata=F, densities are calculated at a set of points chosen by the locfit program using an adaptive decision tree approach, and the smooth12 command is used to interpolate to the full set of observations.
`data`	A data frame with the variables for the logit or probit model specified by lgtform. Note: the data frame for y must be specified even if it is part of data.

The dfldens command first calculates kernel density estimates for y in time period timevar = 1. The density estimate at target point y_1 is f(y_1) = (1/(hn_1)) ∑_i K((y_{1i} - y_1)/h), where h is the bandwidth, n_1 is the number of observations in period 1, and the sum runs over those observations. The following kernel weighting functions are available:

Kernel	Call abbreviation	Kernel function K(z)
Rectangular	"rect"	1/2 I(|z| < 1)
Triangular	"tria"	(1-|z|) I(|z| < 1)
Epanechnikov	"epan"	3/4 (1-z^2) I(|z| < 1)
Bi-Square	"bisq"	15/16 (1-z^2)^2 I(|z| < 1)
Tri-Cube	"tcub"	70/81 (1-|z|^3)^3 I(|z| < 1)
Tri-Weight	"trwt"	35/32 (1-z^2)^3 I(|z| < 1)
Gaussian	"gauss"	(2pi)^{-.5} exp(-z^2/2)
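
As a quick check on the table, the kernels can be written as plain R functions and integrated numerically. This is a base R sketch, not McSpatial's internal code; the names `kernels`, `lim`, and `areas` are made up for illustration.

```r
# Kernel functions K(z) transcribed from the table (illustrative, not package code)
kernels <- list(
  rect  = function(z) 0.5 * (abs(z) < 1),
  tria  = function(z) (1 - abs(z)) * (abs(z) < 1),
  epan  = function(z) 0.75 * (1 - z^2) * (abs(z) < 1),
  bisq  = function(z) (15 / 16) * (1 - z^2)^2 * (abs(z) < 1),
  tcub  = function(z) (70 / 81) * (1 - abs(z)^3)^3 * (abs(z) < 1),
  trwt  = function(z) (35 / 32) * (1 - z^2)^3 * (abs(z) < 1),
  gauss = function(z) (2 * pi)^(-0.5) * exp(-z^2 / 2)
)

# Compact kernels live on (-1, 1); the Gaussian has unbounded support
lim <- function(k) if (k == "gauss") Inf else 1

# Each kernel should integrate to approximately 1
areas <- sapply(names(kernels), function(k) {
  integrate(kernels[[k]], -lim(k), lim(k))$value
})
print(round(areas, 4))
```

The leading constants (1/2, 3/4, 15/16, 70/81, 35/32) are the normalizing factors that make each kernel a proper density on its support.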

By default, dfldens uses a tri-cube kernel with a fixed bandwidth of h = (.9*(quantile(y1,.75)-quantile(y1,.25))/1.34)*(n1^(-.20)). The results are stored in dtarget1 and dhat1.
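
The default bandwidth and the density formula can be tried out on simulated data. The following is a minimal base R sketch under stated assumptions: `tcub`, `default_bw`, and `kdens` are hypothetical helper names, the data are simulated, and the densities are evaluated directly on an evenly spaced grid rather than at locfit-chosen points, so it is not expected to reproduce dfldens output exactly.

```r
# Tri-cube kernel from the table above
tcub <- function(z) ifelse(abs(z) < 1, (70 / 81) * (1 - abs(z)^3)^3, 0)

# Rule-of-thumb bandwidth as stated in the documentation text
default_bw <- function(y1) {
  iqr <- unname(quantile(y1, .75) - quantile(y1, .25))
  (.9 * iqr / 1.34) * length(y1)^(-.20)
}

# f(y_1) = (1/(h n_1)) * sum_i K((y_1i - y_1)/h), evaluated at each target value
kdens <- function(y1, target, h) {
  sapply(target, function(t) sum(tcub((y1 - t) / h)) / (h * length(y1)))
}

set.seed(1)
y1 <- rnorm(500)                          # simulated stand-in for period 1 values of y
h <- default_bw(y1)
target <- seq(-5, 5, length.out = 251)    # evenly spaced target grid (an assumption)
dens <- kdens(y1, target, h)

# Riemann-sum check: the estimated density should integrate to roughly 1
area <- sum(dens) * (target[2] - target[1])
```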

The counterfactual density is an estimate of the density function for y in time 1 if the explanatory variables listed in lgtform were equal to their time 0 values. DiNardo, Fortin, and Lemieux (1996) show that the following re-weighting of f(y_1) is an estimate of the counterfactual density: (1/(hn_1)) ∑_i τ_i K((y_{1i} - y_1)/h). The weights are given by τ_i = (P(x_i)/(1-P(x_i)))/(p/(1-p)), where p = n_0/(n_0 + n_1) and P(x_i) is the estimated probability that timevar = 0 from the estimated logit or probit regression of timevar on X.
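
The form of the weights follows from Bayes' rule. The derivation below is a sketch in notation that is not used elsewhere on this page: t stands for timevar, t_y and t_x mark the period from which y and x are drawn, and the conditional density of y given x in period 1 is assumed to stay the same when the distribution of x is replaced.

```latex
\begin{aligned}
f(y \mid t_y = 1, t_x = 0)
  &= \int f(y \mid x, t = 1)\, dF(x \mid t = 0) \\
  &= \int f(y \mid x, t = 1)\, \tau(x)\, dF(x \mid t = 1), \\
\tau(x)
  &= \frac{dF(x \mid t = 0)}{dF(x \mid t = 1)}
   = \frac{P(t = 0 \mid x)}{P(t = 1 \mid x)} \cdot \frac{P(t = 1)}{P(t = 0)}
   = \frac{P(x)/(1 - P(x))}{p/(1 - p)}.
\end{aligned}
```

Here P(x) = P(t = 0 | x) and p = P(t = 0), which is estimated by n_0/(n_0 + n_1).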

If X includes a single variable x, the counterfactual density shows how f(y_1) would change if x = x_0 rather than x_1. Alternatively, X can include multiple variables, in which case the counterfactual density shows how f(y_1) would change if all of the variables in X were equal to their timevar = 0 values.
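
The single-variable case can be sketched in base R with simulated data. This is an illustration of the re-weighting formula as written above, not the dfldens implementation: the data-generating process, the helper `dens`, and the evenly spaced target grid are all assumptions, and the output names merely echo the Value section below.

```r
set.seed(42)
n <- 2000
timevar <- rep(0:1, each = n / 2)
x <- rnorm(n, mean = ifelse(timevar == 1, 1, 0))  # x shifts upward in period 1
y <- 1 + 2 * x + rnorm(n)

# Logit of timevar on x; fitted() gives P(timevar = 1 | x), so flip it
logit_fit <- glm(timevar ~ x, family = binomial(link = "logit"))
P0 <- 1 - fitted(logit_fit)          # estimated P(timevar = 0 | x_i)
p <- mean(timevar == 0)              # n_0 / (n_0 + n_1)
tau <- (P0 / (1 - P0)) / (p / (1 - p))

# Keep period 1 observations only
in1 <- timevar == 1
y1 <- y[in1]
tau1 <- tau[in1]
n1 <- length(y1)

tcub <- function(z) ifelse(abs(z) < 1, (70 / 81) * (1 - abs(z)^3)^3, 0)
h <- (.9 * unname(quantile(y1, .75) - quantile(y1, .25)) / 1.34) * n1^(-.20)
target <- seq(min(y1), max(y1), length.out = 100)

# Weighted kernel density: (1/(h n_1)) * sum_i w_i K((y_1i - y_1)/h)
dens <- function(w) {
  sapply(target, function(t) sum(w * tcub((y1 - t) / h)) / (h * n1))
}
dtarget1 <- dens(rep(1, n1))   # actual period 1 density
dtarget10 <- dens(tau1)        # counterfactual: x at its period 0 distribution

plot(target, dtarget1, type = "l", xlab = "y", ylab = "Density")
lines(target, dtarget10, lty = 2)
```

Because x is lower on average in period 0 and y increases with x in this simulation, the dashed counterfactual curve should sit to the left of the actual period 1 density.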

`target`	The vector of target values for y for the density functions.
`dtarget1`	The vector of densities in period 1 at the target values of y.
`dtarget10`	The counterfactual densities in period 1 at the target values of y.
`dhat1`	The vector of densities in period 1 at the actual values of y.
`dhat10`	The counterfactual densities in period 1 at the actual values of y.

DiNardo, J., N. Fortin, and T. Lemieux, "Labor Market Institutions and the Distribution of Wages, 1973-1992: A Semi-Parametric Approach," Econometrica 64 (1996), 1001-1044.

Leibbrandt, Murray, James A. Levinsohn, and Justin McCrary, "Incomes in South Africa after the Fall of Apartheid," Journal of Globalization and Development 1 (2010).

qregsim2

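library(McSpatial)
data(matchdata)
# Time indicator: TRUE for 2005 sales, the later period
matchdata$year05 <- matchdata$year==2005
# Counterfactual density with lnland and lnbldg set to their earlier-period values
fit <- dfldens(matchdata$lnprice, year05~lnland+lnbldg, window=.2,
  yname = "Log of Sale Price", data=matchdata)
# Same exercise with building age as the only explanatory variable
matchdata$age <- matchdata$year - matchdata$yrbuilt
fit <- dfldens(matchdata$lnprice, year05~age, window=.2,
  yname="Log of Sale Price", data=matchdata)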