dfldens: Counterfactual Kernel Density Functions

Description

Uses the DiNardo, Fortin, and Lemieux approach to re-weight kernel density functions based on values of an explanatory variable from an earlier period.

Usage

dfldens(y,lgtform,window=0,bandwidth=0,kern="tcub",probit=FALSE,
  graph=TRUE,yname="y",alldata=FALSE,data=NULL)

Arguments

y

The dependent variable for which the counterfactual density is estimated. The data frame must be specified if it has not been attached, e.g., y=mydata$depvar.

lgtform

The formula for the logit or probit model for the time variable. The dependent variable should be a 0-1 variable with 1's representing the later time period. Example: lgtform=timevar~x1+x2.

window

The window size for the kernel density function. Default: not used.

bandwidth

The bandwidth. Default: bandwidth = (.9*(quantile(y1,.75)-quantile(y1,.25))/1.34)*(n1^(-.20)), specified by setting bandwidth = 0 and window = 0.

kern

Kernel weighting function. Default is the tri-cube. Options include "rect", "tria", "epan", "bisq", "tcub", "trwt", and "gauss".

probit

If TRUE, a probit model is used for the time variable rather than logit. Default: probit = FALSE.

graph

If TRUE, produces a graph showing the density function for time 1 and the counterfactual density. Default: graph=TRUE.

yname

The name to be used for the variable whose density functions are drawn when graph=TRUE. Default: yname = "y".

alldata

If TRUE, the density functions are calculated using each observation in turn as a target value. When alldata=FALSE, densities are calculated at a set of points chosen by the locfit program using an adaptive decision tree approach, and the smooth12 command is used to interpolate to the full set of observations. Default: alldata=FALSE.

data

A data frame with the variables for the logit or probit model specified by lgtform. Note: the data frame for y must be specified even if it is part of data.

Details

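The dfldens command first calculates kernel density estimates for y in the time period with timevar = 1. The density estimate at target point y_1 is f(y_1) = (1/(hn_1)) ∑_i K((y_{1i} - y_1)/h), where h is the bandwidth, n_1 is the number of period 1 observations, and y_{1i} are the period 1 values of y. The following kernel weighting functions are available: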

Kernel         Call abbreviation   Kernel function K(z)
Rectangular    "rect"              1/2 * I(|z| < 1)
Triangular     "tria"              (1-|z|) * I(|z| < 1)
Epanechnikov   "epan"              3/4 * (1-z^2) * I(|z| < 1)
Bi-Square      "bisq"              15/16 * (1-z^2)^2 * I(|z| < 1)
Tri-Cube       "tcub"              70/81 * (1-|z|^3)^3 * I(|z| < 1)
Tri-Weight     "trwt"              35/32 * (1-z^2)^3 * I(|z| < 1)
Gaussian       "gauss"             (2pi)^{-.5} * exp(-z^2/2)
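As a quick check on the kernel formulas, the following base R sketch codes each K(z) and verifies numerically that it integrates to 1. The list name kernels and the vector kernel_mass are illustrative and not part of McSpatial; the Gaussian constant is taken to be (2pi)^(-1/2), the standard normal density.

```r
# Kernel functions K(z), keyed by the abbreviation passed to the kern argument.
# The object names here are illustrative; they are not McSpatial objects.
kernels <- list(
  rect  = function(z) 0.5 * (abs(z) < 1),
  tria  = function(z) (1 - abs(z)) * (abs(z) < 1),
  epan  = function(z) 0.75 * (1 - z^2) * (abs(z) < 1),
  bisq  = function(z) (15/16) * (1 - z^2)^2 * (abs(z) < 1),
  tcub  = function(z) (70/81) * (1 - abs(z)^3)^3 * (abs(z) < 1),
  trwt  = function(z) (35/32) * (1 - z^2)^3 * (abs(z) < 1),
  gauss = function(z) (2 * pi)^(-0.5) * exp(-z^2 / 2)
)

# Integrate each kernel over its support: [-1, 1] for the compact kernels,
# the whole real line for the Gaussian.
kernel_mass <- sapply(names(kernels), function(k) {
  lim <- if (k == "gauss") Inf else 1
  integrate(kernels[[k]], -lim, lim)$value
})
print(round(kernel_mass, 4))
```

Each entry of kernel_mass should be numerically close to 1, which is what makes the resulting estimate a proper density.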

By default, dfldens uses a tri-cube kernel with a fixed bandwidth of h = (.9*(quantile(y1,.75)-quantile(y1,.25))/1.34)*(n1^(-.20)). The results are stored in dtarget1 and dhat1.
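The default bandwidth and the density formula can be reproduced by hand. The sketch below uses simulated data and evaluates the estimator directly at a grid of targets; dfldens itself relies on locfit, so this is only an illustration of the formulas, not the package's implementation. The helper kdens and the simulated y1 are assumptions made for the example.

```r
set.seed(1)
y1 <- rnorm(500, mean = 12, sd = 0.5)   # simulated stand-in for y in period 1
n1 <- length(y1)

# Default bandwidth: h = (.9 * IQR / 1.34) * n1^(-.20)
iqr <- unname(quantile(y1, .75) - quantile(y1, .25))
h <- (.9 * iqr / 1.34) * n1^(-.20)

# Tri-cube kernel, the default kern = "tcub"
tcub <- function(z) (70/81) * (1 - abs(z)^3)^3 * (abs(z) < 1)

# Hypothetical helper: f(t) = (1/(h*n)) * sum_i K((y_i - t)/h) at each target t
kdens <- function(target, y, h) {
  sapply(target, function(t) sum(tcub((y - t) / h)) / (h * length(y)))
}

# The tri-cube kernel has support |z| < 1, so the estimate is zero
# outside [min(y1) - h, max(y1) + h].
target <- seq(min(y1) - h, max(y1) + h, length.out = 200)
dtarget1 <- kdens(target, y1, h)

# Riemann-sum check that the estimate integrates to roughly 1
print(sum(dtarget1) * (target[2] - target[1]))
```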

The counterfactual density is an estimate of the density function for y in time 1 if the explanatory variables listed in lgtform were equal to their time 0 values. DiNardo, Fortin, and Lemieux (1996) show that the following re-weighting of f(y_1) is an estimate of the counterfactual density: (1/(hn_1)) ∑_i τ_i K((y_{1i} - y_1)/h). The weights are given by τ_i = (P(x_i)/(1-P(x_i)))/(p/(1-p)), where p = n_0/(n_0 + n_1) and P(x_i) is the estimated probability that timevar = 0 from the estimated logit or probit regression of timevar on X.
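The re-weighting step can be sketched in base R with glm. The example below simulates two periods in which x shifts upward in period 1, computes the weights from a logit of timevar on x, and applies them to the period 1 kernel sum. All variable names and the simulated data are assumptions for illustration; the sketch follows the formulas above rather than the internals of dfldens.

```r
set.seed(2)
n <- 1000
timevar <- rbinom(n, 1, 0.5)            # 1 = later period, 0 = earlier period
x <- rnorm(n, mean = 0.5 * timevar)     # x is shifted upward in period 1
y <- 1 + x + rnorm(n)

# fitted() gives Pr(timevar = 1 | x), so P(x) = Pr(timevar = 0 | x) = 1 - fitted
fit <- glm(timevar ~ x, family = binomial(link = "logit"))
P <- 1 - fitted(fit)

n0 <- sum(timevar == 0)
n1 <- sum(timevar == 1)
p <- n0 / (n0 + n1)
tau <- (P / (1 - P)) / (p / (1 - p))    # re-weighting factor for each observation

# Only period 1 observations enter the density estimates
in1 <- timevar == 1
y1 <- y[in1]; x1 <- x[in1]; tau1 <- tau[in1]

h <- (.9 * unname(quantile(y1, .75) - quantile(y1, .25)) / 1.34) * n1^(-.20)
tcub <- function(z) (70/81) * (1 - abs(z)^3)^3 * (abs(z) < 1)
target <- seq(min(y1) - h, max(y1) + h, length.out = 200)

# Period 1 density and its re-weighted (counterfactual) version
dtarget1  <- sapply(target, function(t) sum(tcub((y1 - t) / h)) / (h * n1))
dtarget10 <- sapply(target, function(t) sum(tau1 * tcub((y1 - t) / h)) / (h * n1))

# The weights should pull the period 1 mean of x toward its period 0 mean
print(c(mean_x0 = mean(x[!in1]), mean_x1 = mean(x1),
        reweighted_x1 = weighted.mean(x1, tau1)))
```

Because the weights average to roughly 1 over the period 1 sample, dtarget10 integrates to approximately 1 as well; it is the period 1 density with the mass redistributed toward observations whose x looks like period 0.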

If X includes a single variable x, the counterfactual density shows how f(y_1) would change if x = x_0 rather than x_1. Alternatively, X can include multiple variables, in which case the counterfactual density shows how f(y_1) would change if all of the variables in X were equal to their timevar = 0 values.

Value

target

The vector of target values for y for the density functions.

dtarget1

The vector of densities in period 1 at the target values of y.

dtarget10

The counterfactual densities in period 1 at the target values of y.

dhat1

The vector of densities in period 1 at the actual values of y.

dhat10

The counterfactual densities in period 1 at the actual values of y.

References

DiNardo, J., N. Fortin, and T. Lemieux, "Labor Market Institutions and the Distribution of Wages, 1973-1992: A Semi-Parametric Approach," Econometrica 64 (1996), 1001-1044.

Leibbrandt, Murray, James A. Levinsohn, and Justin McCrary, "Incomes in South Africa after the Fall of Apartheid," Journal of Globalization and Development 1 (2010).

See Also

qregsim2

Examples

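library(McSpatial)
data(matchdata)
# Flag sales from 2005, the later period (the time variable for lgtform)
matchdata$year05 <- matchdata$year==2005
# Counterfactual density of lnprice, re-weighting on lnland and lnbldg
fit <- dfldens(matchdata$lnprice, year05~lnland+lnbldg, window=.2,
  yname = "Log of Sale Price", data=matchdata)
# Same exercise with building age as the only explanatory variable
matchdata$age <- matchdata$year - matchdata$yrbuilt
fit <- dfldens(matchdata$lnprice, year05~age, window=.2,
  yname="Log of Sale Price", data=matchdata)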

McSpatial documentation built on May 2, 2019, 9:32 a.m.