qregsim2: Machado-Mata Decomposition of Changes in Distributions
In McSpatial: Nonparametric spatial data analysis


Decomposes quantile regression estimates of changes in the distribution of a dependent variable into the components associated with changes in the distribution of the explanatory variables and the coefficient estimates.

 
qregsim2(formall, formx, dataframe1, dataframe2, bmat1, bmat2,  
  graphx=TRUE, graphb=TRUE, graphy=TRUE, graphdy=TRUE, nbarplot=10, 
  yname=NULL, xnames=NULL, timenames=c("1","2"),
  leglocx="topright",leglocy="topright",leglocdy="topright",
  nsim=20000, bwadjx=1,bwadjy=1,bwadjdy=1)

`formall`	Model formula. Must match the model formula used for qregbmat.
`formx`	Model formula for the variables used for the decompositions, e.g., formx=~x1+x2. The coefficients and variables for the other variables are held at their time 2 values for the simulations.
`dataframe1`	The data frame for regime 1. Should include all the variables listed in formall.
`dataframe2`	The data frame for regime 2. Should include all the variables listed in formall.
`bmat1`	Matrix of values for regime 1 quantile coefficient matrices; the output from running qregbmat using dataframe1.
`bmat2`	Matrix of values for regime 2 quantile coefficient matrices; the output from running qregbmat using dataframe2.
`graphx`	If graphx=T, presents kernel density estimates of each of the explanatory variables in formx.
`graphb`	If graphb=T, presents graphs of the quantile coefficient estimates for the variables in formx.
`graphy`	If graphy=T, presents kernel density estimates of the predicted values of y for time 1, time 2, and the counterfactual.
`graphdy`	If graphdy=T, presents graphs of the changes in densities.
`nbarplot`	Specifies the maximum number of values taken by an explanatory variable before bar plots are replaced by smooth kernel density functions. Only relevant when graphx = T.
`yname`	A label used for the dependent variable in the density graphs, e.g., yname = "Log of Sale Price".
`xnames`	Labels for graphs involving the explanatory variables, e.g., xnames = "x1" for one explanatory variable, or xnames = c("x1","x2") for two variables.
`timenames`	A vector with labels for the two regimes. Must be entered as a vector with character values. Default: c("1","2").
`leglocx`	Legend location for density plots of the explanatory variables, e.g., leglocx = "topright" for one explanatory variable, or leglocx = c("topright","topleft") for two variables.
`leglocy`	Legend location for density plots of predicted values of the dependent variable. Default: leglocy = "topright".
`leglocdy`	Legend location for plot of density changes. Default: leglocdy = "topright".
`nsim`	Number of simulations for the decompositions.
`bwadjx`	Factor used to adjust bandwidths for kernel density plots of the explanatory variables. Smoother functions are produced when bwadjx>1. Passed directly to the density function's adjust option. Default: bwadjx=1.
`bwadjy`	Factor used to adjust bandwidths for kernel density plots of the predicted values of the dependent variable.
`bwadjdy`	Factor used to adjust bandwidths for plots of the kernel density changes.
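
The bandwidth factors are described above as being passed to the adjust option of density. As a minimal base R sketch of what that option does (the simulated vector x is an illustration only and is not part of the package):

```r
set.seed(1)
x <- rnorm(500)  # illustrative data, not from McSpatial

d_default <- density(x)              # adjust = 1, the default
d_smooth  <- density(x, adjust = 2)  # doubles the default bandwidth

# The bandwidth actually used is the default bandwidth times adjust
c(default = d_default$bw, smooth = d_smooth$bw)
```

Values above 1 therefore widen the kernel and smooth the curve, while values below 1 narrow it.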

The base models are y_1 = X_1β_1 + Z_1γ_1 for regime 1 and y_2 = X_2β_2 + Z_2γ_2 for regime 2. The counterfactual model is y_{12} = X_1β_2 + Z_2γ_2. The full list of variables (both X and Z) is provided by formall; this list must correspond exactly with the list provided to qregbmat. The subset of variables that are the subject of the decompositions is listed in formx.

The matrices bmat1 and bmat2 are intended to represent the output from qregbmat. The models must include the same set of explanatory variables, and the variables must be in the same order in both bmat1 and bmat2. In contrast, the data frames dataframe1 and dataframe2 can have different numbers of observations and different sets of explanatory variables, as long as they include the dependent variable and the variables listed in bmat1 and bmat2.
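
The requirements above could be checked before calling qregsim2 with something like the following sketch. The helper check_inputs is hypothetical and not part of McSpatial, and it assumes the coefficient matrices carry column names for their terms; the toy inputs are made up for illustration.

```r
# Hypothetical helper (not in McSpatial): TRUE when the two coefficient
# matrices have the same columns in the same order and both data frames
# contain every variable named in the model formula.
check_inputs <- function(formall, dataframe1, dataframe2, bmat1, bmat2) {
  same_coefs <- identical(colnames(bmat1), colnames(bmat2))
  vars <- all.vars(formall)
  has_vars <- all(vars %in% names(dataframe1)) &&
    all(vars %in% names(dataframe2))
  same_coefs && has_vars
}

# Toy inputs: the data frames differ in size and in their extra columns
b1 <- matrix(0, 3, 2, dimnames = list(NULL, c("(Intercept)", "x")))
b2 <- b1
d1 <- data.frame(y = 1:3, x = 1:3, extra = 1:3)
d2 <- data.frame(y = 1:5, x = 1:5)

ok  <- check_inputs(y ~ x, d1, d2, b1, b2)           # columns agree
bad <- check_inputs(y ~ x, d1, d2, b1, b2[, 2:1])    # column order swapped
```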

The output from qregsim2 is a series of graphs. If all options are specified, the graphs appear in the following order:

1. Kernel density estimates for each variable listed in formx. Estimated using density with default bandwidths and the specified value for bwadjx. Not shown if graphx=F. The xnames and leglocx options can be used to vary the names used to label the x-axis and the legend location.

2. Quantile coefficient estimates for the variables listed in formx. Not shown if graphb=F.

3. Kernel density estimates for the predicted values of X_1β_1 + Z_1γ_1 and X_2β_2 + Z_2γ_2, and the counterfactual, X_1β_2 + Z_2γ_2. Estimated using density with default bandwidths and the specified value for bwadjy. Not shown if graphy=F. The label for the x-axis can be varied with the yname option. The three estimated density functions are returned after estimation as yhat11, yhat22, and yhat12.

4. A graph showing the change in densities, d2211 = f22 - f11, along with the Machado-Mata decomposition showing:

(a) the change in densities due to the variables listed in formx: d2212 = f22 - f12.

(b) the change in densities due to the coefficients: d1211 = f12 - f11.

These estimates are returned after estimation as d2211, d2212, and d1211. The density changes are not shown if graphdy=F. The label for the x-axis can be varied with the yname option. The bandwidth for the original density functions f11, f22, and f12 can be varied with bwadjdy. It is generally desirable to set bwadjdy > bwadjy because additional smoothing is needed to make the change in densities appear smooth.
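
The two components add up to the total change, since (f22 - f12) + (f12 - f11) = f22 - f11 whenever the three densities are evaluated on the same grid with the same bandwidth. A minimal base R sketch of that identity follows; the three samples, the bandwidth of 0.3, and the grid are stand-ins chosen for illustration, not values produced by qregsim2.

```r
set.seed(2)
# Stand-ins for simulated predictions under each scenario (made-up data)
y11 <- rnorm(2000, 0, 1)      # regime 1
y22 <- rnorm(2000, 0.5, 1.5)  # regime 2
y12 <- rnorm(2000, 0.5, 1)    # counterfactual

# A shared bandwidth and grid let the densities be subtracted pointwise
bw  <- 0.3
rng <- range(y11, y22, y12)
dens <- function(y) density(y, bw = bw, from = rng[1], to = rng[2], n = 512)$y
f11 <- dens(y11)
f22 <- dens(y22)
f12 <- dens(y12)

d2211 <- f22 - f11  # total change
d2212 <- f22 - f12  # part due to the variables in formx
d1211 <- f12 - f11  # part due to the coefficients

# Largest pointwise gap between the total and the sum of its parts
max(abs(d2211 - (d2212 + d1211)))
```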

The distributions are simulated by drawing nsim samples with replacement from the observation indices 1:n1 and 1:n2 and from the quantile indices 1:ntau, where ntau = length(taumat). The commands for the simulations are:

xobs1 <- sample(seq(1:n1),nsim,replace=TRUE)
xobs2 <- sample(seq(1:n2),nsim,replace=TRUE)
bobs <- sample(seq(1:ntau),nsim,replace=TRUE)
xhat1 <- allmat1[xobs1,]
xhat2 <- allmat2[xobs2,]
znames <- setdiff(colnames(allmat1),colnames(xmat1))
if (identical(znames,"(Intercept)")) xhat12 <- xhat1
if (!identical(znames,"(Intercept)"))
  xhat12 <- cbind(xhat2[,znames],xhat1[,colnames(xmat1)])
xhat12 <- xhat12[,colnames(allmat1)]
bhat1 <- bmat1[bobs,]
bhat2 <- bmat2[bobs,]

where allmat and xmat denote the matrices defined by the explanatory variables listed in formall (including the intercept) and formx. Since the bandwidths are simply the defaults from the density function, they are likely to differ across regimes because the number of observations and the standard deviations may vary over time. Thus, the densities are re-estimated using the average across regimes of the original bandwidths.
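
The resampling and bandwidth-averaging steps can be tried out in base R without the package. The sketch below uses made-up design and coefficient matrices with an intercept and a single variable x. The row-wise products used to form the predictions are an assumption about how the draws are combined, since that step is not shown in the commands above.

```r
set.seed(3)
n1 <- 300; n2 <- 400; nsim <- 5000
taumat <- seq(0.1, 0.9, 0.1)
ntau <- length(taumat)

# Made-up design matrices: intercept plus one variable x
allmat1 <- cbind("(Intercept)" = 1, x = rnorm(n1, 0, 1))
allmat2 <- cbind("(Intercept)" = 1, x = rnorm(n2, 0, 2))

# Made-up coefficient matrices: one row per quantile, one column per term
bmat1 <- cbind("(Intercept)" = 0.5 * qnorm(taumat), x = 1)
bmat2 <- cbind("(Intercept)" = 0.5 * qnorm(taumat), x = 1 + (taumat - 0.5))

# Draw observation and quantile indices with replacement
xobs1 <- sample(seq_len(n1), nsim, replace = TRUE)
xobs2 <- sample(seq_len(n2), nsim, replace = TRUE)
bobs  <- sample(seq_len(ntau), nsim, replace = TRUE)

xhat1  <- allmat1[xobs1, ]
xhat2  <- allmat2[xobs2, ]
xhat12 <- xhat1  # only the intercept lies outside formx in this toy case
bhat1  <- bmat1[bobs, ]
bhat2  <- bmat2[bobs, ]

# Assumed combination step: pair each drawn row of X with a drawn quantile
y11 <- rowSums(xhat1 * bhat1)
y22 <- rowSums(xhat2 * bhat2)
y12 <- rowSums(xhat12 * bhat2)

# Average the default bandwidths across regimes, then re-estimate
bw  <- mean(c(density(y11)$bw, density(y22)$bw))
rng <- range(y11, y22, y12)
f11 <- density(y11, bw = bw, from = rng[1], to = rng[2])
f22 <- density(y22, bw = bw, from = rng[1], to = rng[2])
f12 <- density(y12, bw = bw, from = rng[1], to = rng[2])
```

Using one bandwidth and one grid for all three estimates is what makes pointwise differences such as f22$y - f11$y meaningful.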

`ytarget`	The values for the x-axis for the density functions.
`yhat11`	The kernel density function for X_1 β_1 + Z_1γ_1.
`yhat22`	The kernel density function for X_2 β_2 + Z_2γ_2.
`yhat12`	The kernel density function for X_1 β_2 + Z_2γ_2.
`d2211`	The difference between the density functions for X_2 β_2 + Z_2γ_2 and X_1 β_1 + Z_1γ_1. Will differ from yhat22 - yhat11 if bwadjy and bwadjdy are different.
`d2212`	The difference between the density functions for X_2 β_2 + Z_2γ_2 and X_1 β_2 + Z_2γ_2. Will differ from yhat22 - yhat12 if bwadjy and bwadjdy are different.
`d1211`	The difference between the density functions for X_1 β_2 + Z_2γ_2 and X_1 β_1 + Z_1γ_1. Will differ from yhat12 - yhat11 if bwadjy and bwadjdy are different.
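
As a hedged illustration of working with these values, the sketch below tabulates the density changes against ytarget. The object fit is a mock list built from normal densities so that the code runs without the package; it assumes the returned components are numeric vectors of equal length, and the column names in decomp are chosen here for illustration.

```r
# Mock object standing in for the list returned by qregsim2 (made-up values)
ytarget <- seq(-3, 3, length.out = 7)
f11 <- dnorm(ytarget, 0, 1)
f22 <- dnorm(ytarget, 0, 2)
f12 <- dnorm(ytarget, 0, 1.5)
fit <- list(ytarget = ytarget,
            d2211 = f22 - f11, d2212 = f22 - f12, d1211 = f12 - f11)

# One row per grid point: total change and its two components
decomp <- data.frame(y = fit$ytarget,
                     total = fit$d2211,
                     due_to_x = fit$d2212,
                     due_to_coef = fit$d1211)

# Grid point where the density changed the most in absolute terms
decomp[which.max(abs(decomp$total)), ]
```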

Koenker, Roger. Quantile Regression. New York: Cambridge University Press, 2005.

Machado, J.A.F. and Mata, J., "Counterfactual Decomposition of Changes in Wage Distributions using Quantile Regression," Journal of Applied Econometrics 20 (2005), 445-465.

McMillen, Daniel P., "Changes in the Distribution of House Prices over Time: Structural Characteristics, Neighborhood or Coefficients?" Journal of Urban Economics 64 (2008), 573-589.

dfldens

qregbmat

qregsim1

qregcpar

qreglwr

library(McSpatial)
par(ask=TRUE)

n = 5000
set.seed(484913)
x1 <- rnorm(n,0,1)
u1 <- rnorm(n,0,.5)
y1 <- x1 + u1

# no change in x.  Coefficients show quantile effects
tau <- runif(n,0,.5)
x2 <- x1
y2 <- (1 + (tau-.5))*x2 + .5*qnorm(tau)

dat <- data.frame(rbind(cbind(y1,x1,1), cbind(y2,x2,2)))
names(dat) <- c("y","x","year")
bmat1 <- qregbmat(y~x,data=dat[dat$year==1,],graphb=FALSE)
bmat2 <- qregbmat(y~x,data=dat[dat$year==2,],graphb=FALSE)
fit1 <- qregsim2(y~x,~x,dat[dat$year==1,],dat[dat$year==2,],
  bmat1,bmat2,bwadjdy=2)

# Distribution of x changes.  Coefficients and u stay the same
x2 <- rnorm(n,0,2)
y2 <- x2 + u1
dat <- data.frame(rbind(cbind(y1,x1,1), cbind(y2,x2,2)))
names(dat) <- c("y","x","year")
bmat1 <- qregbmat(y~x,data=dat[dat$year==1,],graphb=FALSE)
bmat2 <- qregbmat(y~x,data=dat[dat$year==2,],graphb=FALSE)
fit1 <- qregsim2(y~x,~x,dat[dat$year==1,],dat[dat$year==2,],
  bmat1,bmat2,bwadjdy=2)