qregsim2: Machado-Mata Decomposition of Changes in Distributions

Description

Decomposes quantile regression estimates of changes in the distribution of a dependent variable into the components associated with changes in the distribution of the explanatory variables and the coefficient estimates.

Usage

qregsim2(formall, formx, dataframe1, dataframe2, bmat1, bmat2,  
  graphx=TRUE, graphb=TRUE, graphy=TRUE, graphdy=TRUE, nbarplot=10, 
  yname=NULL, xnames=NULL, timenames=c("1","2"),
  leglocx="topright",leglocy="topright",leglocdy="topright",
  nsim=20000, bwadjx=1,bwadjy=1,bwadjdy=1)  
 

Arguments

formall

Model formula. Must match the model formula used for qregbmat.

formx

Model formula for the variables used for the decompositions, e.g., formx=~x1+x2. The remaining variables and their coefficients are held at their time 2 values for the simulations.

dataframe1

The data frame for regime 1. Should include all the variables listed in formall.

dataframe2

The data frame for regime 2. Should include all the variables listed in formall.

bmat1

Matrix of quantile coefficient estimates for regime 1; the output from running qregbmat using dataframe1.

bmat2

Matrix of quantile coefficient estimates for regime 2; the output from running qregbmat using dataframe2.

graphx

If graphx=T, presents kernel density estimates of each of the explanatory variables in formx.

graphb

If graphb=T, presents graphs of the quantile coefficient estimates for the variables in formx.

graphy

If graphy=T, presents kernel density estimates of the predicted values of y for time 1, time 2, and the counterfactual.

graphdy

If graphdy=T, presents graphs of the changes in densities.

nbarplot

Specifies the maximum number of values taken by an explanatory variable before bar plots are replaced by smooth kernel density functions. Only relevant when graphx = T.

yname

A label used for the dependent variable in the density graphs, e.g., yname = "Log of Sale Price".

xnames

Labels for graphs involving the explanatory variables, e.g., xnames = "x1" for one explanatory variable, or xnames = c("x1","x2") for two variables.

timenames

A vector with labels for the two regimes. Must be entered as a vector with character values. Default: c("1","2").

leglocx

Legend location for density plots of the explanatory variables, e.g., leglocx = "topright" for one explanatory variable, or leglocx = c("topright","topleft") for two variables.

leglocy

Legend location for density plots of predicted values of the dependent variable. Default: leglocy = "topright".

leglocdy

Legend location for plot of density changes. Default: leglocdy = "topright".

nsim

Number of simulations for the decompositions.

bwadjx

Factor used to adjust bandwidths for kernel density plots of the explanatory variables. Smoother functions are produced when bwadjx>1. Passed directly to the density function's adjust option. Default: bwadjx=1.

bwadjy

Factor used to adjust bandwidths for kernel density plots of the predicted values of the dependent variable. Default: bwadjy=1.

bwadjdy

Factor used to adjust bandwidths for plots of the kernel density changes.
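
The bandwidth factors above are described as being passed to the adjust option of density. As a minimal base-R sketch of what that option does (the simulated vector x is purely illustrative and is not part of McSpatial):

```r
# Illustrative only: how density()'s adjust argument scales the default bandwidth.
set.seed(1)
x <- rnorm(500)                  # hypothetical explanatory variable

d1 <- density(x)                 # default bandwidth, adjust = 1
d2 <- density(x, adjust = 2)     # bandwidth multiplied by 2, smoother curve

# The bandwidth actually used is stored in the bw component.
c(default = d1$bw, adjusted = d2$bw)
```

A larger factor widens the kernel, so the estimated curve becomes smoother at the cost of flattening local detail.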

Details

The base models are y_1 = X_1β_1 + Z_1γ_1 for regime 1 and y_2 = X_2β_2 + Z_2γ_2 for regime 2. The counterfactual model is y_{12} = X_1β_2 + Z_2γ_2. The full list of variables (both X and Z) is provided by formall; this list must correspond exactly with the list provided to qregbmat. The subset of variables that is the subject of the decompositions is listed in formx.
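
To make the counterfactual concrete, the following base-R sketch builds the three linear predictors using a single coefficient vector per regime. Every object here (X1, Z1, b1, g1, i1, and so on) is a made-up illustration; qregsim2 itself works with a full matrix of quantile coefficients rather than one vector per regime.

```r
# Illustrative sketch of the base and counterfactual predictors (not McSpatial code).
set.seed(2)
n1 <- 300; n2 <- 400; nsim <- 1000

# X: variables being decomposed; Z: the remaining variables (intercept and z).
X1 <- cbind(x = rnorm(n1, 0, 1))
X2 <- cbind(x = rnorm(n2, 1, 2))
Z1 <- cbind("(Intercept)" = 1, z = runif(n1))
Z2 <- cbind("(Intercept)" = 1, z = runif(n2))

b1 <- 1.0; g1 <- c(0.5, 2.0)     # hypothetical regime 1 coefficients
b2 <- 1.5; g2 <- c(0.2, 2.5)     # hypothetical regime 2 coefficients

y1 <- drop(X1 %*% b1 + Z1 %*% g1)    # regime 1: X_1 b_1 + Z_1 g_1
y2 <- drop(X2 %*% b2 + Z2 %*% g2)    # regime 2: X_2 b_2 + Z_2 g_2

# The counterfactual mixes rows from both regimes, so rows are resampled
# to a common length before combining.
i1 <- sample(n1, nsim, replace = TRUE)
i2 <- sample(n2, nsim, replace = TRUE)
y12 <- drop(X1[i1, , drop = FALSE] %*% b2 + Z2[i2, , drop = FALSE] %*% g2)
```

Resampling is needed because the two regimes may have different numbers of observations, so X_1 and Z_2 cannot be combined row by row without it.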

The matrices bmat1 and bmat2 are intended to represent the output from qregbmat. The models must include the same set of explanatory variables, and the variables must be in the same order in both bmat1 and bmat2. In contrast, the data frames dataframe1 and dataframe2 can have different numbers of observations and different sets of explanatory variables, as long as they include the dependent variable and the variables listed in bmat1 and bmat2.

The output from qregsim2 is a series of graphs. If all options are specified, the graphs appear in the following order:

1. Kernel density estimates for each variable listed in formx. Estimated using density with default bandwidths and the specified value for bwadjx. Not shown if graphx=F. The xnames and leglocx options can be used to vary the names used to label the x-axis and the legend location.

2. Quantile coefficient estimates for the variables listed in formx. Not shown if graphb=F.

3. Kernel density estimates for the predicted values of X_1β_1 + Z_1γ_1 and X_2β_2 + Z_2γ_2, and the counterfactual, X_1β_2 + Z_2γ_2. Estimated using density with default bandwidths and the specified value for bwadjy. Not shown if graphy=F. The label for the x-axis can be varied with the yname option. The three estimated density functions are returned after estimation as yhat11, yhat22, and yhat12.

4. A graph showing the change in densities, d2211 = f22 - f11, along with the Machado-Mata decomposition showing:

(a) the change in densities due to the variables listed in formx: d2212 = f22 - f12.

(b) the change in densities due to the coefficients: d1211 = f12 - f11.

These estimates are returned after estimation as d2211, d2212, and d1211. The density changes are not shown if graphdy=F. The label for the x-axis can be varied with the yname option. The bandwidth for the original density functions f11, f22, and f12 can be varied with bwadjdy. It is generally desirable to set bwadjdy > bwadjy because additional smoothing is needed to make the change in densities appear smooth.
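
The two components in item 4 add up to the total change, since (f22 - f12) + (f12 - f11) = f22 - f11. A small base-R sketch of this identity follows; the three samples are arbitrary stand-ins for the simulated predictions, and the common bandwidth of 0.25 is an assumption chosen only for illustration.

```r
# Illustrative sketch: the change in densities splits exactly into two parts
# when all three densities share one grid and one bandwidth.
set.seed(3)
y11 <- rnorm(2000, 0.0, 1.0)     # stand-in for simulated X_1 b_1 + Z_1 g_1
y22 <- rnorm(2000, 0.5, 1.5)     # stand-in for simulated X_2 b_2 + Z_2 g_2
y12 <- rnorm(2000, 0.3, 1.0)     # stand-in for the counterfactual

rng <- range(c(y11, y22, y12))
bw  <- 0.25                      # hypothetical common bandwidth

dens <- function(y) density(y, bw = bw, from = rng[1], to = rng[2], n = 512)$y
f11 <- dens(y11); f22 <- dens(y22); f12 <- dens(y12)

d2211 <- f22 - f11               # total change in densities
d2212 <- f22 - f12               # part attributed to the formx variables
d1211 <- f12 - f11               # part attributed to the coefficients

max(abs(d2211 - (d2212 + d1211)))    # zero up to floating-point error
```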

The distributions are simulated by drawing nsim indices with replacement from the observation numbers 1:n1 and 1:n2 and from the quantile indices 1:ntau, where ntau = length(taumat). The commands for the simulations are:

xobs1 <- sample(seq(1:n1),nsim,replace=TRUE)
xobs2 <- sample(seq(1:n2),nsim,replace=TRUE)
bobs <- sample(seq(1:ntau),nsim,replace=TRUE)
xhat1 <- allmat1[xobs1,]
xhat2 <- allmat2[xobs2,]
znames <- setdiff(colnames(allmat1),colnames(xmat1))
if (identical(znames,"(Intercept)")) xhat12 <- xhat1
if (!identical(znames,"(Intercept)"))
  xhat12 <- cbind(xhat2[,znames],xhat1[,colnames(xmat1)])
xhat12 <- xhat12[,colnames(allmat1)]
bhat1 <- bmat1[bobs,]
bhat2 <- bmat2[bobs,]

where allmat and xmat denote the matrices defined by the explanatory variables listed in formall (including the intercept) and formx, respectively. Since the bandwidths are simply the defaults from the density function, they are likely to differ across regimes because the number of observations and the standard deviations may vary. Thus, the densities are re-estimated using the average across regimes of the original bandwidths.
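
The bandwidth-averaging step can be sketched in base R as follows. This is only an illustration of the idea, not the package's internal code; yhat1, yhat2, bwbar, and the grid size of 200 are assumed names and values.

```r
# Illustrative sketch: re-estimate two densities with a common, averaged bandwidth.
set.seed(4)
yhat1 <- rnorm(1000, 0, 1)       # stand-in for regime 1 predictions
yhat2 <- rnorm(4000, 0, 2)       # stand-in for regime 2 (more observations, more spread)

# Default bandwidths differ because sample size and dispersion differ.
bw1 <- density(yhat1)$bw
bw2 <- density(yhat2)$bw
bwbar <- mean(c(bw1, bw2))

# Re-estimate both densities on a shared grid using the averaged bandwidth.
lo <- min(yhat1, yhat2); hi <- max(yhat1, yhat2)
f1 <- density(yhat1, bw = bwbar, from = lo, to = hi, n = 200)
f2 <- density(yhat2, bw = bwbar, from = lo, to = hi, n = 200)

c(bw1 = bw1, bw2 = bw2, averaged = bwbar)
```

With a shared grid and bandwidth, f1$y and f2$y can be subtracted point by point, which is what a plot of density changes requires.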

Value

ytarget

The values for the x-axis for the density functions.

yhat11

The kernel density function for X_1 β_1 + Z_1γ_1.

yhat22

The kernel density function for X_2 β_2 + Z_2γ_2.

yhat12

The kernel density function for X_1 β_2 + Z_2γ_2.

d2211

The difference between the density functions for X_2 β_2 + Z_2γ_2 and X_1 β_1 + Z_1γ_1. Will differ from yhat22 - yhat11 if bwadjy and bwadjdy are different.

d2212

The difference between the density functions for X_2 β_2 + Z_2γ_2 and X_1 β_2 + Z_2γ_2. Will differ from yhat22 - yhat12 if bwadjy and bwadjdy are different.

d1211

The difference between the density functions for X_1 β_2 + Z_2γ_2 and X_1 β_1 + Z_1γ_1. Will differ from yhat12 - yhat11 if bwadjy and bwadjdy are different.
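
As a hedged sketch of how the returned values might be used, the code below builds a stand-in list with the component names documented above and inspects it. The object fit is constructed by hand here rather than by calling qregsim2, and the grid, bandwidth, and samples are arbitrary assumptions.

```r
# Stand-in for a qregsim2 result: a list using the documented component names.
set.seed(5)
ytarget <- seq(-8, 8, length.out = 400)
kd <- function(y) density(y, bw = 0.3, from = -8, to = 8, n = 400)$y
f11 <- kd(rnorm(3000, 0.0, 1.0))
f22 <- kd(rnorm(3000, 0.5, 1.5))
f12 <- kd(rnorm(3000, 0.5, 1.0))
fit <- list(ytarget = ytarget,
            d2211 = f22 - f11, d2212 = f22 - f12, d1211 = f12 - f11)

# A difference of two densities should integrate to roughly zero.
step <- diff(fit$ytarget)[1]
total <- sum(fit$d2211) * step
total

# Locations where the density rose and fell the most between regimes.
fit$ytarget[which.max(fit$d2211)]
fit$ytarget[which.min(fit$d2211)]
```

The same vectors could be drawn with plot(fit$ytarget, fit$d2211, type = "l") and lines() to reproduce a custom version of the density-change graph.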

References

Koenker, Roger. Quantile Regression. New York: Cambridge University Press, 2005.

Machado, J.A.F. and Mata, J., "Counterfactual Decomposition of Changes in Wage Distributions using Quantile Regression," Journal of Applied Econometrics 20 (2005), 445-465.

McMillen, Daniel P., "Changes in the Distribution of House Prices over Time: Structural Characteristics, Neighborhood or Coefficients?" Journal of Urban Economics 64 (2008), 573-589.

See Also

dfldens

qregbmat

qregsim1

qregcpar

qreglwr

Examples

library(McSpatial)
par(ask=TRUE)

n <- 5000
set.seed(484913)
x1 <- rnorm(n,0,1)
u1 <- rnorm(n,0,.5)
y1 <- x1 + u1

# no change in x.  Coefficients show quantile effects
tau <- runif(n,0,.5)
x2 <- x1
y2 <- (1 + (tau-.5))*x2 + .5*qnorm(tau)

dat <- data.frame(rbind(cbind(y1,x1,1), cbind(y2,x2,2)))
names(dat) <- c("y","x","year")
bmat1 <- qregbmat(y~x,data=dat[dat$year==1,],graphb=FALSE)
bmat2 <- qregbmat(y~x,data=dat[dat$year==2,],graphb=FALSE)
fit1 <- qregsim2(y~x,~x,dat[dat$year==1,],dat[dat$year==2,],
  bmat1,bmat2,bwadjdy=2)

# Distribution of x changes.  Coefficients and u stay the same
x2 <- rnorm(n,0,2)
y2 <- x2 + u1
dat <- data.frame(rbind(cbind(y1,x1,1), cbind(y2,x2,2)))
names(dat) <- c("y","x","year")
bmat1 <- qregbmat(y~x,data=dat[dat$year==1,],graphb=FALSE)
bmat2 <- qregbmat(y~x,data=dat[dat$year==2,],graphb=FALSE)
fit1 <- qregsim2(y~x,~x,dat[dat$year==1,],dat[dat$year==2,],
  bmat1,bmat2,bwadjdy=2)

McSpatial documentation built on May 2, 2019, 9:32 a.m.