Regression Quantiles for panel data (longitudinal data)

Description

Fit a panel data quantile regression model. The model is specified by using an extended formula syntax (implemented with the Formula package) and by easily configured model options (see Details).

Currently, the available models are (i) the penalized fixed-effects (FE) estimation method proposed by Koenker (2004) and (ii) the correlated-random-effects (CRE) method first proposed by Abrevaya and Dahl (2008) and elaborated on by Bache et al. (2011).

The FE estimator minimizes a weighted sum of K ordinary quantile regression objective functions, one for each of K selected values of tau, with user-specified tau-specific weights. The slope coefficients in this objective function are tau dependent, whereas the coefficients corresponding to the fixed effects are assumed to be independent of tau. The vector of fixed-effects coefficients is penalized by an l1 (lasso) penalty term with associated penalty parameter lambda, thereby shrinking these coefficients toward zero.
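The criterion described above can be written out in a few lines of base R. This is only a sketch for evaluating the objective at given coefficient values, not the fitting routine used by the package: the helper names rho and pfe_objective are hypothetical, and the sketch assumes X already contains an intercept column and that alpha holds one coefficient per level of s.

```r
# Check (pinball) loss: rho_tau(u) = u * (tau - 1{u < 0})
rho <- function(u, tau) u * (tau - (u < 0))

# Hypothetical helper: value of the penalized FE criterion.
#   beta  : p x K matrix, column k holds the coefficients for taus[k]
#   alpha : one fixed-effect coefficient per level of s (shared across taus)
pfe_objective <- function(y, X, s, beta, alpha, taus, tauw, lambda) {
  s <- as.factor(s)
  fit_loss <- 0
  for (k in seq_along(taus)) {
    resid <- y - alpha[as.integer(s)] - drop(X %*% beta[, k])
    fit_loss <- fit_loss + tauw[k] * sum(rho(resid, taus[k]))
  }
  # l1 (lasso) penalty on the fixed effects
  fit_loss + lambda * sum(abs(alpha))
}

# Toy data: n = 4 units, m = 3 periods each
set.seed(1)
n <- 4; m <- 3
s <- factor(rep(1:n, each = m))
X <- cbind(1, rnorm(n * m))
y <- rnorm(n * m)
taus <- 1:3 / 4
tauw <- c(.25, .5, .25)
beta <- matrix(0, nrow = 2, ncol = length(taus))
alpha1 <- rep(0.5, n)

obj_l1 <- pfe_objective(y, X, s, beta, alpha1, taus, tauw, lambda = 1)
obj_l2 <- pfe_objective(y, X, s, beta, alpha1, taus, tauw, lambda = 2)
```

Raising lambda from 1 to 2 at fixed coefficients increases the criterion by exactly sum(abs(alpha1)), which is the mechanism that pulls the fixed effects toward zero when the criterion is minimized.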

The CRE estimator does not estimate the fixed effects, but controls for time-invariant dependence between the fixed effects and a set of covariates by linearly including time-invariant CRE transformations of possibly endogenous time-varying variables. The conditional distribution of interest is thus, in some sense, not conditional on the fixed effects.
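As an illustration of a time-invariant transformation of a time-varying variable, the sketch below computes within-unit time means in base R. The data frame and column names are made up for the example, and it only shows the idea of a time-mean regressor; how the package constructs its CRE component internally is not shown here.

```r
# Toy unbalanced panel: unit 1 has three periods, unit 2 has two.
panel_df <- data.frame(
  id = factor(c(1, 1, 1, 2, 2)),
  z  = c(1, 2, 3, 10, 20)
)

# Time mean of z within each unit, repeated on every row of that unit.
# The result is constant within id, so it is time-invariant by construction.
panel_df$z_bar <- ave(panel_df$z, panel_df$id, FUN = mean)
```

Because the mean is taken over whatever periods a unit has, this kind of transformation is well defined for unbalanced panels.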

Usage


Arguments

formula

A method-specific formula for the model employing the conventions of the Formula package, which has the additional operator "|". "rqpd"-formulas are specified as:

FE: y ~ x1 + x2 + ... | s
CRE: y ~ x1 + x2 + ... | s | z1 + z2 + ...

The portion of the formulas before the (first) vertical bar is specified like the conventional lm or rq function formulas.

The factor variable s specifies the structure of the panel, and in the FE model it represents the "fixed effects". This is typically an id column.

The last part of the CRE formula is a specification of the variables in the CRE component. These are possibly endogenous variables (in the sense that they are affected by the fixed effects) and must be time-varying. Note that there are two vertical bars.
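To make the role of the two vertical bars concrete, the following base R sketch pulls the three parts out of a CRE-style formula. The package itself relies on the Formula package for this; the code here only inspects the parsed expression and uses placeholder variable names.

```r
cre_form <- y ~ x1 + x2 | s | z1 + z2

# "|" binds tighter than "~" and is left-associative, so the right-hand
# side parses as ((x1 + x2) | s) | (z1 + z2).
rhs <- cre_form[[3]]

main_part  <- rhs[[2]][[2]]   # covariates: x1 + x2
panel_part <- rhs[[2]][[3]]   # panel structure: s
cre_part   <- rhs[[3]]        # CRE component: z1 + z2

parts <- c(main  = deparse(main_part),
           panel = deparse(panel_part),
           cre   = deparse(cre_part))
print(parts)
```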

panel

This argument specifies the panel model configuration. It is created by the panel function, either beforehand or directly in the call. It is a named list of options, some of which apply to both FE and CRE models, whereas others are method specific. See Details for available options.

data

A data.frame containing the variables as specified in the formula.

na.action

A function to filter missing data. This is applied to the model.frame after any subset argument has been used. The default ('na.fail') is to raise an error if any missing values are found. A possible alternative is 'na.omit', which deletes observations that contain one or more missing values.

subset

An optional vector specifying a subset of observations to be used in the fitting process.

contrasts

A list giving contrasts for some or all of the factors appearing in the model formula (default is 'NULL'). The elements of the list should have the same name as the variable and should be either a contrast matrix (specifically, any full-rank matrix with as many rows as there are levels in the factor), or else a function to compute such a matrix given the number of levels.

control

Control argument for the fitting routines (see 'sfn.control').

...

Other arguments passed to fitting routines.

Details

For details on the FE model, see Koenker (2004). A bare bones version of this code is available from http://www.econ.uiuc.edu/~roger/research/panel/long.html.

The CRE models are all summarized in the paper by Bache et al. (2011).

The panel argument is created with the panel function, e.g.: panel(taus=c(0.1, 0.25, 0.5, 0.75, 0.9), tauw=rep(1/5, 5)). Options not specified will get the default values. It is not recommended to manually specify the list, as the panel function does some argument validation.

The available options are:

method (FE, CRE): Method specification. "pfe" for fixed-effects
                  estimation, "cre" for correlated-random-effects
                  estimation. Default is "pfe".
taus (FE, CRE):   A vector of quantile indices in (0,1).
                  Default is 1:3/4.
tauw (FE):        A vector of weights (summing to 1) for the K
                  weighted components in the FE criterion function.
                  Default is c(.25, .5, .25).
lambda (FE):      The penalty parameter controlling the shrinkage
                  of the fixed effects toward zero. Default is 1.
cre (CRE):        When method="cre", this specifies the nature of
                  the CRE component. For time-means use "m" or
                  "crem"; for a specification like that in Abrevaya
                  and Dahl (2008) use "ad". Default is "m", which
                  allows for an unbalanced panel; "ad" does not.
ztol (FE):        A small number used to determine when numerically
                  small numbers should be considered to be zero.
                  Default is 1e-5.
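The constraints stated in the option list (quantile indices strictly inside (0,1), one weight per quantile, weights summing to 1) can be checked with a few lines of base R. The function check_panel_opts below is a hypothetical stand-in written for illustration; it is not the validation performed by the panel function, whose exact checks are not documented here.

```r
# Hypothetical checker for the documented constraints on taus and tauw.
# Defaults mirror the documented defaults of the corresponding options.
check_panel_opts <- function(taus = 1:3 / 4, tauw = c(.25, .5, .25)) {
  stopifnot(
    is.numeric(taus), all(taus > 0 & taus < 1),  # indices in (0,1)
    length(tauw) == length(taus),                # one weight per tau
    isTRUE(all.equal(sum(tauw), 1))              # weights sum to 1
  )
  invisible(TRUE)
}

ok_default <- check_panel_opts()
ok_custom  <- check_panel_opts(taus = c(0.1, 0.25, 0.5, 0.75, 0.9),
                               tauw = rep(1/5, 5))

# A quantile index outside (0,1) is rejected.
bad <- tryCatch(check_panel_opts(taus = c(0.5, 1.2), tauw = c(.5, .5)),
                error = function(e) "rejected")
```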

Value

The function returns a fitted object representing the estimated model specified in the formula and by the panel argument. See 'rqpd.object' for further details on this object, and references to methods to look at it.

Author(s)

Roger Koenker and Stefan Bache

Maintainer: Stefan Bache <rqpd@stefanbache.dk>

References

[1] Abrevaya, Jason and Christian M. Dahl. 2008. The effects of birth inputs on birthweight. Journal of Business and Economic Statistics. 26-4. Pages 379–397.

[2] Bache, Stefan Holst; Christian M. Dahl; Johannes Tang Kristensen. 2011. Headlights on tobacco road to low birthweight–Evidence from a battery of quantile regression estimators and a heterogeneous panel.

[3] Koenker, Roger. 2004. Quantile regression for longitudinal data. Journal of Multivariate Analysis. 91-1. Pages 74–89.

See Also

summary.rqpd, rqpd.object.

Examples

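library(rqpd)

# A penalized FE model on simulated data:
# n = 10 units, each observed over m = 3 periods
set.seed(10)
m <- 3
n <- 10
s <- as.factor(rep(1:n,rep(m,n)))      # unit id (the panel factor)
x <- exp(rnorm(n*m))
u <- x*rnorm(m*n) + (1-x)*rf(m*n,3,3)  # errors whose distribution depends on x
a <- rep(rnorm(n),rep(m,n))            # unit-specific effects
y <- a + u
fit <- rqpd(y ~ x | s, panel(lambda = 5))
sfit <- summary(fit)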

# A CRE model
data(bwd)

cre.form <- dbirwt ~ smoke + dmage + agesq + 
   novisit + pretri2 + pretri3 | momid3 | smoke + 
   dmage + agesq 

# CRE-M type fit:
crem.fit <- rqpd(cre.form, panel(method="cre"), data=bwd)

# AD type fit:
ad.fit <- rqpd(cre.form, panel(method="cre", cre="ad"), data=bwd,
  subset=bwd$idx %in% 1:2)