hdfeppml: PPML Estimation with HDFE

View source: R/wrappers.R

hdfeppml    R Documentation

PPML Estimation with HDFE

Description

hdfeppml fits an (unpenalized) Poisson Pseudo Maximum Likelihood (PPML) model with high-dimensional fixed effects (HDFE).

Usage

hdfeppml(
  data,
  dep = 1,
  indep = NULL,
  fixed = NULL,
  cluster = NULL,
  selectobs = NULL,
  ...
)

Arguments

data

A data frame containing all relevant variables.

dep

A string with the name of the dependent variable, or its column number.

indep

A vector with the names or column numbers of the regressors. If left unspecified, all remaining variables (excluding fixed effects) are included in the regressor matrix.

fixed

A vector with the names or column numbers of factor variables identifying the fixed effects, or a list with the desired interactions between variables in data.

cluster

Optional. A string with the name of the clustering variable or a column number. It's also possible to input a vector with several variables, in which case the interaction of all of them is taken as the clustering variable.

selectobs

Optional. A vector indicating which observations to use (either a logical vector or a numeric vector with row numbers, as usual when subsetting in R).

...

Further options. For a full list, see hdfeppml_int.
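
The argument forms described above can be illustrated in base R. This is only a sketch on a toy data frame (all names and values are hypothetical), and it does not call hdfeppml; in particular, using interaction() to combine variables is an assumption about the concept, not a statement about how the package builds interactions internally.

```r
# Toy data frame; all names and values are illustrative only.
toy <- data.frame(
  export = c(0, 3, 1, 7, 2, 0, 5, 4),
  dist   = c(1.2, 1.2, 2.2, 2.2, 1.2, 1.2, 3.1, 3.1),
  exp    = rep(c("ARG", "BRA"), each = 4),
  imp    = c("BRA", "BRA", "CHL", "CHL", "ARG", "ARG", "CHL", "CHL"),
  time   = rep(c(2000, 2004), times = 4)
)

# dep / indep / fixed / cluster: a variable can be given by name or by column number.
dep_by_name   <- "export"
dep_by_number <- match("export", names(toy))
same_column   <- identical(toy[[dep_by_name]], toy[[dep_by_number]])

# selectobs: a logical vector and a vector of row numbers select the same rows.
keep_logical <- toy$time == 2004
keep_rows    <- which(keep_logical)
same_rows    <- identical(toy$export[keep_logical], toy$export[keep_rows])

# cluster (or an interacted fixed effect): several variables combined into one
# grouping variable. Here exporter x importer gives one level per country pair.
pair <- interaction(toy$exp, toy$imp, drop = TRUE)
```

In this toy example `pair` has one level per exporter-importer pair, which is the kind of grouping that a multi-variable cluster argument or a c("exp", "imp") entry in fixed refers to.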

Details

This function is a thin wrapper around hdfeppml_int, providing a more convenient interface for data frames. Whereas the internal function requires some preliminary handling of data sets (y must be a vector, x must be a matrix and fixed effects fes must be provided in a list), the wrapper takes a full data frame in the data argument, and users can simply specify which variables correspond to y, x and the fixed effects, using either variable names or column numbers.
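
To make the difference concrete, the preliminary handling that the wrapper saves can be sketched as follows. The toy data are hypothetical, and the commented-out call assumes that hdfeppml_int accepts arguments named y, x and fes as the paragraph above suggests; check the hdfeppml_int help page before relying on it.

```r
# Toy bilateral trade data (hypothetical values, for illustration only).
toy <- data.frame(
  export = c(0, 3, 1, 7, 2, 0, 5, 4),
  dist   = c(1.2, 1.2, 2.2, 2.2, 1.2, 1.2, 3.1, 3.1),
  exp    = rep(c("ARG", "BRA"), each = 4),
  imp    = c("BRA", "BRA", "CHL", "CHL", "ARG", "ARG", "CHL", "CHL"),
  time   = rep(c(2000, 2004), times = 4)
)

# What the internal function expects, prepared by hand:
y <- toy$export                               # dependent variable as a vector
x <- as.matrix(toy[, "dist", drop = FALSE])   # regressors as a matrix
fes <- list(                                  # fixed effects as a list
  exp_time = interaction(toy$exp, toy$time, drop = TRUE),
  imp_time = interaction(toy$imp, toy$time, drop = TRUE)
)

# Assumed call, not run here:
# fit <- penppml::hdfeppml_int(y = y, x = x, fes = fes)
```

With the wrapper, the same model would instead be specified through data, dep, indep and fixed, with no manual construction of y, x or fes.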

More formally, hdfeppml_int performs iteratively re-weighted least squares (IRLS) on a transformed model, as described in Correia, Guimarães and Zylkin (2020) and similar to the ppmlhdfe package in Stata. In each iteration, the function calculates the transformed dependent variable, partials out the fixed effects (calling collapse::fhdwithin) and then solves a weighted least squares problem (using a fast C++ implementation).
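
The iteration described above can be sketched in base R for the special case of a single fixed effect. This is not the package's code: wdemean and ppml_irls_sketch are hypothetical helpers, the weighted group demeaning stands in for collapse::fhdwithin, plain solve() stands in for the C++ solver, and the starting values and stopping rule are assumptions.

```r
# Weighted demeaning within one grouping factor
# (a one-way stand-in for the fixed-effects partialling-out step).
wdemean <- function(v, g, w) v - ave(v * w, g, FUN = sum) / ave(w, g, FUN = sum)

ppml_irls_sketch <- function(y, x, g, tol = 1e-10, maxiter = 200) {
  x <- as.matrix(x)
  mu <- (y + mean(y)) / 2                # strictly positive starting values
  eta <- log(mu)
  dev_old <- Inf
  for (iter in seq_len(maxiter)) {
    z <- eta + (y - mu) / mu             # transformed dependent variable
    z_res <- wdemean(z, g, mu)           # partial out the fixed effect from z
    x_res <- apply(x, 2, wdemean, g = g, w = mu)   # ... and from each regressor
    # Weighted least squares of z_res on x_res with weights mu.
    beta <- solve(crossprod(x_res * mu, x_res), crossprod(x_res * mu, z_res))
    resid <- z_res - x_res %*% beta
    eta <- as.vector(z - resid)          # updated linear predictor
    mu <- exp(eta)
    # Poisson deviance, with the convention 0 * log(0) = 0.
    dev <- 2 * sum(ifelse(y > 0, y * log(y / mu), 0) - (y - mu))
    if (abs(dev - dev_old) < tol * (abs(dev) + 0.1)) break
    dev_old <- dev
  }
  list(coefficients = drop(beta), mu = mu, deviance = dev, iterations = iter)
}

# Simulated data (illustrative only): two regressors and 20 groups.
set.seed(1)
n <- 500
g <- factor(sample(1:20, n, replace = TRUE))
x <- cbind(x1 = rnorm(n), x2 = rnorm(n))
alpha <- rnorm(20)[g]
y <- rpois(n, exp(0.3 * x[, "x1"] - 0.2 * x[, "x2"] + 0.5 * alpha))

fit <- ppml_irls_sketch(y, x, g)

# Reference fit with explicit group dummies; the slope estimates should agree.
ref <- glm(y ~ x + g, family = poisson())
```

Comparing fit$coefficients with the two slope coefficients of ref is a quick way to check that partialling out the fixed effect gives the same estimates as including the dummies explicitly. With many fixed effects of high dimension, the dummy-variable approach becomes impractical, which is the motivation for the approach used by hdfeppml_int.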

Value

A list with the following elements:

  • coefficients: a 1 x ncol(x) matrix with coefficient (beta) estimates.

  • residuals: a 1 x length(y) matrix with the residuals of the model.

  • mu: a 1 x length(y) matrix with the final values of the conditional mean \mu.

  • deviance: deviance of the fitted model.

  • bic: Bayesian Information Criterion.

  • x_resid: matrix of demeaned regressors.

  • z_resid: vector of demeaned (transformed) dependent variable.

  • se: standard errors of the coefficients.

References

Breinlich, H., Corradi, V., Rocha, N., Ruta, M., Santos Silva, J.M.C. and T. Zylkin (2021). "Machine Learning in International Trade Research: Evaluating the Impact of Trade Agreements", Policy Research Working Paper; No. 9629. World Bank, Washington, DC.

Correia, S., P. Guimarães and T. Zylkin (2020). "Fast Poisson estimation with high dimensional fixed effects", Stata Journal, 20, 90-115.

Gaure, S. (2013). "OLS with multiple high dimensional category variables", Computational Statistics & Data Analysis, 66, 8-18.

Friedman, J., T. Hastie, and R. Tibshirani (2010). "Regularization paths for generalized linear models via coordinate descent", Journal of Statistical Software, 33, 1-22.

Belloni, A., V. Chernozhukov, C. Hansen and D. Kozbur (2016). "Inference in high dimensional panel models with an application to gun control", Journal of Business & Economic Statistics, 34, 590-605.

Examples

## Not run: 
# To reduce run time, we keep only countries in the Americas:
americas <- countries$iso[countries$region == "Americas"]
test <- hdfeppml(data = trade[, -(5:6)],
                 dep = "export",
                 fixed = list(c("exp", "time"),
                              c("imp", "time"),
                              c("exp", "imp")),
                 selectobs = (trade$imp %in% americas) & (trade$exp %in% americas))

## End(Not run)


penppml documentation built on Sept. 8, 2023, 5:58 p.m.