hdfeppml_int: PPML Estimation with HDFE

View source: R/hdfeppml_int.R

hdfeppml_int    R Documentation

PPML Estimation with HDFE

Description

hdfeppml_int is the internal algorithm called by hdfeppml to fit an (unpenalized) Poisson Pseudo Maximum Likelihood (PPML) regression with high-dimensional fixed effects (HDFE). It takes a vector with the dependent variable, a regressor matrix and a set of fixed effects (in list form: each element in the list should be a separate HDFE).

Usage

hdfeppml_int(
  y,
  x = NULL,
  fes = NULL,
  tol = 1e-08,
  hdfetol = 1e-04,
  mu = NULL,
  saveX = TRUE,
  colcheck = TRUE,
  colcheck_x = colcheck,
  colcheck_x_fes = colcheck,
  init_z = NULL,
  verbose = FALSE,
  maxiter = 1000,
  cluster = NULL,
  vcv = TRUE
)

Arguments

y

Dependent variable (a vector).

x

Regressor matrix.

fes

List of fixed effects.

tol

Tolerance parameter for convergence of the IRLS algorithm.

hdfetol

Tolerance parameter for the within-transformation step, passed on to collapse::fhdwithin.

mu

A vector of initial values for mu that can be passed to the command.

saveX

Logical. If TRUE, it returns the values of x and z after partialling out the fixed effects.

colcheck

Logical. If TRUE, performs both checks in colcheck_x and colcheck_x_fes. If the user specifies colcheck_x and colcheck_x_fes individually, this option is overridden.

colcheck_x

Logical. If TRUE, this checks collinearity between the independent variables and drops the collinear variables.

colcheck_x_fes

Logical. If TRUE, this checks whether the independent variables are perfectly explained by the fixed effects and drops those that are.

init_z

Optional: initial values of the transformed dependent variable, to be used in the first iteration of the algorithm.

verbose

Logical. If TRUE, it prints information to the screen while evaluating.

maxiter

Maximum number of iterations (a number).

cluster

Optional: a vector classifying observations into clusters (to use when calculating SEs).

vcv

Logical. If TRUE (the default), it returns standard errors.
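
The two collinearity checks described under colcheck_x and colcheck_x_fes can be illustrated with a minimal base R sketch. The helper names find_collinear and explained_by_fes are hypothetical and are not part of penppml; the package's own checks may be implemented differently. The first helper uses a pivoted QR decomposition to flag regressors that are linear combinations of earlier columns, and the second regresses each regressor on the fixed-effect dummies and flags those with (numerically) zero residuals.

```r
# Hypothetical helper (not part of penppml): indices of regressors that are
# linear combinations of earlier columns, found via the pivoted QR decomposition.
find_collinear <- function(x, tol = 1e-7) {
  qx <- qr(x, tol = tol)
  if (qx$rank == ncol(x)) return(integer(0))
  sort(qx$pivot[(qx$rank + 1):ncol(x)])
}

# Hypothetical helper (not part of penppml): indices of regressors that are
# perfectly explained by the fixed effects, i.e. whose residuals from a
# regression on all fixed-effect dummies are numerically zero.
explained_by_fes <- function(x, fes, tol = 1e-8) {
  x <- as.matrix(x)
  d <- do.call(cbind, lapply(fes, function(f) model.matrix(~ factor(f) - 1)))
  res <- as.matrix(lm.fit(d, x)$residuals)
  which(colSums(res^2) < tol * colSums(x^2))
}

# Toy data: column c is the sum of a and b, so it is collinear.
a <- c(1, 2, 3, 4, 5)
b <- c(2, 1, 4, 3, 6)
x1 <- cbind(a, b, c = a + b)
collinear_idx <- find_collinear(x1)

# Toy data: u is constant within each level of f, v is not.
f <- factor(c(1, 1, 2, 2, 3, 3))
x2 <- cbind(u = c(5, 5, 7, 7, 9, 9), v = c(1, 2, 3, 4, 5, 7))
explained_idx <- explained_by_fes(x2, list(f))
```

In this sketch, the flagged columns would be dropped from x before estimation, for example with x1[, -collinear_idx, drop = FALSE].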

Details

More formally, hdfeppml_int performs iteratively re-weighted least squares (IRLS) on a transformed model, as described in Correia, Guimarães and Zylkin (2020) and similar to the ppmlhdfe package in Stata. In each iteration, the function calculates the transformed dependent variable, partials out the fixed effects (calling collapse::fhdwithin, which uses the algorithm in Gaure (2013)) and then solves a weighted least squares problem (using a fast C++ implementation).
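
The three steps in each iteration can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: the function names demean_w and irls_ppml_sketch are hypothetical, the within-transformation is a plain alternating-projections loop standing in for collapse::fhdwithin, and the weighted least squares step uses solve() rather than the C++ routine. The starting values and the deviance-based stopping rule are also assumptions.

```r
# Hypothetical helper: weighted within-transformation by alternating
# projections. Weighted group means are subtracted for each fixed effect in
# turn until the largest change in one full sweep falls below `tol`.
demean_w <- function(v, fes, w, tol = 1e-8, maxiter = 1000) {
  v <- as.matrix(v)
  for (it in seq_len(maxiter)) {
    v_old <- v
    for (f in fes) {
      gm <- rowsum(w * v, f) / as.vector(rowsum(w, f))  # weighted group means
      v <- v - gm[as.character(f), , drop = FALSE]
    }
    if (max(abs(v - v_old)) < tol) break
  }
  v
}

# Hypothetical IRLS loop for Poisson regression with fixed effects.
irls_ppml_sketch <- function(y, x, fes, tol = 1e-8, maxiter = 100) {
  mu <- (y + mean(y)) / 2               # assumed starting values for the mean
  eta <- log(mu)
  dev_old <- Inf
  for (it in seq_len(maxiter)) {
    z <- eta + (y - mu) / mu            # 1. transformed dependent variable
    z_res <- demean_w(z, fes, w = mu)   # 2. partial out the fixed effects
    x_res <- demean_w(x, fes, w = mu)
    # 3. weighted least squares on the residualized data (weights = mu)
    beta <- solve(crossprod(x_res, mu * x_res), crossprod(x_res, mu * z_res))
    resid <- z_res - x_res %*% beta
    eta <- as.vector(z - resid)         # linear predictor, fixed effects included
    mu <- exp(eta)
    dev <- 2 * sum(ifelse(y > 0, y * log(y / mu), 0) - (y - mu))
    if (abs(dev - dev_old) / (0.1 + abs(dev)) < tol) break
    dev_old <- dev
  }
  drop(beta)
}

# Simulated data with two crossed fixed effects.
set.seed(1)
n <- 500
f1 <- factor(sample(10, n, replace = TRUE))
f2 <- factor(sample(8, n, replace = TRUE))
x <- cbind(x1 = rnorm(n), x2 = rnorm(n))
y <- rpois(n, exp(0.5 * x[, 1] - 0.3 * x[, 2] +
                  as.integer(f1) / 10 + as.integer(f2) / 10))

beta_sketch <- irls_ppml_sketch(y, x, list(f1, f2))
# Reference fit with explicit dummies for comparison.
fit_glm <- glm(y ~ x[, 1] + x[, 2] + f1 + f2, family = poisson())
beta_glm <- coef(fit_glm)[2:3]
```

Because the fixed effects are partialled out rather than estimated as dummy coefficients, the cost of each iteration does not grow with the number of fixed-effect levels in the way a dummy-variable fit does, which is the motivation for the approach on high-dimensional problems.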

Value

A list with the following elements:

  • coefficients: a 1 x ncol(x) matrix with coefficient (beta) estimates.

  • residuals: a 1 x length(y) matrix with the residuals of the model.

  • mu: a 1 x length(y) matrix with the final values of the conditional mean mu.

  • deviance: the deviance of the fitted model.

  • bic: Bayesian Information Criterion.

  • x_resid: matrix of demeaned regressors.

  • z_resid: vector of demeaned (transformed) dependent variable.

  • se: standard errors of the coefficients.
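
The source does not spell out how the deviance element is computed. As a point of reference only, the standard Poisson deviance can be computed from y and the fitted means as in the sketch below; the helper name poisson_deviance is hypothetical, and the value returned by hdfeppml_int may be scaled or defined differently.

```r
# Hypothetical helper: standard Poisson deviance, with the convention that
# y * log(y / mu) is 0 when y is 0.
poisson_deviance <- function(y, mu) {
  term <- ifelse(y > 0, y * log(y / mu), 0)
  2 * sum(term - (y - mu))
}

# Check against the deviance reported by glm() on simulated data.
set.seed(42)
x <- rnorm(200)
y <- rpois(200, exp(0.2 + 0.5 * x))
fit <- glm(y ~ x, family = poisson())
dev_sketch <- poisson_deviance(y, fitted(fit))
```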

References

Breinlich, H., Corradi, V., Rocha, N., Ruta, M., Santos Silva, J.M.C. and T. Zylkin (2021). "Machine Learning in International Trade Research: Evaluating the Impact of Trade Agreements", Policy Research Working Paper; No. 9629. World Bank, Washington, DC.

Correia, S., P. Guimarães and T. Zylkin (2020). "Fast Poisson estimation with high dimensional fixed effects", Stata Journal, 20, 90-115.

Gaure, S. (2013). "OLS with multiple high dimensional category variables", Computational Statistics & Data Analysis, 66, 8-18.

Friedman, J., T. Hastie, and R. Tibshirani (2010). "Regularization paths for generalized linear models via coordinate descent", Journal of Statistical Software, 33, 1-22.

Belloni, A., V. Chernozhukov, C. Hansen and D. Kozbur (2016). "Inference in high dimensional panel models with an application to gun control", Journal of Business & Economic Statistics, 34, 590-605.

Examples

## Not run: 
library(penppml)  # provides hdfeppml_int and the trade and countries data sets
# To reduce run time, we keep only countries in the Americas:
americas <- countries$iso[countries$region == "Americas"]
trade <- trade[(trade$imp %in% americas) & (trade$exp %in% americas), ]
# Now generate the needed x, y and fes objects:
y <- trade$export
x <- data.matrix(trade[, -1:-6])
fes <- list(exp_time = interaction(trade$exp, trade$time),
            imp_time = interaction(trade$imp, trade$time),
            pair     = interaction(trade$exp, trade$imp))
# Finally, the call to hdfeppml_int:
reg <- hdfeppml_int(y = y, x = x, fes = fes)

## End(Not run)


penppml documentation built on Sept. 8, 2023, 5:58 p.m.