penhdfeppml_int: One-Shot Penalized PPML Estimation with HDFE


View source: R/penhdfeppml_int.R

Description

penhdfeppml_int is the internal algorithm called by penhdfeppml to fit a penalized PPML regression for a given type of penalty and a given value of the penalty parameter. It takes a vector with the dependent variable, a regressor matrix and a set of fixed effects (in list form: each element in the list should be a separate HDFE). The penalty can be either lasso or ridge, and the plugin method can be enabled via the method argument.

Usage

penhdfeppml_int(
  y,
  x,
  fes,
  lambda,
  tol = 1e-08,
  hdfetol = 1e-04,
  glmnettol = 1e-12,
  penalty = "lasso",
  penweights = NULL,
  saveX = TRUE,
  mu = NULL,
  colcheck = TRUE,
  init_z = NULL,
  post = FALSE,
  verbose = FALSE,
  standardize = TRUE,
  method = "placeholder",
  cluster = NULL,
  debug = FALSE
)

Arguments

y

Dependent variable (a vector).

x

Regressor matrix.

fes

List of fixed effects.

lambda

Penalty parameter (a number).

tol

Tolerance parameter for convergence of the IRLS algorithm.

hdfetol

Tolerance parameter for the within-transformation step, passed on to lfe::demeanlist.

glmnettol

Tolerance parameter to be passed on to glmnet::glmnet.

penalty

A string indicating the penalty type. Currently supported: "lasso" and "ridge".

penweights

Optional: a vector of coefficient-specific penalties to use in plugin lasso when method == "plugin".

saveX

Logical. If TRUE, it returns the values of x and z after partialling out the fixed effects.

mu

Optional: initial values of the conditional mean μ, to be used as weights in the first iteration of the algorithm.

colcheck

Logical. If TRUE, checks for perfect multicollinearity in x.

init_z

Optional: initial values of the transformed dependent variable, to be used in the first iteration of the algorithm.

post

Logical. If TRUE, estimates a post-penalty regression with the selected variables.

verbose

Logical. If TRUE, it prints information to the screen while evaluating.

standardize

Logical. If TRUE, x variables are standardized before estimation.

method

The user can set this equal to "plugin" to perform the plugin algorithm with coefficient-specific penalty weights (see details). Otherwise, a single global penalty is used.

cluster

Optional: a vector classifying observations into clusters (to use when calculating SEs).

debug

Logical. If TRUE, this helps with debugging penalty weights by printing output of the first iteration to the console and stopping the estimation algorithm.

Details

More formally, penhdfeppml_int performs iteratively re-weighted least squares (IRLS) on a transformed model, as described in Breinlich, Corradi, Rocha, Ruta, Santos Silva and Zylkin (2021). In each iteration, the function calculates the transformed dependent variable, partials out the fixed effects (calling lfe::demeanlist) and then calls glmnet::glmnet if the selected penalty is lasso (the default). If the user selects ridge, the analytical solution is instead computed directly using a fast C++ implementation.
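The iteration described above can be sketched in base R for the ridge case. This is a minimal illustration under stated assumptions, not the package's code: the helper names (demean_w, irls_ridge_sketch), the simulated data, the starting values and the penalty scaling (lambda added to the diagonal of X'WX) are all hypothetical, and a simple alternating-projections loop stands in for lfe::demeanlist.

```r
# Illustrative sketch only; names, data and penalty scaling are assumptions.
set.seed(1)
n  <- 400
f1 <- factor(sample(1:10, n, replace = TRUE))   # hypothetical fixed effect 1
f2 <- factor(sample(1:8,  n, replace = TRUE))   # hypothetical fixed effect 2
x_sim <- matrix(rnorm(n * 3), n, 3)
a1 <- rnorm(10, sd = 0.3)
a2 <- rnorm(8,  sd = 0.3)
y_sim <- rpois(n, exp(0.5 * x_sim[, 1] - 0.3 * x_sim[, 2] + a1[f1] + a2[f2]))
fes_sim <- list(f1 = f1, f2 = f2)

# Weighted within-transformation by alternating projections
# (a simple stand-in for the demeaning step).
demean_w <- function(v, fes, w, tol = 1e-10, maxit = 1000) {
  v <- as.matrix(v)
  for (it in seq_len(maxit)) {
    v_old <- v
    for (f in fes) {
      g     <- as.integer(factor(f))            # group codes 1..G
      means <- rowsum(v * w, g) / as.vector(rowsum(w, g))
      v     <- v - means[g, , drop = FALSE]     # subtract weighted group means
    }
    if (max(abs(v - v_old)) < tol) break
  }
  v
}

# IRLS for Poisson with fixed effects and a ridge penalty on the slopes.
irls_ridge_sketch <- function(y, x, fes, lambda, tol = 1e-8, maxit = 100) {
  mu   <- (y + mean(y)) / 2                     # assumed starting values
  eta  <- log(mu)
  beta <- rep(0, ncol(x))
  for (iter in seq_len(maxit)) {
    z  <- eta + (y - mu) / mu                   # transformed dependent variable
    zt <- demean_w(z, fes, mu)                  # partial out the fixed effects,
    xt <- demean_w(x, fes, mu)                  # using mu as weights
    A  <- crossprod(xt, mu * xt) + lambda * diag(ncol(x))
    beta_new <- as.vector(solve(A, crossprod(xt, mu * zt)))  # ridge closed form
    resid <- as.vector(zt - xt %*% beta_new)
    eta   <- z - resid                          # x * beta plus fixed effects
    mu    <- exp(eta)
    done  <- max(abs(beta_new - beta)) < tol
    beta  <- beta_new
    if (done) break
  }
  list(beta = beta, mu = mu, iter = iter)
}

fit_ridge <- irls_ridge_sketch(y_sim, x_sim, fes_sim, lambda = 50)
fit_ridge$beta
```

With lambda = 0 the sketch reduces to unpenalized PPML with fixed effects, so its slopes can be checked against glm(y_sim ~ x_sim + f1 + f2, family = poisson()). The lasso case replaces the closed-form step with a coordinate-descent solver such as glmnet::glmnet.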

For information on the plugin lasso method, see penhdfeppml_cluster_int.

Value

If penalty == "lasso" (the default), an object of class elnet with the elements described in glmnet, as well as:

If penalty == "ridge", a list with the following elements:

References

Breinlich, H., Corradi, V., Rocha, N., Ruta, M., Santos Silva, J.M.C. and T. Zylkin (2021). "Machine Learning in International Trade Research: Evaluating the Impact of Trade Agreements", Policy Research Working Paper; No. 9629. World Bank, Washington, DC.

Correia, S., P. Guimaraes and T. Zylkin (2020). "Fast Poisson estimation with high dimensional fixed effects", Stata Journal, 20, 90-115.

Gaure, S. (2013). "OLS with multiple high dimensional category variables", Computational Statistics & Data Analysis, 66, 8-18.

Friedman, J., T. Hastie, and R. Tibshirani (2010). "Regularization paths for generalized linear models via coordinate descent", Journal of Statistical Software, 33, 1-22.

Belloni, A., V. Chernozhukov, C. Hansen and D. Kozbur (2016). "Inference in high dimensional panel models with an application to gun control", Journal of Business & Economic Statistics, 34, 590-605.

Examples

# Load the package (the example uses its trade and countries data sets):
library(penppml)
# To reduce run time, we keep only countries in the Americas:
americas <- countries$iso[countries$region == "Americas"]
trade <- trade[(trade$imp %in% americas) & (trade$exp %in% americas), ]
# Now generate the needed x, y and fes objects:
y <- trade$export
x <- data.matrix(trade[, -1:-6])
fes <- list(exp_time = interaction(trade$exp, trade$time),
            imp_time = interaction(trade$imp, trade$time),
            pair     = interaction(trade$exp, trade$imp))
# Finally, we try penhdfeppml_int with a lasso penalty (the default):
reg <- penhdfeppml_int(y = y, x = x, fes = fes, lambda = 0.1)

# We can also try ridge:
reg <- penhdfeppml_int(y = y, x = x, fes = fes, lambda = 0.1, penalty = "ridge")

penppml documentation built on Sept. 9, 2021, 9:09 a.m.