penppml-package    R Documentation

penppml: Penalized Poisson Pseudo Maximum Likelihood Regression

Description

A set of tools for efficient estimation of penalized Poisson Pseudo Maximum Likelihood (PPML) regressions, using lasso or ridge penalties, in models that feature one or more sets of high-dimensional fixed effects (HDFE). The methodology is based on Breinlich, Corradi, Rocha, Ruta, Santos Silva, and Zylkin (2021) <http://hdl.handle.net/10986/35451> and takes advantage of the method of alternating projections of Gaure (2013) <doi:10.1016/j.csda.2013.03.024> for dealing with HDFE, as well as the coordinate descent algorithm of Friedman, Hastie and Tibshirani (2010) <doi:10.18637/jss.v033.i01> for fitting lasso regressions. The package can also carry out cross-validation and implement the plugin lasso of Belloni, Chernozhukov, Hansen and Kozbur (2016) <doi:10.1080/07350015.2015.1102733>.
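
To make the idea of alternating projections concrete, here is a minimal base R sketch. It is not the package's implementation: it handles a single unweighted variable, and the function name demean_ap and the simulated firm and year groupings are hypothetical. Each pass subtracts group means for one set of fixed effects at a time, and the passes repeat until the variable stops changing.

```r
# Alternating projections for two sets of fixed effects (unweighted sketch).
# demean_ap is a hypothetical helper, not a penppml function.
demean_ap <- function(v, groups, tol = 1e-10, max_iter = 1000) {
  for (iter in seq_len(max_iter)) {
    v_old <- v
    for (g in groups) {
      v <- v - ave(v, g)  # project out the group means of one set
    }
    if (max(abs(v - v_old)) < tol) break  # stop once a full sweep changes nothing
  }
  v
}

set.seed(42)
n    <- 200
firm <- sample(letters[1:8], n, replace = TRUE)  # simulated grouping 1
year <- sample(2001:2006, n, replace = TRUE)     # simulated grouping 2
x    <- rnorm(n)

x_tilde <- demean_ap(x, list(firm, year))

# Cross-check against the residual from regressing x on both sets of dummies.
x_resid <- unname(resid(lm(x ~ factor(firm) + factor(year))))
max_gap <- max(abs(x_tilde - x_resid))
```

The cross-check against lm() is only feasible because the example is tiny. The appeal of the iterative approach is that it never builds the dummy-variable matrix, which is what makes it practical when the fixed effects are high-dimensional.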

Functions

The workhorse of this package is the mlfitppml function, which allows users to carry out penalized HDFE-PPML estimation with a wide variety of options. The syntax is simple: users pass a data frame containing all the relevant variables and then identify the dependent variable, the regressors and the fixed effects by column name or column number.
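
A call might look like the sketch below. The argument names data, dep, indep, fixed, penalty and lambdas are assumptions not stated on this page, as is the convention of giving each set of fixed effects as a vector of column names inside a list, so check ?mlfitppml before relying on them. The data frame is simulated and all of its column names are hypothetical. The call only runs if penppml is installed.

```r
# Simulated toy panel; every column name is hypothetical.
set.seed(1)
ctry <- c("A", "B", "C", "D", "E", "F")
toy <- expand.grid(exporter = ctry, importer = ctry, year = 1:6,
                   stringsAsFactors = FALSE)
toy <- toy[toy$exporter != toy$importer, ]  # drop own-country pairs
toy$x1   <- rnorm(nrow(toy))
toy$x2   <- rnorm(nrow(toy))
toy$flow <- rpois(nrow(toy), lambda = exp(1 + 0.5 * toy$x1))

if (requireNamespace("penppml", quietly = TRUE)) {
  # Argument names below are assumed, not taken from this page.
  fit <- penppml::mlfitppml(
    data    = toy,
    dep     = "flow",                          # dependent variable by name
    indep   = c("x1", "x2"),                   # regressors by name
    fixed   = list(c("exporter", "year"),      # assumed: one vector of column
                   c("importer", "year"),      # names per set of fixed effects
                   c("exporter", "importer")),
    penalty = "lasso",
    lambdas = c(0.1, 0.01, 0.001)              # illustrative penalty values
  )
}
```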

In addition, the internals hdfeppml (post-lasso regression), penhdfeppml (penalized regression for a single lambda), penhdfeppml_cluster (plugin lasso), and xvalidate (cross-validation) are made available on a stand-alone basis for advanced users.

The package also includes alternative versions of mlfitppml, hdfeppml, penhdfeppml and penhdfeppml_cluster. These (mlfitppml_int, hdfeppml_int, penhdfeppml_int and penhdfeppml_cluster_int) use an alternative syntax: users must provide the dependent variable in a vector, the regressors in a matrix and the fixed effects in a list.
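
The three inputs for the alternative syntax can be prepared in base R, as in the sketch below. The data frame and its column names are hypothetical. The shape of the fixed effects list (one grouping vector per set) and the argument names y, x and fes mentioned in the final comment are assumptions, so check the help page of the relevant _int function.

```r
# Hypothetical toy data frame; every column name is illustrative.
set.seed(7)
df <- data.frame(
  exporter = rep(c("A", "B", "C"), each = 8),
  year     = rep(2001:2008, times = 3),
  tariff   = runif(24),
  distance = runif(24),
  stringsAsFactors = FALSE
)
df$flow <- rpois(24, lambda = exp(1 - 0.5 * df$tariff))

y   <- df$flow                                   # dependent variable as a vector
x   <- as.matrix(df[, c("tariff", "distance")])  # regressors as a numeric matrix
fes <- list(exporter = df$exporter,              # assumed: one grouping vector
            year     = df$year)                  # per set of fixed effects

# These objects would then be passed to one of the _int functions,
# for example mlfitppml_int(y = y, x = x, fes = fes, ...) (argument names assumed).
```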

Finally, support for the iceberg lasso method in Breinlich, Corradi, Rocha, Ruta, Santos Silva, and Zylkin (2021) is in development and can be accessed at its current stage via the iceberg function.

References

Breinlich, H., Corradi, V., Rocha, N., Ruta, M., Santos Silva, J.M.C. and T. Zylkin (2021). "Machine Learning in International Trade Research: Evaluating the Impact of Trade Agreements", Policy Research Working Paper; No. 9629. World Bank, Washington, DC.

Correia, S., P. Guimaraes and T. Zylkin (2020). "Fast Poisson estimation with high dimensional fixed effects", Stata Journal, 20, 90-115.

Gaure, S. (2013). "OLS with multiple high dimensional category variables", Computational Statistics & Data Analysis, 66, 8-18.

Friedman, J., T. Hastie, and R. Tibshirani (2010). "Regularization paths for generalized linear models via coordinate descent", Journal of Statistical Software, 33, 1-22.

Belloni, A., V. Chernozhukov, C. Hansen and D. Kozbur (2016). "Inference in high dimensional panel models with an application to gun control", Journal of Business & Economic Statistics, 34, 590-605.

Author(s)

Maintainer: Joao Cruz jm01780@surrey.ac.uk

penppml documentation built on Sept. 8, 2023, 5:58 p.m.