plm.fast: Option to Switch On/Off Fast Data Transformations

plm.fast R Documentation

Description

A significant speed-up can be gained by using the fast (panel) data transformation functions from package collapse. A further significant speed-up for the two-way fixed effects case is available if package fixest or lfe is installed (package collapse must be installed for the fast mode in any case).

Details

By default, this speed-up is enabled. Option plm.fast can be used to enable or disable it. The option is evaluated before each supported transformation is executed (see below), so options("plm.fast" = TRUE) enables the speed-up and options("plm.fast" = FALSE) disables it.

To have it always switched off, put options("plm.fast" = FALSE) in your .Rprofile file.
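As a minimal sketch of switching the option for a limited scope only, the snippet below relies purely on base R: options() returns the previous values, so they can be restored afterwards. The helper name with_plm_fast is hypothetical and not part of package plm.

```r
# Hypothetical helper (not part of plm): evaluate an expression with
# option plm.fast set to `value`, then restore the previous setting.
with_plm_fast <- function(value, expr) {
  old <- options("plm.fast" = value)  # options() returns the old values
  on.exit(options(old), add = TRUE)   # restore even if expr fails
  force(expr)                         # expr is evaluated lazily, i.e., here
}

before <- getOption("plm.fast")       # NULL if the option is not set
inside <- with_plm_fast(FALSE, getOption("plm.fast"))
after  <- getOption("plm.fast")       # same as `before`
```

In practice, expr would be a call such as plm(...), so that a single estimation runs without the fast mode while the session-wide setting stays untouched.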

See Examples for how to use the option and for a benchmarking example.

For a long time, package plm used only base R implementations and R-based code. The package collapse provides fast data transformation functions written in C/C++, among them some especially suitable for panel data. Having package collapse installed is a requirement for the speed-up, so this package is a hard dependency for package plm.

Availability of packages fixest and lfe is checked once, when package plm is attached. If one of the packages is detected and options("plm.fast" = TRUE) (the default) is set, the additional speed-up for the two-way fixed effects case is enabled automatically (fixest takes precedence over lfe). In that case, the packages' fast algorithms to partial out fixed effects are used (fixest::demean (via collapse::fhdwithin), lfe::demeanlist). Both packages are 'Suggests' dependencies.
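To see which of the packages named above would be found on a given machine, a check along the following lines can be run. It uses only base R's requireNamespace() and is an illustration, not the detection code that plm itself runs. The options are only read here, never set.

```r
# Packages involved in the fast mode, as named in the text above
fast_pkgs <- c("collapse", "fixest", "lfe")

# TRUE/FALSE per package; loads namespaces but does not attach them
available <- vapply(fast_pkgs, requireNamespace, logical(1), quietly = TRUE)
print(available)

# Read (do not set) the options; each is NULL if not set in this session
getOption("plm.fast")
getOption("plm.fast.pkg.FE.tw")
```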

Users might experience negligible numerical differences between results computed with the fast mode enabled and with it disabled (base R implementation), depending on the platform and the additional packages installed.
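One way to check the size of such differences on a given setup is to compute the same transformation with the fast mode switched off and on, and to compare the results with all.equal(). The sketch below assumes packages plm and collapse are installed and uses the Grunfeld data shipped with plm; the object names are illustrative.

```r
if (requireNamespace("plm", quietly = TRUE) &&
    requireNamespace("collapse", quietly = TRUE)) {
  library("plm")
  data("Grunfeld", package = "plm")
  pGrun <- pdata.frame(Grunfeld, index = c("firm", "year"))

  old <- options("plm.fast" = FALSE)
  w_base <- Within(pGrun$inv)        # fast mode disabled
  options("plm.fast" = TRUE)
  w_fast <- Within(pGrun$inv)        # fast mode enabled
  options(old)                       # restore the previous setting

  # TRUE if both results agree up to all.equal()'s default tolerance
  same <- isTRUE(all.equal(as.numeric(w_base), as.numeric(w_fast)))
  print(same)
}
```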

Currently, the following basic functions benefit from the speed-up. They are used as building blocks in most model estimation functions, e.g., in plm (more functions are under investigation):

  • between,

  • Between,

  • Sum,

  • Within,

  • lag, lead, and diff,

  • pseriesfy,

  • pdiff (internal function).
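For orientation, several of the transformations listed above can be called directly on a pseries. The following sketch assumes package plm is installed and uses its Grunfeld data (10 firms observed over 20 years); the calls are the same whether the fast mode is on or off.

```r
if (requireNamespace("plm", quietly = TRUE)) {
  library("plm")
  data("Grunfeld", package = "plm")
  pGrun <- pdata.frame(Grunfeld, index = c("firm", "year"))
  inv <- pGrun$inv                  # a pseries

  b  <- between(inv)                # individual means, one value per firm
  B  <- Between(inv)                # individual means, repeated per observation
  W  <- Within(inv)                 # deviations from individual means
  l1 <- lag(inv, 1)                 # panel-aware lag: first year per firm is NA
  d1 <- diff(inv, 1)                # panel-aware first difference

  # Between and Within components add up to the original series
  print(all.equal(as.numeric(B) + as.numeric(W), as.numeric(inv)))
}
```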

Examples

## Not run: 
### A benchmark of plm without and with speed-up
library("plm")
library("collapse")
library("microbenchmark")
rm(list = ls())
data("wlddev", package = "collapse")
form <- LIFEEX ~ PCGDP + GINI

# produce big data set (taken from collapse's vignette)
wlddevsmall <- get_vars(wlddev, c("iso3c","year","OECD","PCGDP","LIFEEX","GINI","ODA"))
wlddevsmall$iso3c <- as.character(wlddevsmall$iso3c)
data <- replicate(100, wlddevsmall, simplify = FALSE)
rm(wlddevsmall)
uniquify <- function(x, i) {
  x$iso3c <- paste0(x$iso3c, i)
  x
}
data <- unlist2d(Map(uniquify, data, as.list(1:100)), idcols = FALSE)
data <- pdata.frame(data, index = c("iso3c", "year"))
pdim(data) # Balanced Panel: n = 21600, T = 59, N = 1274400 // but many NAs
# data <- na.omit(data)
# pdim(data) # Unbalanced Panel: n = 13300, T = 1-31, N = 93900

times <- 1 # no. of repetitions for benchmark - this takes quite long!

onewayFE <- microbenchmark(
 {options("plm.fast" = FALSE); plm(form, data = data, model = "within")},
 {options("plm.fast" = TRUE);  plm(form, data = data, model = "within")},
  times = times)

summary(onewayFE, unit = "relative")

## two-ways FE benchmark requires pkg fixest and lfe
## (End-users shall only set option plm.fast. Option plm.fast.pkg.FE.tw shall
##  _not_ be set by the end-user, it is determined automatically when pkg plm
## is attached; however, it needs to be set explicitly in this example for the
## benchmark.)
if(requireNamespace("fixest", quietly = TRUE) &&
   requireNamespace("lfe", quietly = TRUE)) {

twowayFE <-  microbenchmark(
 {options("plm.fast" = FALSE);
    plm(form, data = data, model = "within", effect = "twoways")},
 {options("plm.fast" = TRUE, "plm.fast.pkg.FE.tw" = "collapse");
    plm(form, data = data, model = "within", effect = "twoways")},
 {options("plm.fast" = TRUE, "plm.fast.pkg.FE.tw" = "fixest");
    plm(form, data = data, model = "within", effect = "twoways")},
 {options("plm.fast" = TRUE, "plm.fast.pkg.FE.tw" = "lfe");
    plm(form, data = data, model = "within", effect = "twoways")},
  times = times)

summary(twowayFE, unit = "relative")
}

onewayRE <- microbenchmark(
 {options("plm.fast" = FALSE); plm(form, data = data, model = "random")},
 {options("plm.fast" = TRUE);  plm(form, data = data, model = "random")},
  times = times)

summary(onewayRE, unit = "relative")

twowayRE <-  microbenchmark(
 {options("plm.fast" = FALSE); plm(form, data = data, model = "random", effect = "twoways")},
 {options("plm.fast" = TRUE);  plm(form, data = data, model = "random", effect = "twoways")},
  times = times)

summary(twowayRE, unit = "relative")

## End(Not run)

ycroissant/plm documentation built on July 8, 2024, 3:59 a.m.