Using summclust

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
library(summclust)

Introduction

The summclust package allows to compute leverage statistics for clustered errors and fast CRV3(J) variance-covariance matrices as described in MacKinnon, J.G., Nielsen, M.Ø., Webb, M.D., 2022. Leverage, influence, and the jackknife in clustered regression models: Reliable inference using summclust.

It is a post-estimation command and currently supports methods for objects of type lm (from stats) and fixest (from the fixest package).

The summclust function

library(summclust)
library(lmtest)
library(haven)

url <- "http://www.stata-press.com/data/r9/nlswork.dta"
if (httr::http_error(url)) {
  stop("No internet connection or data source broken. Sorry about that!")
  return(NULL)
} else {
  message("downloading the 'nlswork' dataset.")
  nlswork <- read_dta(url)
}



# drop NAs at the moment
nlswork <- nlswork[, c("ln_wage", "grade", "age", "birth_yr", "union", "race", "msp", "ind_code")]
nlswork <- na.omit(nlswork)

lm_fit <- lm(
  ln_wage ~ as.factor(grade) + as.factor(age) + as.factor(birth_yr) + union +  race + msp,
  data = nlswork)

summclust_res <- summclust(
  obj = lm_fit,
  cluster = ~ind_code,
  params = c("msp", "union")
)

# CRV3-based inference - exactly matches output of summclust-stata
tidy(summclust_res)

summary(summclust_res)

To visually inspect the leverage statistics, use the plot method

plot(summclust_res)

Using summclust with coefplot and fixest

Note that you can also use CVR3 and CRV3J covariance matrices computed via summclust or its vcov_CR3J method with the lmtest() and fixest packages.

library(lmtest)

vcov3J <- 
  vcov_CR3J(
    lm_fit, 
    cluster = ~ ind_code
  )

all.equal(
  vcov3J, 
  summclust_res$vcov
)

df <- length(summclust_res$cluster) - 1

# with lmtest
CRV1 <- lmtest::coeftest(lm_fit, sandwich::vcovCL(lm_fit, ~ind_code), df = df)
CRV3 <- lmtest::coeftest(lm_fit, vcov3J, df = df)

CRV1[c("union", "race", "msp"),]
CRV3[c("union", "race", "msp"),]

stats::confint(CRV1)[c("union", "race", "msp"),]
stats::confint(CRV3)[c("union", "race", "msp"),]
library(fixest)

feols_fit <- feols(
  ln_wage ~ i(grade) + i(age) + i(birth_yr) + union +  race + msp,
  data = nlswork
)

# Store vcov into the fixest object
feols_fit <- summary(
  feols_fit, 
  vcov = vcov_CR3J(feols_fit, cluster = ~ ind_code)
)

# Now it just works with fixest functions
fixest::coeftable(feols_fit, keep = c("msp", "union", "race"))

The p-value and confidence intervals for fixest::coeftable() differ from lmtest::coeftest() and summclust::coeftable(). This is due to the fact that fixest::coeftable() uses a different degree of freedom for the t-distribution used in these calculation (I believe it uses t(N-1)).



Try the summclust package in your browser

Any scripts or data that you put into this service are public.

summclust documentation built on Aug. 10, 2023, 9:07 a.m.