knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
library(summclust)
The summclust
package allows to compute leverage statistics for clustered errors and fast CRV3(J)
variance-covariance matrices as described in MacKinnon, J.G., Nielsen, M.Ø., Webb, M.D., 2022. Leverage, influence, and the jackknife in clustered regression models: Reliable inference using summclust.
It is a post-estimation command and currently supports methods for objects of type lm
(from stats
) and fixest
(from the fixest
package).
summclust
functionlibrary(summclust) library(lmtest) library(haven) url <- "http://www.stata-press.com/data/r9/nlswork.dta" if (httr::http_error(url)) { stop("No internet connection or data source broken. Sorry about that!") return(NULL) } else { message("downloading the 'nlswork' dataset.") nlswork <- read_dta(url) } # drop NAs at the moment nlswork <- nlswork[, c("ln_wage", "grade", "age", "birth_yr", "union", "race", "msp", "ind_code")] nlswork <- na.omit(nlswork) lm_fit <- lm( ln_wage ~ as.factor(grade) + as.factor(age) + as.factor(birth_yr) + union + race + msp, data = nlswork) summclust_res <- summclust( obj = lm_fit, cluster = ~ind_code, params = c("msp", "union") ) # CRV3-based inference - exactly matches output of summclust-stata tidy(summclust_res) summary(summclust_res)
To visually inspect the leverage statistics, use the plot
method
plot(summclust_res)
summclust
with coefplot
and fixest
Note that you can also use CVR3 and CRV3J covariance matrices computed via summclust
or its vcov_CR3J
method with the lmtest()
and fixest
packages.
library(lmtest) vcov3J <- vcov_CR3J( lm_fit, cluster = ~ ind_code ) all.equal( vcov3J, summclust_res$vcov ) df <- length(summclust_res$cluster) - 1 # with lmtest CRV1 <- lmtest::coeftest(lm_fit, sandwich::vcovCL(lm_fit, ~ind_code), df = df) CRV3 <- lmtest::coeftest(lm_fit, vcov3J, df = df) CRV1[c("union", "race", "msp"),] CRV3[c("union", "race", "msp"),] stats::confint(CRV1)[c("union", "race", "msp"),] stats::confint(CRV3)[c("union", "race", "msp"),]
library(fixest) feols_fit <- feols( ln_wage ~ i(grade) + i(age) + i(birth_yr) + union + race + msp, data = nlswork ) # Store vcov into the fixest object feols_fit <- summary( feols_fit, vcov = vcov_CR3J(feols_fit, cluster = ~ ind_code) ) # Now it just works with fixest functions fixest::coeftable(feols_fit, keep = c("msp", "union", "race"))
The p-value and confidence intervals for fixest::coeftable()
differ from lmtest::coeftest()
and summclust::coeftable()
. This is due to the fact that fixest::coeftable()
uses a different degree of freedom for the t-distribution used in these calculation (I believe it uses t(N-1)).
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.