In eliascis/clubTamal: Clustered covariance matrix for panel data estimations

The package clubTamal creates clustered covariance matrices for first difference panel data models estimated by plm. Cluster groups can be defined at higher levels than the individual observations. E.g. for estimations on yearly houshold data it is possible to cluster at the state level. Package runs for First Difference estimates (model="fd").

Note: The option for Fixed Effects estimates (model="within") is outdated as the package fle with the felm command does a much better job.

Clustering

N - Number of observations

M - Number of cluster

K - Number of regressors (rank)

Clustering First Difference (FD) estimations

$$(X'X)^{-1} \, \sum_{j=1}^M u_{M}'u_{M} \, (X'X)^{-1} \cdot \frac{M}{M-1} \frac{N-1}{N-K}$$ where $u = X_{j} e_{j}$.

Clustering Fixed Effects (FE) estimations

$$\frac{(N-K)}{(N-K)-(M-1)} \cdot (X'X)^{-1} \, \sum_{j=1}^M u_{M}'u_{M} \, (X'X)^{-1} \cdot \frac{M}{M-1} \frac{N-1}{N-K}$$ where $u = X_{j} e_{j}$.

Installation

library(devtools)
install_github(eliascis/clubTamal)
library(clubTamal)
citation(clubTamal)

Note

The function uses the estimate argument and the data argument to construct a transformed data frame - first differenced or time-demeaned. The estimate is then re-estimated by OLS with lm in order to extract the covariance matrix and correct for degrees of freedom and use a sandwich estimate with the group structure.

Examples

##packages
library(foreign)
library(clubTamal)
library(lmtest)
library(plm)
library(clubSandwich)
library(spd4testing)

##data
d<-spd4testing()
d

#manipulate data
# d<-pdata.frame(d)

##formula
f<- formula(y ~ x + factor(year))

##standard estimation
e<-plm(formula=f,data=d,model="fd")
summary(e)
e<-plm(formula=f,data=d,model="within")
summary(e)

##clustering
#no clustering
v<-e$vcov
coeftest(e)
#clustering at id level with plm-package
v<-vcovHC(e,type="HC1",cluster="group",tol=1*10^-20 )
coeftest(e,v)

##clustering at group level with clubSandwich-package
v<-vcovCR(e,d$gid,type="CR1")
coef_test(e,vcov=v,test="naive-t")
coeftest(e,v)

##clustering at group level with clubTamal
v<-vcovTamal(estimate=e,data=d,groupvar="gid")
coeftest(e,v)


##clustering at group level with STATA (TM)
#export data
tdir<-tempdir()
write.csv(d,file.path(tdir,"spd.csv"))
#write stata file
sink(file.path(tdir,"clusterStata.do"))
cat(paste0('insheet using "',file.path("spd.csv"),'", clear'))
cat("\n")
cat('xtset id year')
cat("\n")
cat('xtreg y x i.year, fe')
cat("\n")
cat('xtreg y x i.year, fe vce(cluster gid)')
cat("\n")
cat('reg d.y d.x i.year')
cat("\n")
cat('reg d.y d.x i.year, vce(cluster gid)')
sink()
## RUN Stata file