knitr::opts_chunk$set( collapse = TRUE, comment = NA, echo = TRUE, message = FALSE, error = TRUE, eval = TRUE, out.width = "100%", fig.width = 7, fig.height = 5, dev = "png", dpi = 300 )
The getspanel package can be downloaded and installed from CRAN by simply using:
install.packages("getspanel")
The source code of the package is on GitHub and the development version can be installed using:
# install.packages("devtools")
devtools::install_github("moritzpschwarz/getspanel", ref = "devel")
Once installed we need to load the library:
library(getspanel)
library(fixest)
Currently the package is called getspanel to align with the gets package, but its main function of course remains the isatpanel function.
The isatpanel function implements the empirical break detection algorithm that is described in a paper by Felix Pretis and Moritz Schwarz and was applied in a study by Nico Koch and colleagues on EU road CO2 emissions, which was published in Nature Energy in 2022.
A quick overview of what has changed:
We can now use the function approach as well as the traditional gets approach. This means that we can specify a model using y and mxreg as well as time and id as vectors, but we can now also simply supply a data.frame and a formula in the form y ~ x + z + I(x^2) to e.g. specify polynomials. This means we will then need an index argument, which specifies the group and time variables.
The ar argument now works.
We can now use the fixest package to speed up model estimation with large i (for short panels, the default method is still faster). The package can be activated using the new engine argument.
Using the fixest package also allows us to calculate clustered standard errors.
Unbalanced panels now work as intended, which was not the case before.
The mxbreak and break.method arguments have been removed. Instead, the function now produces the break matrix itself. This implements the following saturation methods in a user-friendly way:
iis: Impulse Indicator Saturation
jsis: Joint Step Indicator Saturation (Common Breaks over time)
csis: Coefficient Step Indicator Saturation (Common Coefficient Breaks over time)
fesis: Fixed Effect Step Indicator Saturation (Breaks in the Group Fixed Effect over time)
cfesis: Coefficient Fixed Effect Step Indicator Saturation (Breaks in the coefficient for each individual)
We first load some data on EU CO2 emissions in the housing sector.
data("EUCO2residential")
head(EUCO2residential)

# let's subset this a little bit to speed this up
EUCO2residential <- EUCO2residential[EUCO2residential$year > 2000 &
                                       EUCO2residential$country %in% c("Germany", "Austria", "Belgium", "Italy", "Sweden", "Denmark"), ]

# let's create a log emissions per capita variable
EUCO2residential$lagg.directem_pc <- log(EUCO2residential$agg.directem / EUCO2residential$pop)

# and let's also turn off printing the intermediate output from isatpanel
options(print.searchoutput = FALSE)
Let's look at how we input what we want to model. Each isatpanel command takes the data in one of two ways:
i. In the gets package style, i.e. using vectors and matrices to specify y, mxreg, time and id.
ii. In a form that resembles the lm and plm specification, i.e. inputting a data.frame (or matrix or tibble), a formula argument, and a character vector for index (in the form c("group_variable_name", "time_variable_name")).
In both cases the fixed effects are specified using the effect argument.
This means that the following two commands call the function in the two equivalent ways. For the results to be identical, mxreg must contain the same regressors as the formula; the second call below only passes lgdp.
Using the new method
is_lm <- isatpanel(data = EUCO2residential, formula = lagg.directem_pc ~ lgdp + I(lgdp^2) + pop, index = c("country","year"), effect = "twoways", fesis = TRUE)
Using the traditional method
is_gets <- isatpanel(y = EUCO2residential$lagg.directem_pc, mxreg = EUCO2residential$lgdp, time = EUCO2residential$year, id = EUCO2residential$country, effect = "twoways", fesis = TRUE)
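To make the link between the two notations more concrete, here is a minimal sketch in base R. It uses a small made-up data frame (toy and its columns are hypothetical and not part of the package) and model.matrix to expand a formula with a polynomial term into a numeric regressor matrix, i.e. the kind of object one would otherwise build by hand for mxreg. Whether isatpanel constructs its regressors in exactly this way internally is an assumption.

```r
# Toy data (hypothetical, for illustration only)
toy <- data.frame(y = c(1.0, 1.5, 2.1, 2.8),
                  x = c(1, 2, 3, 4),
                  z = c(0.5, 0.1, 0.9, 0.3))

# Expand the formula into a numeric matrix and drop the intercept column
X <- model.matrix(y ~ x + I(x^2) + z, data = toy)[, -1]

# One column per term on the right-hand side of the formula
colnames(X)
```

The formula interface saves us from computing and naming the squared term ourselves, which is the main convenience of the lm notation.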
From here onwards, I will use the lm notation.
We can plot these simply using the default plotting methods (which rely on the ggplot2 package):
plot(is_lm)
plot_grid(is_lm)
plot_counterfactual(is_lm)
The iis argument works just as in the gets package. The method simply adds a 0 and 1 dummy for each observation.
Simply set iis = TRUE.
iis_example <- isatpanel(data = EUCO2residential, formula = lagg.directem_pc ~ lgdp + I(lgdp^2) + pop, index = c("country","year"), effect = "twoways", iis = TRUE, fesis = TRUE)
plot(iis_example)
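As a rough illustration of what "a dummy for each observation" means, the following base R sketch builds a full set of impulse indicators for a tiny made-up panel. The toy data and the column labels are hypothetical; the names that isatpanel gives its indicators may differ.

```r
# Toy panel: 2 units observed over 3 periods (hypothetical data)
toy <- expand.grid(year = 2001:2003, id = c("A", "B"))

# One impulse dummy per observation: an identity matrix with n columns
iis_mat <- diag(nrow(toy))
colnames(iis_mat) <- paste0("iis.", toy$id, ".", toy$year)  # hypothetical labels

# Each column is 1 for exactly one unit-year and 0 everywhere else
iis_mat
```

Since there are as many indicators as observations, they cannot all be included at once, which is why a selection algorithm is needed in the first place.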
Traditional Step Indicator Saturation does not make sense in a panel setting. Therefore, the sis argument of the gets package is disabled.
It is possible, however, to consider Step Indicator Saturation with common breaks across individuals. Such indicators would be collinear if effect = "twoways" or effect = "time", i.e. if Time Fixed Effects are included.
If, however, effect = "individual", then we can use jsis = TRUE to select over all individual time fixed effects.
jsis_example <- isatpanel(data = EUCO2residential, formula = lagg.directem_pc ~ lgdp + I(lgdp^2) + pop, index = c("country","year"), effect = "individual", jsis = TRUE)
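The collinearity with Time Fixed Effects can be checked directly. The sketch below uses base R and a made-up panel (toy is hypothetical) to build year dummies and common step indicators, and then compares the column rank with and without the steps.

```r
# Toy panel: 2 units observed over 4 periods (hypothetical data)
toy <- expand.grid(year = 2001:2004, id = c("A", "B"))

# Time fixed effects: one dummy per year
time_fe <- model.matrix(~ 0 + factor(year), data = toy)

# Common step indicators: 1 from a given year onwards, for all units
steps <- sapply(2002:2004, function(s) as.numeric(toy$year >= s))
colnames(steps) <- paste0("jsis.", 2002:2004)  # hypothetical labels

# Adding the steps to the year dummies does not increase the column rank
rank_fe   <- qr(time_fe)$rank
rank_both <- qr(cbind(time_fe, steps))$rank
c(rank_fe, rank_both)
```

Each common step is just a sum of year dummies, so it carries no additional information once time effects are in the model.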
Note: The csis method has only been tested using the lm implementation (using data, formula, and index).
This method allows detection of coefficient breaks that are common across all groups. It is the interaction between jsis and the relevant coefficient.
To illustrate this, as well as the advantages of using the lm approach, we include a non-linear term of the lgdp variable using I(lgdp^2):
csis_example <- isatpanel(data = EUCO2residential, formula = lagg.directem_pc ~ lgdp + I(lgdp^2) + pop, index = c("country","year"), effect = "twoways", t.pval = 0.05, csis = TRUE)
By default, all coefficients will be interacted and added to the indicator list, but this can be controlled using the csis_var argument, which takes a character vector of column names, e.g. csis_var = "lgdp".
csis_example2 <- isatpanel(data = EUCO2residential, formula = lagg.directem_pc ~ lgdp + I(lgdp^2) + pop, index = c("country","year"), effect = "twoways", csis = TRUE, csis_var = "lgdp")
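To show what "the interaction between jsis and the relevant coefficient" looks like in the data, here is a small base R sketch. The data, the variable x and the column labels are all made up; the sketch only illustrates the idea and is not taken from the package internals.

```r
set.seed(1)
# Toy panel with one regressor x (hypothetical data)
toy <- expand.grid(year = 2001:2004, id = c("A", "B"))
toy$x <- rnorm(nrow(toy))

# Common step indicators, as for jsis
steps <- sapply(2002:2004, function(s) as.numeric(toy$year >= s))

# Interact every step with x: the vector is recycled down each column
csis_x <- steps * toy$x
colnames(csis_x) <- paste0("x.csis", 2002:2004)  # hypothetical labels
```

A retained indicator such as x.csis2003 would then capture a shift in the coefficient on x from 2003 onwards that applies to all units.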
The fesis method is equivalent to supplying a constant to the mxbreak argument in the old method. This essentially breaks the group-specific intercept, i.e. the individual fixed effect.
fesis_example <- isatpanel(data = EUCO2residential, formula = lagg.directem_pc ~ lgdp + I(lgdp^2) + pop, index = c("country","year"), effect = "twoways", fesis = TRUE)
plot(fesis_example)
Similar to the csis_var idea, we can specify the fesis method for a subset of individuals as well using the fesis_id argument, which takes a character vector of individuals. In this case we can use e.g. fesis_id = c("Austria","Denmark").
fesis_example2 <- isatpanel(data = EUCO2residential, formula = lagg.directem_pc ~ lgdp + I(lgdp^2) + pop, index = c("country","year"), effect = "twoways", fesis = TRUE, fesis_id = c("Austria","Denmark"))
plot(fesis_example2)
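A break in the individual fixed effect can be thought of as a step dummy that is switched on for one unit only. The base R sketch below builds one such indicator on made-up data; fesis_ind is a hypothetical helper written for this illustration and is not a function of the package.

```r
# Toy panel: 2 units observed over 4 periods (hypothetical data)
toy <- expand.grid(year = 2001:2004, id = c("A", "B"))

# Hypothetical helper: 1 for the given unit from the break year onwards, else 0
fesis_ind <- function(data, unit, break_year) {
  as.numeric(data$id == unit & data$year >= break_year)
}

# Step in the intercept of unit "A" starting in 2003
toy$fesisA.2003 <- fesis_ind(toy, "A", 2003)
toy
```

Restricting the search with fesis_id then simply amounts to building such indicators only for the listed units.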
The options for the robust_isatpanel function are to use HAC standard errors or a standard White standard error correction (with the option of clustering the standard errors within groups or time):
robust_isatpanel(fesis_example, HAC = TRUE, robust = TRUE, cluster = "group")
The cfesis method combines the csis and the fesis approach and detects whether coefficients for individual units break over time.
This means we can also combine the subsetting in both the variable and in the individual units using cfesis_id and cfesis_var.
cfesis_example <- isatpanel(data = EUCO2residential, formula = lagg.directem_pc ~ lgdp + I(lgdp^2) + pop, index = c("country","year"), effect = "twoways", cfesis = TRUE, cfesis_id = c("Belgium","Germany"), cfesis_var = "lgdp", t.pval = 0.001)
plot(cfesis_example)
The ar argument
It is now possible to include autoregressive coefficients using the ar argument.
fesis_ar1_example <- isatpanel(data = EUCO2residential, formula = lagg.directem_pc ~ lgdp + I(lgdp^2) + pop, index = c("country","year"), effect = "twoways", fesis = TRUE, ar = 1)
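In a panel, a lag has to be taken within each unit so that the first observation of one unit does not pick up the last value of another. The following base R sketch shows this on made-up data, assuming that ar = 1 corresponds to adding the first within-unit lag of the dependent variable as a regressor (the data and the column name y_lag1 are hypothetical).

```r
# Toy panel (hypothetical data)
toy <- data.frame(id = rep(c("A", "B"), each = 4),
                  year = rep(2001:2004, times = 2),
                  y = c(1, 2, 3, 4, 10, 20, 30, 40))

# Sort by unit and time, then shift y by one period within each unit
toy <- toy[order(toy$id, toy$year), ]
toy$y_lag1 <- ave(toy$y, toy$id, FUN = function(v) c(NA, head(v, -1)))
toy
```

The first observation of every unit has no lag and is therefore missing, so including autoregressive terms shortens the usable sample by one period per unit and lag.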
The engine argument
Another new argument is the engine argument. This allows us to use an external package to estimate our models. At this stage, the fixest package can be used.
This also means that we can now cluster standard errors using the cluster argument. The following few chunks are not executed by default in the vignette.
fixest_example <- isatpanel(data = EUCO2residential, formula = lagg.directem_pc ~ lgdp + I(lgdp^2) + pop, index = c("country","year"), effect = "twoways", fesis = TRUE, engine = "fixest", cluster = "none")
We can verify that, without any clustering of standard errors, using the fixest package does not change our estimates:
head(fixest_example$isatpanel.result$mean.results)
Compared to the default estimator:
head(is_lm$isatpanel.result$mean.results)
However, changing the cluster specification of course does. The standard error correction in its current implementation is not valid and retains many more indicators than are truly present, so clustering is currently not recommended.
fixest_example_cluster <- isatpanel(data = EUCO2residential, formula = lagg.directem_pc ~ lgdp + I(lgdp^2) + pop, index = c("country","year"), effect = "twoways", fesis = TRUE, engine = "fixest", cluster = "individual")
plot(fixest_example_cluster)