Matched Set Objects

knitr::opts_chunk$set(echo = TRUE)

This vignette will take a more detailed look at the matched.set object, which is the core object within the package that captures all of the information associated with treated units and their matched control units.

First, we will create a smaller subset of the dem data set, which is included in the package. This is just to make some of our results easier to read. We then use the DisplayTreatment function to get a sense of treatment variation within the subset of data.

library(PanelMatch)
uid <-unique(dem$wbcode2)[1:10]
subdem <- dem[dem$wbcode2 %in% uid, ]
DisplayTreatment(unit.id = "wbcode2", time.id = "year", treatment = 'dem', data = subdem)

PanelMatch and matched.set objects

We can use the PanelMatch function with refinement.method set to none to obtain a PanelMatch object, from which we will extract a matched.set object. PanelMatch returns an S3 object of the PanelMatch class. These objects are just lists with some additional attributes. Here, we will focus on one element contained within PanelMatch objects: matched.set objects. Within the PanelMatch object, this element is always named either att, art, or atc reflecting the name of the specified qoi. When qoi = ate, then there are two matched.set objects included in the resulting PanelMatch call. Specifically, there will be two matched sets named att and atc, respectively.

PM.results <- PanelMatch(lag = 4, time.id = "year", unit.id = "wbcode2", 
                         treatment = "dem", refinement.method = "none", 
                         data = subdem, match.missing = TRUE, 
                         qoi = "att" ,outcome.var = "y",
                         lead = 0, forbid.treatment.reversal = FALSE)

#Extract the matched.set object 
msets <- PM.results$att                         

What are matched.set objects?

In implementation, the matched.set is just a named list with some added attributes (e.g. lag window size, names of treatment, unit, and time variables) and a structured name scheme. Each entry in the list is a vector containing the unit ids of control units that are in a matched set. Additionally, each entry corresponds to a time/unit id pair (the unit id of a treated unit and the time at which treatment occurred). This is reflected in the names of each element of the list, as the name scheme [id varable].[time variable] is used.

Matched set objects are implemented as lists, but the default printing behavior resembles that of a data frame. One can toggle a verbose option on the print method to print as a list and also display a less summarized version of the matched set data.

# observe the naming scheme
names(msets)

#data frame printing view: useful as a summary view with large data sets
# first column is unit id variable, second is time variable, and 
# third is the number of controls in that matched set
print(msets)

# prints as a list, shows all data at once
print(msets, verbose = TRUE)

Control unit weights

Note that in the verbose print view above, one can see that each control unit will have an associated weight. These are the weights that are assigned from the refinement process. In this example, no refinement has been applied so each control unit in a matched set will have equal weight. See the Using PanelMatch vignette for more about refinement.

Weights are attributes for each element in the matched.set list. As such, they can also be extracted as follows:

# weights for control units in first matched set
attr(msets[[1]], "weights")

Note that this returns a vector of weights. The names of each element in the vector corresponds to the control unit that weight is associated with. Weights are normalized should always sum to 1 within each matched set.

Subsetting matched.set objects.

The '[' and '[[' operators are implemented for matched.set objects and should work intuitively.

Using '[' returns a subsetted matched.set object (list). The additional attributes will be copied and transferred as well with the custom operator. Note how, by default, it prints like the full form of the matched.set. Using '[[' will return the unit ids of the control units in the specified matched set.

Since matched.set objects are just lists with attributes, you can expect the [ and [[ functions to work similarly to how they would with a list. So, for instance, users can extract information about matched sets using numerical indices or by taking advantage of the naming scheme.

msets[1]

#prints the control units in this matched set
msets[[1]]


msets["4.1992"] #equivalent to msets[1]

msets[["4.1992"]] #equivalent to msets[[1]]

Plotting with matched.set objects

Calling plot on a matched.set object will display a histogram of the sizes of the matched sets. By default, the number of empty matched sets (treated unit/time id pairs with no suitable controls for a match) is noted with a vertical bar at x = 0. One can include empty sets in the histogram by setting the include.empty.sets argument to TRUE.

plot(msets, xlim = c(0, 4))

# Use full data
PM.results.full <- PanelMatch(lag = 4, time.id = "year", unit.id = "wbcode2", 
                         treatment = "dem", refinement.method = "none", 
                         data = dem, match.missing = TRUE, 
                         qoi = "att" ,outcome.var = "y",
                         lead = 0, forbid.treatment.reversal = FALSE)

#Extract the matched.set object 
plot(PM.results.full$att)

Summarizing matched set data

The summary function provides a variety of information about the sizes of matched sets, the unit and time ids of treated units, the number of empty sets, and the lag window size. The summary function also has an option to print only the overview data frame. Toggle this by setting verbose = FALSE

print(summary(msets))

print(summary(msets, verbose = FALSE))

Matched sets with the DisplayTreatment function

Passing a matched set (one treated unit and its corresponding set of controls) to the DisplayTreatment function will visually highlight the lag window histories used to create that matched set. There is also an option to only display units from the matched set (and the treated unit), which can be achieved by setting show.set.only to TRUE.

#pass matched.set object using the `[` operator
DisplayTreatment(unit.id = "wbcode2", time.id = "year", treatment = 'dem', data = subdem, matched.set = msets[1])

# only show matched set units
DisplayTreatment(unit.id = "wbcode2", time.id = "year", treatment = 'dem', 
                    data = subdem, matched.set = msets[1], 
                    show.set.only = TRUE, y.size = 15, x.size = 13)


Try the PanelMatch package in your browser

Any scripts or data that you put into this service are public.

PanelMatch documentation built on June 22, 2024, 10:32 a.m.