synthetic_control: synthetic_control

View source: R/main.R

synthetic_control    R Documentation

synthetic_control

Description

synthetic_control() declares the input data frame for use in the synthetic control method. It specifies the panel units along with the intervention (treated) unit and time. All units other than the designated treated unit enter the donor pool from which the synthetic control is generated. All time points up to and including the intervention time are designated as the pre-intervention period; all later time points form the post-intervention period.
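The partition described above can be illustrated with a minimal base R sketch. The toy panel, its column names, and its values are hypothetical, and the code only mimics the rule stated in the description; it is not the package's internal implementation.

```r
# Toy long-format panel (hypothetical values, not the smoking data)
panel <- data.frame(
  unit = rep(c("A", "B", "C"), each = 4),
  time = rep(2001:2004, times = 3),
  y    = c(10, 11, 12, 9, 8, 9, 10, 11, 7, 7, 8, 9)
)

i_unit <- "A"    # treated unit (assumed for illustration)
i_time <- 2002   # intervention time (assumed for illustration)

# Every unit other than the treated unit belongs to the donor pool
donor_pool <- setdiff(unique(panel$unit), i_unit)

# Times up to and including i_time are pre-intervention; later times are post
panel$period <- ifelse(panel$time <= i_time, "pre", "post")

print(donor_pool)
print(table(panel$period))
```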

Usage

synthetic_control(
  data = NULL,
  outcome = NULL,
  unit = NULL,
  time = NULL,
  i_unit = NULL,
  i_time = NULL,
  generate_placebos = TRUE
)

Arguments

data

Panel data frame in long format (i.e. the unit of analysis is the unit-time period, such as country-year) containing both the treated unit and the control (donor pool) units. Any units or time periods that should not be part of the donor pool must be excluded before passing the data to synthetic_control().
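As a rough sketch of the preparation step described here, the following base R code drops an unwanted unit and checks that each unit-time pair appears only once. The data frame, the unit names, and the choice of which unit to exclude are all hypothetical.

```r
# Toy long-format panel: one row per unit-time period (hypothetical values)
panel <- data.frame(
  unit = rep(c("A", "B", "C", "D"), each = 3),
  time = rep(2001:2003, times = 4),
  y    = c(5, 6, 7, 4, 4, 5, 9, 8, 8, 1, 2, 3)
)

# Hypothetical unit judged unsuitable as a donor
excluded <- c("D")

# Remove it before the data would be handed to synthetic_control()
panel_clean <- panel[!panel$unit %in% excluded, ]

# Long format implies no duplicated unit-time combinations
is_long_panel <- !any(duplicated(panel_clean[c("unit", "time")]))

print(unique(panel_clean$unit))
print(is_long_panel)
```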

outcome

Name of the outcome variable. Outcome variable should be a continuous measure that is observed across multiple time points.

unit

Name of the case unit variable in the panel data.

time

Name of the time unit variable in the panel data.

i_unit

Name of the treated case unit where the intervention occurred.

i_time

Time period in which the intervention occurred.

generate_placebos

Logical flag requesting that placebo versions of the data be generated for downstream inferential methods. When TRUE, a version of the nested data is generated in which each control unit in turn is treated as the intervention unit. Default is TRUE.

Details

Note that synthetic_control() also allows for the simultaneous generation of placebo units (i.e. versions of the data in which one of the control units is treated as the intervention unit). Adding placebo units increases computation time, since a synthetic control must be generated for each placebo unit, but it allows for inference as outlined in Abadie et al. 2010.
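To make the computational cost concrete, the sketch below enumerates the fits implied by placebo generation: one for the treated unit plus one per control unit. The unit names are a hypothetical subset, and the small table only borrows the .id and .placebo naming for illustration; it is not the structure the function returns.

```r
# Hypothetical set of units; "California" is the actual treated unit
units  <- c("California", "Utah", "Nevada", "Texas")
i_unit <- "California"

# Each control unit takes a turn as the (placebo) intervention unit
placebo_units <- setdiff(units, i_unit)

# One synthetic control per row: the real case first, then the placebos
runs <- data.frame(
  .id      = c(i_unit, placebo_units),
  .placebo = c(0, rep(1, length(placebo_units)))
)

print(runs)
print(nrow(runs))  # number of synthetic controls that would need fitting
```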

Value

tbl_df with nested fields containing the following:

  • .id: unit id for the intervention case (this differs from the specified treated unit when the entry belongs to a placebo unit).

  • .placebo: indicator field taking on the value of 1 if a unit is a placebo unit, 0 if it's the specified treated unit.

  • .type: type of the nested data construct: treated or controls. Keeps track of which data construct is located in the .outcome field.

  • .outcome: nested data construct containing the outcome variable configured for the synthetic control method. Data is configured into a wide format for the optimization task.

  • .original_data: original input data filtered by treated or control units. This allows for easy processing downstream when generating predictors.

  • .meta: stores information regarding the unit and time index, the treated unit and time and the name of the outcome variable. Used downstream in subsequent functions.
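The wide format mentioned for .outcome can be pictured with a small base R sketch that reshapes a long panel so that each unit becomes its own column. The toy data are hypothetical, and the layout shown (one row per time point, one column per unit) is an assumption; the exact column layout the package uses may differ.

```r
# Toy long-format panel (hypothetical values)
panel <- data.frame(
  unit = rep(c("A", "B", "C"), each = 4),
  time = rep(2001:2004, times = 3),
  y    = c(10, 11, 12, 9, 8, 9, 10, 11, 7, 7, 8, 9)
)

# Long to wide: one row per time point, one outcome column per unit
wide <- reshape(panel,
                idvar     = "time",
                timevar   = "unit",
                v.names   = "y",
                direction = "wide")

print(wide)
```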

Examples


############################
###### Basic Example #######
############################


# Load tidysynth
library(tidysynth)

# Smoking example data
data(smoking)

# Initialize the synthetic control object
smoking_out <-
  smoking %>%
  synthetic_control(outcome = cigsale,
                    unit = state,
                    time = year,
                    i_unit = "California",
                    i_time = 1988,
                    generate_placebos = FALSE)

# data configuration
dplyr::glimpse(smoking_out)

# Grab the organized outcome variables
smoking_out %>% grab_outcome(type = "treated")
smoking_out %>% grab_outcome(type = "controls")


###################################
####### Full implementation #######
###################################


# Smoking example data
data(smoking)

smoking_out <-
smoking %>%

  # Initialize the synthetic control object
  synthetic_control(outcome = cigsale,
                    unit = state,
                    time = year,
                    i_unit = "California",
                    i_time = 1988,
                    generate_placebos = FALSE) %>%

# Generate the aggregate predictors used to generate the weights
  generate_predictor(time_window=1980:1988,
                     lnincome = mean(lnincome, na.rm = TRUE),
                     retprice = mean(retprice, na.rm = TRUE),
                     age15to24 = mean(age15to24, na.rm = TRUE)) %>%

  generate_predictor(time_window=1984:1988,
                     beer = mean(beer, na.rm = TRUE)) %>%

  generate_predictor(time_window=1975,
                     cigsale_1975 = cigsale) %>%

  generate_predictor(time_window=1980,
                     cigsale_1980 = cigsale) %>%

  generate_predictor(time_window=1988,
                     cigsale_1988 = cigsale) %>%


  # Generate the fitted weights for the synthetic control
  generate_weights(optimization_window = 1970:1988,
                   Margin.ipop = .02, Sigf.ipop = 7, Bound.ipop = 6) %>%

  # Generate the synthetic control
  generate_control()

# Plot the observed and synthetic trend
smoking_out %>% plot_trends(time_window = 1970:2000)





tidysynth documentation built on May 31, 2023, 6:13 p.m.