Implements the generalized synthetic control method based on interactive fixed effect models.
gsynth(formula = NULL, data, Y, D, X = NULL, na.rm = FALSE,
       index, weight = NULL, force = "unit", cl = NULL, r = 0,
       lambda = NULL, nlambda = 10, CV = TRUE, criterion = "mspe",
       k = 5, EM = FALSE, estimator = "ife",
       se = FALSE, nboots = 200,
       inference = "nonparametric", cov.ar = 1, parallel = TRUE,
       cores = NULL, tol = 0.001, seed = NULL, min.T0 = 5,
       alpha = 0.05, normalize = FALSE)

formula 
an object of class "formula": a symbolic description of the model to be fitted. 
data 
a data frame. The treatment must be dichotomous; the panel does not need to be balanced. 
Y 
outcome. 
D 
treatment. 
X 
time-varying covariates. 
na.rm 
a logical flag indicating whether to listwise delete missing data. The algorithm will report an error if missing data exist. 
index 
a two-element string vector specifying the unit (group) and time indicators. Must be of length 2. 
weight 
a string specifying the weighting variable (if any) used to estimate
the weighted average treatment effect. Default is NULL. 
force 
a string indicating whether unit or time fixed effects will be imposed. Must be one of the following: "none", "unit", "time", or "two-way". The default is "unit". 
cl 
a string indicating the cluster variable. The default value is NULL. 
r 
an integer specifying the number of factors. If 
lambda 
a single positive number or a sequence of positive numbers specifying the
hyperparameter sequence for matrix completion method. If 
nlambda 
an integer specifying the length of the hyperparameter sequence
for the matrix completion method. Default is 10. 
CV 
a logical flag indicating whether cross-validation will be
performed to select the optimal number of factors or hyperparameter
in matrix completion algorithm. If 
criterion 
a string specifying the criterion used for determining the number
of factors. Choose from 
k 
a positive integer specifying the number of cross-validation rounds for the matrix
completion algorithm. Default is 5. 
EM 
a logical flag indicating whether an Expectation Maximization algorithm will be used (Gobillon and Magnac 2016). 
estimator 
a string that controls the estimation method, either "ife" (interactive fixed effects) or "mc" (the matrix completion method). 
se 
a logical flag indicating whether uncertainty estimates will be produced. 
nboots 
an integer specifying the number of bootstrap
runs. Ignored if 
inference 
a string specifying which type of inferential method
will be used, either "parametric" or "nonparametric". "parametric" is
recommended when the number of treated units is small. The parametric bootstrap
is not valid for matrix completion method. Ignored if 
cov.ar 
an integer specifying the order of the autoregressive process that the residuals follow. Used in the parametric bootstrap procedure when the data form an unbalanced panel. The default value is 1. 
parallel 
a logical flag indicating whether parallel computing
will be used in bootstrapping and/or cross-validation. Ignored if

cores 
an integer indicating the number of cores to be used in parallel computing. If not specified, the algorithm will use the maximum number of logical cores of your computer (warning: this could prevent you from multitasking on your computer). 
tol 
a positive number indicating the tolerance level. 
seed 
an integer that sets the seed in random number
generation. Ignored if 
min.T0 
an integer specifying the minimum number of pre-treatment
periods. Treated units with fewer pre-treatment periods than this will
be removed automatically. This item is important for unbalanced panels.
If users want to perform cross validation procedure to select the optimal
number of factors from 
alpha 
a positive number between 0 and 1 specifying the significance
level for uncertainty estimates. The default value is 0.05. 
normalize 
a logical flag indicating whether to scale the outcome and
covariates. Useful for speeding up computation when the magnitude of
the data is large. The default is FALSE. 
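The min.T0 rule described above can be sketched in base R. This is only an illustration of the rule on a made-up panel, not the package's own code; the data frame `panel`, the helper variables and the threshold name `min_T0` are assumptions introduced for the example.

```r
# Hypothetical long-format panel: unit "a" is treated from period 7,
# unit "b" from period 3, and unit "c" is never treated.
panel <- data.frame(
  id   = rep(c("a", "b", "c"), each = 8),
  time = rep(1:8, times = 3),
  D    = c(rep(0, 6), rep(1, 2),   # a: 6 pre-treatment periods
           rep(0, 2), rep(1, 6),   # b: 2 pre-treatment periods
           rep(0, 8))              # c: control unit
)
min_T0 <- 5  # mirrors the default value of min.T0

treated <- tapply(panel$D, panel$id, max) == 1   # was the unit ever treated?
n_pre   <- tapply(panel$D == 0, panel$id, sum)   # untreated periods per unit

# Treated units with fewer than min_T0 pre-treatment periods are dropped.
dropped <- names(n_pre)[treated & n_pre < min_T0]
kept    <- setdiff(unique(panel$id), dropped)
```

In this toy panel only unit "b" falls below the threshold, so it is the one that would be removed before estimation.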
gsynth implements the generalized synthetic control method. It
imputes counterfactuals for each treated unit using control group
information based on a linear interactive fixed effects model that
incorporates unit-specific intercepts interacted with time-varying
coefficients. It generalizes the synthetic control method to the case
of multiple treated units and variable treatment periods, and improves
efficiency and interpretability. It allows the treatment to be
correlated with unobserved unit and time heterogeneities under
reasonable modeling assumptions. With a built-in cross-validation
procedure, it avoids specification searches and thus is easy to
implement. The treatment must be dichotomous.
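To make the model behind this description concrete, the following base R sketch simulates a small panel from a linear interactive fixed effects model with a dichotomous treatment. All sizes, the constant effect `delta` and the variable names are assumptions chosen for illustration; they are not taken from the package.

```r
# Toy simulation of a linear interactive fixed effects model (illustrative only).
set.seed(1)
N <- 10; TT <- 8; T0 <- 5; Ntr <- 3; r <- 2   # hypothetical sizes

lambda <- matrix(rnorm(N * r), N, r)    # unit-specific loadings (N x r)
f      <- matrix(rnorm(TT * r), TT, r)  # time-varying factors (TT x r)
alpha  <- rnorm(N)                      # unit fixed effects
xi     <- rnorm(TT)                     # time fixed effects

# Dichotomous treatment: the first Ntr units are treated after period T0.
D <- matrix(0, TT, N)
D[(T0 + 1):TT, 1:Ntr] <- 1

delta <- 2                                      # constant effect, assumed
Y0 <- f %*% t(lambda) + outer(xi, alpha, "+")   # TT x N untreated outcome
Y  <- Y0 + delta * D + matrix(rnorm(TT * N, sd = 0.1), TT, N)

# Long format with one row per unit-period, the layout a panel data frame takes.
panel <- data.frame(id = rep(1:N, each = TT), time = rep(1:TT, times = N),
                    Y = as.vector(Y), D = as.vector(D))
```

The counterfactual `Y0` for treated cells is what the method tries to impute from the control units.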
Y.dat 
a matrix storing data of the outcome variable. 
Y 
name of the outcome variable. 
D 
name of the treatment variable. 
X 
name of the time-varying control variables. 
index 
name of the unit and time indicators. 
id 
a vector of unit IDs. 
time 
a vector of time periods. 
obs.missing 
a matrix storing status of each unit at each time point.

id.tr 
a vector of IDs for the treatment units. 
id.co 
a vector of IDs for the control units. 
removed.id 
a vector of IDs for units that are removed. 
D.tr 
a matrix of treatment indicators for the treated unit outcome. 
I.tr 
a matrix of observation indicators for the treated unit outcome. 
Y.tr 
data of the treated unit outcome. 
Y.ct 
predicted counterfactuals for the treated units. 
Y.co 
data of the control unit outcome. 
eff 
difference between actual outcome and predicted Y(0). 
Y.bar 
average values of Y.tr, Y.ct, and Y.co over time. 
att 
average treatment effect on the treated over time (it is averaged based on the timing of the treatment if it is different for each unit). 
att.avg 
average treatment effect on the treated. 
force 
user specified 
sameT0 
TRUE if the timing of the treatment is the same. 
T 
the number of time periods. 
N 
the total number of units. 
p 
the number of time-varying observables. 
Ntr 
the number of treated units. 
Nco 
the number of control units. 
T0 
a vector that stores the timing of the treatment for balanced panel data. 
tr 
a vector indicating treatment status for each unit. 
pre 
a matrix indicating the pre-treatment/non-treatment status. 
post 
a matrix indicating the post-treatment status. 
r.cv 
the number of factors included in the model, either supplied by users or automatically chosen via cross-validation. 
lambda.cv 
the optimal hyperparameter in the matrix completion method, chosen via cross-validation. 
res.co 
residuals of the control group units. 
beta 
coefficients of time-varying observables from the interactive fixed effect model. 
sigma2 
the mean squared error of interactive fixed effect model. 
IC 
the information criterion. 
PC 
the proposed criterion for determining factor numbers. 
est.co 
result of the interactive fixed effect model based on
the control group data. An 
eff.cnt 
difference between actual outcome and predicted Y(0); rearranged based on the timing of the treatment. 
Y.tr.cnt 
data of the treated unit outcome, rearranged based on the timing of the treatment. 
Y.ct.cnt 
data of the predicted Y(0), rearranged based on the timing of the treatment. 
MSPE 
mean squared prediction error of the cross-validated model. 
CV.out 
result of the cross-validation procedure. 
niter 
the number of iterations in the estimation of the interactive fixed effect model. 
factor 
estimated time-varying factors. 
lambda.co 
estimated loadings for the control group. 
lambda.tr 
estimated loadings for the treatment group. 
wgt.implied 
estimated weights of each control group unit for each treatment group unit. 
mu 
estimated grand mean. 
xi 
estimated time fixed effects. 
alpha.tr 
estimated unit fixed effects for the treated units. 
alpha.co 
estimated unit fixed effects for the control units. 
validX 
a logical value indicating whether multicollinearity exists. 
inference 
a string indicating bootstrap procedure. 
est.att 
inference for 
est.att.avg 
inference for 
est.beta 
inference for 
est.ind 
inference for 
att.avg.boot 
bootstrap results for 
att.boot 
bootstrap results for 
beta.boot 
bootstrap results for 
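The relationship between Y.tr, Y.ct, eff, att and att.avg described in the list above can be illustrated with toy matrices. The numbers are made up, all treated units are assumed to share the same treatment timing, and the way att.avg is averaged here (over all post-treatment cells) is one plausible reading rather than a statement of the package's internals.

```r
# Toy matrices: 4 periods (rows) by 2 treated units (columns); values are made up.
Y.tr <- matrix(c(1.0, 1.2, 3.1, 4.0,
                 0.9, 1.1, 2.9, 3.6), nrow = 4)
Y.ct <- matrix(c(1.1, 1.0, 1.1, 1.0,
                 1.0, 1.0, 0.9, 1.1), nrow = 4)
post <- matrix(c(0, 0, 1, 1,
                 0, 0, 1, 1), nrow = 4)   # 1 marks post-treatment cells

eff     <- Y.tr - Y.ct                    # actual outcome minus predicted Y(0)
att     <- rowMeans(eff)                  # average effect in each period
att.avg <- sum(eff * post) / sum(post)    # average over post-treatment cells
```

With these values the per-period averages are close to zero before treatment and positive afterwards, which is the pattern a by-period ATT series is meant to show.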
Yiqing Xu <yiqingxu@stanford.edu>, Stanford University
Licheng Liu <liulch@mit.edu>, M.I.T.
Laurent Gobillon and Thierry Magnac, 2016. "Regional Policy Evaluation: Interactive Fixed Effects and Synthetic Controls." The Review of Economics and Statistics, July 2016, Vol. 98, No. 3, pp. 535–551.
Yiqing Xu. 2017. "Generalized Synthetic Control Method: Causal Inference with Interactive Fixed Effects Models." Political Analysis, Vol. 25, Iss. 1, January 2017, pp. 57–76.
Susan Athey, Mohsen Bayati, Nikolay Doudchenko, et al. 2017. "Matrix Completion Methods for Causal Panel Data Models." arXiv preprint arXiv:1710.10251.
For more details, see https://yiqingxu.org/packages/gsynth/gsynth_examples.html.
For more details about the matrix completion method, see https://github.com/susanathey/MCPanel.
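A call consistent with the Call echoed in the output below can be sketched as follows. It assumes that the gsynth package is installed, that `data(gsynth)` loads the simulated panel `simdata`, and that the two-way fixed effects option is spelled "two-way"; the guard around the call is an addition for illustration so the snippet does nothing when the package is missing.

```r
# Sketch of the example call; it runs only when the gsynth package is available.
if (requireNamespace("gsynth", quietly = TRUE)) {
  library(gsynth)
  data(gsynth)   # assumed to load the simulated panel `simdata`
  out <- gsynth(Y ~ D + X1 + X2, data = simdata,
                index = c("id", "time"), force = "two-way",
                r = c(0, 5), CV = TRUE, se = FALSE, parallel = FALSE)
  print(out)     # prints the call, the ATT and the covariate coefficients
}
```

Setting se = TRUE would additionally request uncertainty estimates, at the cost of running the bootstrap.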
Cross-validating ...
r = 0; sigma2 = 4.43251; IC = 1.48897; MSPE = 2.37280
r = 1; sigma2 = 1.42977; IC = 0.75261; MSPE = 1.71737
r = 2; sigma2 = 0.93928; IC = 0.71688; MSPE = 1.14525*
r = 3; sigma2 = 0.88977; IC = 1.03647; MSPE = 1.15013
r = 4; sigma2 = 0.83864; IC = 1.34035; MSPE = 1.21356
r = 5; sigma2 = 0.79605; IC = 1.64062; MSPE = 1.23830
r* = 2
Call:
gsynth.formula(formula = Y ~ D + X1 + X2, data = simdata, index = c("id",
"time"), force = "two-way", r = c(0, 5), CV = TRUE, se = FALSE,
parallel = FALSE)
Average Treatment Effect on the Treated:
[1] 5.543
~ by Period (including Pretreatment Periods):
[1] 0.392362 0.276705 0.274839 0.440915 0.889404 0.593545 0.527967
[8] 0.170937 0.611304 0.170545 0.272413 0.094706 0.651881 0.573748
[15] 0.469730 0.077895 0.141584 0.156434 0.914963 0.003591 1.236193
[22] 1.629796 2.711476 3.466495 5.739896 5.280624 8.435863 7.839314
[29] 9.454987 9.638255
Coefficients for the Covariates:
[,1]
X1 1.021
X2 3.052
Uncertainty estimates not available.
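The printed output can be checked with a few lines of base R. The values are copied from the output above. That the starred row is the one with the smallest MSPE, and that the last ten periods are the post-treatment periods, are assumptions made here because the output does not say so explicitly.

```r
# Values copied from the cross-validation table above.
r    <- 0:5
mspe <- c(2.37280, 1.71737, 1.14525, 1.15013, 1.21356, 1.23830)
r_star <- r[which.min(mspe)]   # number of factors with the smallest MSPE

# Last ten entries of the by-period ATT series printed above,
# assumed here to be the post-treatment periods.
att_tail <- c(1.236193, 1.629796, 2.711476, 3.466495, 5.739896,
              5.280624, 8.435863, 7.839314, 9.454987, 9.638255)
att_avg <- round(mean(att_tail), 3)
```

Under these assumptions `r_star` agrees with the reported r* = 2 and `att_avg` agrees with the reported average treatment effect on the treated of 5.543.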