
View source: R/crossvalidation.R

Reports treatment effects and cross-validation errors for estimators of the same form as the matching and synthetic control (masc) estimator of Kellogg, Mogstad, Pouliot, and Torgovitsky (2019). A masc-type estimator is defined by a synthetic control estimator, a matching estimator (`m`), and a weight (`phi`). For each such combination, this function returns the output associated with the masc estimator that places a weight of `phi` on the matching estimator and `1 - phi` on the synthetic control estimator.
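To make the weighting concrete, here is a minimal sketch of the convex combination described above. The forecast vectors and the helper `masc_pred()` are hypothetical and exist only for illustration; they are not part of the package, which combines the estimators internally.

```r
# Hypothetical forecasts of the treated unit's outcome from each estimator
# (illustrative numbers only, not taken from the package or the paper).
match_pred <- c(5.0, 5.2, 5.4)  # matching (nearest-neighbor) forecast
sc_pred    <- c(4.6, 4.9, 5.3)  # synthetic control forecast

# masc-type forecast: weight phi on matching, (1 - phi) on synthetic control.
# masc_pred() is a made-up helper name for this sketch.
masc_pred <- function(phi, match_pred, sc_pred) {
  stopifnot(phi >= 0, phi <= 1)
  phi * match_pred + (1 - phi) * sc_pred
}

masc_pred(0, match_pred, sc_pred)     # pure synthetic control
masc_pred(1, match_pred, sc_pred)     # pure matching
masc_pred(0.25, match_pred, sc_pred)  # a blend, closer to synthetic control
```

At `phi = 0` the result coincides with the synthetic control forecast and at `phi = 1` with the matching forecast; intermediate values interpolate linearly between the two.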

```
masc_by_phi(
  treated,
  donors,
  treated.covariates = NULL,
  donors.covariates = NULL,
  treatment = NULL,
  sc_est = sc_estimator,
  match_est = NearestNeighbors,
  tune_pars = list(min_preperiods = NULL, set_f = NULL, m = NULL,
    phis = seq(from = 0, to = 1, length.out = 100)),
  cv_pars = list(forecast.minlength = 1, forecast.maxlength = 1),
  treatinterval = NULL,
  ...
)
```

`treated` |
A |

`donors` |
A |

`treatment` |
An integer. The period T' in which forecasting begins.
If |

`sc_est` |
A |

`tune_pars` |
A
- m: a vector of integers. Denotes the set of nearest neighbor estimators from which we are allowed to pick. E.g., `tune_pars$m=c(1,3,5)` would allow us to pick from 1-NN, 3-NN, or 5-NN. Alternatively, `tune_pars$m` permits a logical vector. In this case, e.g., `tune_pars$m=c(FALSE,TRUE,TRUE)` would allow us to pick from 2-NN or 3-NN. If `NULL`, we default to allowing all possible nearest neighbor estimators.
- min_preperiods: an integer. The smallest number of estimation periods allowed in a fold used for cross-validation. We use all folds from fold `min_preperiods` up to the latest possible fold, `treatment-2`.
- set_f: a `list` containing a single element, a vector of integers. Identifies the set of folds used for cross-validation. As above, each integer identifies a fold by the last time period it uses in estimation. E.g., `set_f=c(7,8,9)` would implement cross-validation using fold 7, fold 8, and fold 9.
If neither |
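The two ways of specifying `m` and the fold range implied by `min_preperiods` can be sketched as follows. This only restates the semantics documented above in plain R; it is not the package's internal code, and the variable names are illustrative.

```r
# Integer form: candidate nearest-neighbor counts are given directly.
m_int <- c(1, 3, 5)            # allows 1-NN, 3-NN, or 5-NN

# Logical form: position i is TRUE when the i-NN estimator is allowed.
m_lgl <- c(FALSE, TRUE, TRUE)
which(m_lgl)                   # 2 3, i.e. 2-NN or 3-NN

# Folds implied by min_preperiods: each fold is identified by the last
# period it uses in estimation, running up to treatment - 2.
treatment <- 16
min_preperiods <- 8
folds <- seq(from = min_preperiods, to = treatment - 2)
folds                          # 8 9 10 11 12 13 14
```

With `treatment = 16` and `min_preperiods = 8`, as in the Basque example further down, this reading gives seven folds.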

`treatinterval` |
A vector. Indicates the post-treatment periods used when reporting average
treatment effects in the column |

`nogurobi` |
A logical value. If true, uses LowRankQP to solve the synthetic control estimator,
rather than |

`phivals` |
A vector of real values between 0 and 1. Indexes a weighted
average of the synthetic control estimator with a matching estimator, where |

Returns a `data.frame` with each row defined by a value of `m` and `phi`, taken respectively from `tune_pars$m` and `tune_pars$phis`. The columns `cv.error` and `pred` return, respectively, the cross-validation error and a measure of prediction error (AKA treatment effect) associated with the masc estimator defined by `m` and `phi`.
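As a rough illustration of how such a table might be used, the sketch below picks the row with the smallest cross-validation error. The data frame is a hypothetical stand-in with made-up numbers; the column names `m` and `phi` are assumptions based on the description above, and only `cv.error` and `pred` are named explicitly in this documentation.

```r
# Hypothetical stand-in for the returned data.frame (made-up numbers).
phi_table <- data.frame(
  m        = 3,
  phi      = c(0, 0.25, 0.5, 0.75, 1),
  cv.error = c(0.40, 0.22, 0.18, 0.25, 0.37),
  pred     = c(-0.60, -0.66, -0.71, -0.78, -0.85)
)

# Row with the smallest cross-validation error.
best <- phi_table[which.min(phi_table$cv.error), ]
best$phi   # 0.5
best$pred  # -0.71
```

Note that `which.min()` returns the first minimum, so ties in `cv.error` resolve toward the smaller `phi` when rows are ordered by `phi`.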

Kellogg, M., M. Mogstad, G. Pouliot, and A. Torgovitsky. Combining Matching and Synthetic Control to Trade off Biases from Extrapolation and Interpolation. Working Paper, 2019.

Other masc functions: `cv_masc()`, `masc()`, `sc_estimator()`, `solve_masc()`

```
## Example: Terrorism in the Basque Region, from
## Abadie and Gardeazabal (2003).
# First, load the Synth package, which includes the dataset:
if (requireNamespace("Synth", quietly = TRUE) &&
    requireNamespace("data.table", quietly = TRUE)) {
  library(Synth)
  library(data.table)
  data(basque)
  basque <- as.data.table(basque)
  # Drop region 1 and keep only the first word of each region name:
  basque <- basque[regionno != 1, ]
  basque[, regionname := gsub(" (.*)", "", regionname)]
  # Grabbing region names (treated unit, region 17, first):
  region_names <- c(unique(basque[regionno == 17, regionname]),
                    unique(basque[regionno != 17, regionname]))
  # Build a periods-by-units matrix of GDP per capita, treated unit in column 1:
  basque <- cbind(basque[regionno == 17, gdpcap],
                  t(reshape(basque[regionno != 17, .(regionno, year, gdpcap)],
                            idvar = "regionno", timevar = "year",
                            direction = "wide")[, -"regionno", with = FALSE]))
  result <- masc(treated = basque[, 1], donors = basque[, -1], treatment = 16,
                 tune_pars_list = list(m = 1:10, min_preperiods = 8))
  names(result$weights) <- region_names[-1]
  # Weights on control units:
  print(round(result$weights, 3))
  # Treatment effects of terrorism on GDP per capita
  # in thousands of 1986 US dollars, over 1970-1975
  # (first 6 years of treatment):
  print(result$pred.error[1:6, ])
  # Selected tuning parameters?
  print(paste0("Selected matching estimator: ", result$m_hat))
  print(paste0("Selected weight on matching: ", result$phi_hat))
  # Now, examine the shape of
  #  A) the CV error (mean square prediction error in pre-period) and
  #  B) average prediction error (AKA treatment effect) over the first 5 treatment years,
  # both over values of phi, fixing the matching estimator
  # (moving from synthetic controls at phi = 0 to matching at phi = 1).
  phis <- seq(0, 1, length.out = 100)
  phi_table <- masc_by_phi(treated = basque[, 1], donors = basque[, -1], treatment = 16,
                           tune_pars = list(m = result$m_hat, min_preperiods = 8,
                                            phis = phis))
  # Printing CV error and prediction error over values of phi.
  # CV error is clearly lowest at intermediate values of phi, suggesting an
  # estimator between matching and synthetic controls does best at forecasting.
  # The average medium-run treatment effect is monotonically increasing as we
  # move away from synthetic control and toward matching.
  print(phi_table)
}
```
