cv_masc: Solve for optimal model averaging parameter and associated...


View source: R/crossvalidation.R

Description

Implements the matching and synthetic control (masc) estimator of Kellogg, Mogstad, Pouliot, and Torgovitsky (2019), conditional on a given matching estimator characterized by m. masc loops over evaluations of this function for each candidate matching estimator, and selects the one which minimizes cross-validation error.
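The selection step described above can be sketched as a small loop. This is only an illustration: `toy_cv_error` is a hypothetical stand-in for the cross-validation error that one evaluation of cv_masc would report for a given m, and the candidate set is made up.

```r
# Hypothetical stand-in for the CV error produced by one evaluation at a given m.
# It is NOT part of the masc package; it only gives the loop something to minimize.
toy_cv_error <- function(m) (m - 3)^2 + 1

candidates <- 1:5                                          # candidate matching estimators
errors <- vapply(candidates, toy_cv_error, numeric(1))     # one CV error per candidate

# Keep the candidate with the smallest cross-validation error.
m_hat <- candidates[which.min(errors)]
m_hat   # 3
```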

Usage

cv_masc(
  treated,
  donors,
  treatment = NULL,
  sc_est = sc_estimator,
  tune_pars = list(min_preperiods = NULL, set_f = NULL, m = NULL, weights_f = NULL),
  cv_pars = list(forecast.minlength = 1, forecast.maxlength = 1),
  nogurobi = FALSE,
  phival = NULL
)

Arguments

treated

A Tx1 matrix of outcomes for the treated unit.

donors

A TxN matrix of outcome paths for untreated units, each column being a control unit.

treatment

An integer. The period T' in which forecasting begins. If NULL or T'>T, then we assume all data is pre-treatment.
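As a rough illustration of the expected input shapes, the sketch below builds simulated treated and donors matrices and splits the periods around treatment. The data are invented, and `split_periods` is a hypothetical helper (not a masc function) that assumes periods 1 through T'-1 are pre-treatment.

```r
# Simulated inputs for illustration only.
set.seed(1)
n_periods <- 12   # T
n_donors  <- 5    # N

donors  <- matrix(rnorm(n_periods * n_donors), nrow = n_periods, ncol = n_donors)  # T x N
treated <- matrix(rowMeans(donors) + rnorm(n_periods, sd = 0.1), ncol = 1)         # T x 1

# Hypothetical helper: which periods are pre-treatment and which are forecast periods.
# Assumption: periods 1..(treatment - 1) are pre-treatment.
split_periods <- function(n_periods, treatment = NULL) {
  if (is.null(treatment) || treatment > n_periods) {
    # NULL or T' > T: all data are treated as pre-treatment.
    return(list(pre = seq_len(n_periods), post = integer(0)))
  }
  list(pre = seq_len(treatment - 1), post = seq(treatment, n_periods))
}

periods <- split_periods(n_periods, treatment = 10)
periods$pre    # 1 2 3 4 5 6 7 8 9
periods$post   # 10 11 12
```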

sc_est

A function which constructs weights associated with a synthetic control-type estimator. See sc_estimator for input and output if you'd prefer to substitute your own estimator.

tune_pars

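A list of tuning parameters. Three elements are described here: m, which you must specify, and min_preperiods and set_f, of which you may specify at most one. The default list in Usage also includes weights_f, the fold weights used to average the cross-validation errors (see cv.error under Value). min_preperiods and set_f describe the folds we include in the cross-validation procedure. Each fold f is denoted by the last period it uses for estimation. That is, fold f will fit estimators using data from period 1 through period f, and forecast into period f+1.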

m:

an integer representing the nearest neighbor estimator used.

min_preperiods:

an integer. The smallest number of estimation periods allowed in a fold used for cross-validation. We use all folds from fold min_preperiods up to the latest possible fold treatment-2.

set_f:

a list containing a single element, a vector of integers. Identifies the set of folds used for cross-validation. As above, each integer identifies a fold by the last time period it uses in estimation. E.g., set_f=c(7,8,9) would implement cross-validation using fold 7, fold 8, and fold 9.

If neither min_preperiods nor set_f is specified, then we set min_preperiods to ceiling(treatment/2). In other words, we pick the first cross-validation fold so that it is estimated on the first half of the pre-period data.
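The fold rules above can be sketched in a few lines of base R. `resolve_folds` is a hypothetical helper written for this illustration, not a function exported by masc, and the error messages are invented.

```r
# Hypothetical helper (not part of masc): work out which folds would be used.
resolve_folds <- function(treatment, min_preperiods = NULL, set_f = NULL) {
  if (!is.null(min_preperiods) && !is.null(set_f)) {
    stop("specify at most one of min_preperiods and set_f")
  }
  if (!is.null(set_f)) {
    # Accept either a bare vector or a one-element list holding the vector.
    return(sort(unique(as.integer(unlist(set_f)))))
  }
  if (is.null(min_preperiods)) {
    # Documented default: the first fold is estimated on half the pre-period data.
    min_preperiods <- ceiling(treatment / 2)
  }
  last_fold <- treatment - 2   # latest possible fold
  if (min_preperiods > last_fold) stop("no valid folds for these settings")
  seq(min_preperiods, last_fold)
}

resolve_folds(treatment = 10)                          # 5 6 7 8
resolve_folds(treatment = 10, min_preperiods = 7)      # 7 8
resolve_folds(treatment = 10, set_f = list(c(7, 8)))   # 7 8
```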

cv_pars

A list containing 2 integer elements, forecast.minlength and forecast.maxlength. Cross-validation fold f will forecast into periods f+forecast.minlength and up to period f+forecast.maxlength or the treatment period (whichever comes first). If f+forecast.minlength lies in the treatment interval for one of the folds f given by the user, then masc returns an error.
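A minimal sketch of the forecast window for one fold follows. `forecast_periods` is a hypothetical helper, not a masc function. It assumes that forecasts are capped at the last pre-treatment period, treatment-1, which is consistent with the latest possible fold being treatment-2 when forecast.minlength is 1.

```r
# Hypothetical helper (not part of masc): periods that fold f forecasts into.
# Assumption: the forecast window is capped at treatment - 1.
forecast_periods <- function(f, treatment, minlength = 1, maxlength = 1) {
  stopifnot(maxlength >= minlength)
  first <- f + minlength
  if (first >= treatment) {
    stop("fold ", f, " would forecast into the treatment interval")
  }
  last <- min(f + maxlength, treatment - 1)
  seq(first, last)
}

forecast_periods(f = 5, treatment = 10, minlength = 1, maxlength = 3)   # 6 7 8
forecast_periods(f = 8, treatment = 10, minlength = 1, maxlength = 3)   # 9
# forecast_periods(f = 9, treatment = 10) stops with an error.
```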

nogurobi

A logical value. If TRUE, uses LowRankQP rather than Gurobi to solve the synthetic control estimator.

phival

A real value between 0 and 1. If specified, hard-codes the masc estimator to take the specified weighted average of matching and synthetic controls, where phival indicates the weight on matching (1-phival being the weight on synthetic controls).
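The weighted average that phival fixes can be written out directly. In the sketch below, the weight vectors are made up, `combine_weights` is a hypothetical helper, and the matching weights assume equal weight 1/m on each of the m nearest neighbors.

```r
# Illustrative weights over N = 4 donors (invented numbers).
w_match <- c(0.5, 0.5, 0, 0)       # assumption: m = 2 neighbors, equal weights
w_sc    <- c(0.1, 0.2, 0.3, 0.4)   # synthetic control weights, summing to 1

# Hypothetical helper: phi is the weight on matching, 1 - phi the weight on synthetic control.
combine_weights <- function(phi, w_match, w_sc) {
  stopifnot(phi >= 0, phi <= 1, length(w_match) == length(w_sc))
  phi * w_match + (1 - phi) * w_sc
}

w <- combine_weights(0.25, w_match, w_sc)
w        # 0.200 0.275 0.225 0.300
sum(w)   # 1
```

Setting phi to 1 returns the matching weights unchanged, and setting it to 0 returns the synthetic control weights.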

Value

Returns a list containing six objects:

phi_hat:

selected value for the model averaging parameter (1 is pure matching, 0 pure synthetic control).

m_hat:

selected matching estimator (number of nearest neighbors).

weights:

A vector of length N containing the weights placed on each control unit.

pred.error:

The vector of treatment effects implied by the masc counterfactual, for periods T' to T.

cv.error:

The average (weighted by weights_f) of the cross-validation errors generated by each fold.

cv.error.byfold:

The cross-validation error generated by each fold.
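To show how the returned pieces relate to one another, the sketch below rebuilds quantities shaped like pred.error and cv.error from invented inputs. It assumes that the treatment effect is the treated outcome minus the weighted donor outcome, and that the fold weights are normalized by their sum. Neither detail is confirmed by this page.

```r
# Invented inputs: 3 post-treatment periods (T' to T) and N = 4 donors.
set.seed(2)
donors_post  <- matrix(rnorm(3 * 4), nrow = 3)   # 3 x 4
treated_post <- matrix(rnorm(3), ncol = 1)       # 3 x 1
weights      <- c(0.2, 0.275, 0.225, 0.3)        # one weight per donor

# Counterfactual path for the treated unit, and the implied treatment effects.
# Assumption: effect = observed outcome minus counterfactual.
counterfactual <- donors_post %*% weights
pred_error     <- treated_post - counterfactual

# Weighted average of the per-fold errors (numbers invented).
# Assumption: weights_f are normalized by their sum.
cv_error_byfold <- c(0.8, 0.5, 0.3)
weights_f       <- c(1, 1, 2)
cv_error <- weighted.mean(cv_error_byfold, weights_f)
cv_error   # 0.475
```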

References

Kellogg, M., M. Mogstad, G. Pouliot, and A. Torgovitsky. Combining Matching and Synthetic Control to Trade off Biases from Extrapolation and Interpolation. Working Paper, 2019.

See Also

Other masc functions: masc_by_phi(), masc(), sc_estimator(), solve_masc()


maxkllgg/masc documentation built on Sept. 1, 2020, 5:35 p.m.