fect: Fixed Effects Counterfactual Estimators

fect    R Documentation

Description

Implements counterfactual estimators for time-series cross-sectional (TSCS) data analysis, along with statistical tools to test their identification assumptions.

Usage

fect(formula = NULL, data, Y, D, X = NULL, group = NULL,
            na.rm = FALSE, 
            index, force = "two-way", r = 0, lambda = NULL, nlambda = 10,
            CV = NULL, k = 10, cv.prop = 0.1, cv.treat = FALSE, 
            cv.nobs = 3, cv.donut = 0, criterion = "mspe",
            binary = FALSE, QR = FALSE,
            method = "fe",  
            se = FALSE, vartype = "bootstrap", nboots = 200, alpha = 0.05,
            parallel = TRUE, cores = NULL, tol = 0.001, seed = NULL, 
            min.T0 = NULL, max.missing = NULL, 
            proportion = 0.3, pre.periods = NULL, 
            f.threshold = 0.5, tost.threshold = NULL,
            knots = NULL, degree = 2, 
            sfe = NULL, cfe = NULL,
            balance.period = NULL, fill.missing = FALSE,
            placeboTest = FALSE, placebo.period = NULL,
            carryoverTest = FALSE, carryover.period = NULL, carryover.rm = NULL,
            loo = FALSE, permute = FALSE, m = 2, normalize = FALSE)  

Arguments

formula

an object of class "formula": a symbolic description of the model to be fitted, e.g., Y ~ D + X1 + X2.

data

a data frame; can be a balanced or unbalanced panel.

Y

the outcome indicator.

D

the treatment indicator. The treatment should be binary (0 and 1).

X

time-varying covariates. Covariates that have perfect collinearity with specified fixed effects are dropped automatically.

group

the group indicator. If specified, the group-wise ATT will be estimated.

na.rm

a logical flag indicating whether to list-wise delete missing observations. Defaults to FALSE. If na.rm = FALSE, observations for which Y is missing but D is not missing are allowed. If na.rm = TRUE, observations whose Y, D, or X is missing are list-wise deleted.

index

a two-element string vector specifying the unit and time indicators. Must be of length 2. Every observation should be uniquely defined by the pair of the unit and time indicator.
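The uniqueness requirement can be checked before fitting. The following is a minimal base R sketch, assuming a hypothetical data frame `panel` whose unit and time columns are named `id` and `time`; these names and the toy values are illustrative only and are not part of the package.

```r
# Hypothetical toy panel; `id` and `time` are illustrative column names.
panel <- data.frame(
  id   = c(1, 1, 2, 2, 2),
  time = c(1, 2, 1, 2, 2),   # unit 2 appears twice at time 2
  Y    = c(0.3, 0.5, 1.1, 0.9, 1.0)
)
index <- c("id", "time")

# duplicated() flags rows whose (unit, time) pair already occurred earlier.
dup_rows <- duplicated(panel[, index])
n_dup    <- sum(dup_rows)

# One possible remedy (an assumption, not package behavior):
# keep the first occurrence of each (unit, time) pair.
panel_unique <- panel[!dup_rows, ]
```

Whether to drop, average, or otherwise resolve duplicated pairs depends on the data; the sketch only shows how to detect them.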

force

a string indicating whether unit or time or both fixed effects will be imposed. Must be one of the following, "none", "unit", "time", or "two-way". The default is "two-way".

r

an integer specifying the number of factors. If CV = TRUE, the cross-validation procedure will select the optimal number of factors from r to 5.

lambda

a single positive number or a sequence of positive numbers specifying the hyper-parameter sequence for the matrix completion method. If lambda is a sequence and CV = TRUE, cross-validation will be performed.

nlambda

an integer specifying the length of hyper-parameter sequence for matrix completion method. Default is nlambda = 10.

CV

a logical flag indicating whether cross-validation will be performed to select the optimal number of factors or hyper-parameter in matrix completion algorithm. If r is not specified, the procedure will search through r = 0 to 5.

k

an integer specifying number of cross-validation rounds. Default is k = 10.

cv.prop

a numerical value specifying the proportion of testing set compared to sample size during the cross-validation procedure.

cv.treat

a logical flag specifying whether to use only observations of treated units as the testing set.

cv.nobs

an integer specifying the length of continuous observations within a unit in the testing set. Default is cv.nobs = 3.

cv.donut

an integer specifying the number of observations removed at the head and tail of the continuous observations specified by cv.nobs. These removed observations are used neither to fit the model nor in the validation set for cross-validation; e.g., if cv.nobs = 3 and cv.donut = 1, the first and the last observation in each triplet will not be included in the test set. Default is cv.donut = 0.
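One reading of how cv.nobs and cv.donut interact can be sketched in base R. The helper `block_roles` below is hypothetical and not a fect function; it labels each position in one block of consecutive observations as either removed ("donut") or held out ("test"). The requirement that at least one test observation remains is an assumption of this sketch.

```r
# Hypothetical helper (not part of fect): label the positions in one
# block of cv.nobs consecutive observations.
block_roles <- function(cv.nobs, cv.donut) {
  # Assumption of this sketch: the donut must leave at least one test point.
  stopifnot(cv.nobs >= 1, cv.donut >= 0, 2 * cv.donut < cv.nobs)
  roles <- rep("test", cv.nobs)
  if (cv.donut > 0) {
    roles[seq_len(cv.donut)] <- "donut"                   # head of the block
    roles[(cv.nobs - cv.donut + 1):cv.nobs] <- "donut"    # tail of the block
  }
  roles
}

roles_3_1 <- block_roles(cv.nobs = 3, cv.donut = 1)
roles_3_0 <- block_roles(cv.nobs = 3, cv.donut = 0)
```

With cv.nobs = 3 and cv.donut = 1, only the middle observation of each triplet is labeled "test", matching the description above.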

criterion

criterion used for model selection. Default is "mspe". Options are: "mspe" for the mean squared prediction error; "gmspe" for the geometric mean of the squared prediction errors; "moment", which averages the residuals in the test sets by their period relative to treatment and then averages the squares of these period-wise deviations, weighted by the number of observations in each period (this favors a better pre-trend fit on the test sets rather than better predictive ability); and "pc" for the information criterion of the interactive fixed effects or generalized synthetic control model.
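The first three criteria can be illustrated on a toy vector of held-out residuals. This base R sketch follows the verbal definitions above and is not taken from the package source; the variable names and values are made up, and the "pc" criterion is not covered.

```r
# Toy held-out residuals (observed minus predicted) and the period of each
# observation relative to treatment; values are illustrative only.
resid      <- c(1, -1, 2, 4)
rel_period <- c(-2, -2, -1, -1)

# "mspe": mean of the squared prediction errors.
mspe <- mean(resid^2)

# "gmspe": geometric mean of the squared prediction errors.
# (Undefined in this form if any residual is exactly zero.)
gmspe <- exp(mean(log(resid^2)))

# "moment": average residuals within each relative period, square the
# period-wise averages, then weight by the number of observations per period.
period_mean <- tapply(resid, rel_period, mean)
period_n    <- tapply(resid, rel_period, length)
moment      <- sum(period_n * period_mean^2) / sum(period_n)
```

In this toy case the residuals at relative period -2 cancel out, so that period contributes nothing to `moment` even though it contributes to `mspe`; this is the sense in which the moment criterion targets systematic pre-trend deviations.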

binary

Not supported in this version. A logical flag indicating whether a probit link function will be used.

QR

Not supported in this version. A logical flag indicating whether QR decomposition will be used for factor analysis in the probit model.

method

a string specifying which imputation algorithm will be used: "fe" for the fixed effects model, "ife" for the interactive fixed effects model, "mc" for the matrix completion method, "polynomial" for polynomial trend terms, "bspline" for regression splines, "gsynth" for the generalized synthetic control method, and "cfe" for the complex fixed effects method. Default is method = "fe".

se

a logical flag indicating whether uncertainty estimates will be produced.

vartype

a string specifying the type of variance estimator. Choose from vartype = c("bootstrap", "jackknife", "parametric"). Default value is "bootstrap".

nboots

an integer specifying the number of bootstrap runs. Ignored if se = FALSE.

alpha

significance level for hypothesis tests and confidence intervals. Default value is alpha = 0.05.

parallel

a logical flag indicating whether parallel computing will be used in bootstrapping and/or cross-validation. Ignored if se = FALSE.

cores

an integer indicating the number of cores to be used in parallel computing. If not specified, the algorithm will use the maximum number of logical cores of your computer (warning: this could prevent you from multi-tasking on your computer).

tol

a positive number indicating the tolerance level.

seed

an integer that sets the seed in random number generation. Ignored if se = FALSE and r is specified.

min.T0

an integer specifying the minimum value of observed periods that a unit is under control.

max.missing

an integer. Units with more missing values than this number will be removed. Ignored if set to NULL (i.e., max.missing = NULL, the default setting).

proportion

a numeric value; pre-treatment periods whose number of observations exceeds this proportion of the number of observations at period 0 are used for the goodness-of-fit test. Ignored if se = FALSE. Default is proportion = 0.3.

pre.periods

a vector specifying the range of pre-treatment periods used for the goodness-of-fit test. If left blank, all pre-treatment periods specified by proportion will be used. Ignored if se = FALSE.

f.threshold

a numeric value specifying the threshold for the F-statistic in the equivalence test. Ignored if se = FALSE. Default is f.threshold = 0.5.

tost.threshold

a numeric value specifying the threshold for the two one-sided t-tests (TOST). If alpha = 0.05, TOST checks whether the 90% confidence interval ... The default value is 0.36 times the standard deviation of the outcome variable after two-way fixed effects are partialed out.
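For orientation, a generic TOST equivalence check can be sketched in base R under a normal approximation. This is the textbook procedure, not necessarily the computation fect performs internally; the estimate, standard error, and threshold below are hypothetical values chosen for illustration.

```r
# Hypothetical pre-treatment ATT estimate and standard error.
att_hat   <- 0.10
se_hat    <- 0.05
alpha     <- 0.05
threshold <- 0.36   # illustrative; corresponds to a residualized outcome sd of 1

# Two one-sided tests at level alpha are equivalent to checking whether the
# (1 - 2 * alpha) confidence interval lies inside [-threshold, threshold].
z  <- qnorm(1 - alpha)
ci <- c(lower = att_hat - z * se_hat, upper = att_hat + z * se_hat)

equivalent <- unname(ci["lower"] > -threshold && ci["upper"] < threshold)
```

Here the 90% interval lies well inside the equivalence range, so the toy estimate would be judged equivalent to zero at the chosen threshold.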

knots

a numeric vector specifying the knots for the b-spline curve trend term.

degree

an integer specifying the order of either the b-spline or the polynomial trend term.

sfe

a vector specifying other fixed effects, in addition to unit or time fixed effects, that are used when method = "cfe".

cfe

a vector of lists specifying interactive fixed effects when method="cfe". For each list, the value of the first element is the name of the group variable for which fixed effects are to be estimated. The value of the second element is the name of a regressor (e.g., a time trend).

balance.period

a vector of length 2 specifying the range of periods for a balanced sample which has no missing observation in the specified range.

fill.missing

a logical flag indicating whether to allow missing observations in this balanced sample. The default is FALSE.

placeboTest

a logical flag indicating whether to perform a placebo test.

placebo.period

an integer or a two-element numeric vector specifying the range of pre-treatment periods that will be assigned as pseudo treatment periods.

carryoverTest

a logical flag indicating whether to perform a (no) carryover test.

carryover.period

an integer or a two-element numeric vector specifying the range of post-treatment periods that will be assigned as pseudo treatment periods.

carryover.rm

an integer specifying the range of post-treatment periods that will be assigned as pseudo treatment periods.

loo

a logical flag indicating whether to perform the leave-one-period-out goodness-of-fit test, which is very time-consuming.

permute

a logical flag indicating whether to perform a permutation test.

m

an integer specifying the block length in the permutation test. Default value is m = 2.

normalize

a logical flag indicating whether to scale the outcome and covariates. Useful for speeding up computation when the magnitude of the data is large. The default is normalize = FALSE.

Details

fect implements counterfactual estimators for TSCS data analysis. These estimators first impute counterfactuals for each treated observation in a TSCS dataset by fitting an outcome model (fixed effects model, interactive fixed effects model, or matrix completion) using the untreated observations. They then estimate the individual treatment effect for each treated observation by subtracting the predicted counterfactual outcome from its observed outcome. Finally, the average treatment effect on the treated (ATT) or period-specific ATTs are calculated. A placebo test and an equivalence test are included to evaluate the validity of the identification assumptions behind these estimators. The treatment must be dichotomous.
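The three steps described above can be sketched in base R for the simplest case, a two-way fixed effects outcome model fitted with lm(). This is an illustration of the idea on simulated data, not the package's implementation; all object names and the data-generating process are assumptions made for the example.

```r
set.seed(1)

# Simulated panel: 20 units, 10 periods; units 1-5 are treated from period 6.
N <- 20; TT <- 10
panel    <- expand.grid(id = 1:N, time = 1:TT)
unit_fe  <- rnorm(N)
time_fe  <- rnorm(TT)
true_att <- 2
panel$D  <- as.integer(panel$id <= 5 & panel$time >= 6)
panel$Y  <- unit_fe[panel$id] + time_fe[panel$time] +
            true_att * panel$D + rnorm(nrow(panel), sd = 0.1)

# Step 1: fit the outcome model on untreated observations only.
fit <- lm(Y ~ factor(id) + factor(time), data = panel[panel$D == 0, ])

# Step 2: impute the counterfactual Y(0) for each treated observation.
treated <- panel[panel$D == 1, ]
y0_hat  <- predict(fit, newdata = treated)

# Step 3: individual effects, then the overall and period-specific ATTs.
eff           <- treated$Y - y0_hat
att_avg       <- mean(eff)
att_by_period <- tapply(eff, treated$time, mean)
```

Because every treated unit has untreated pre-treatment periods and every period contains control units, all unit and time effects are estimable from the untreated observations alone, which is what allows the prediction in step 2.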

Value

Y.dat

a T-by-N matrix storing data of the outcome variable.

D.dat

a T-by-N matrix storing data of the treatment variable.

I.dat

a T-by-N matrix storing the indicator for whether each observation is observed or missing.

Y

name of the outcome variable.

D

name of the treatment variable.

X

name of the time-varying control variables.

index

name of the unit and time indicators.

force

user-specified force option.

T

the number of time periods.

N

the total number of units.

p

the number of time-varying observables.

r.cv

the number of factors included in the model – either supplied by users or automatically chosen via cross-validation.

lambda.cv

the optimal hyper-parameter in matrix completion method chosen via cross-validation.

beta

coefficients of time-varying observables from the interactive fixed effect model.

sigma2

the mean squared error of interactive fixed effect model.

IC

the information criterion.

est

result of the interactive fixed effect model based on observed values.

MSPE

mean squared prediction error of the cross-validated model.

CV.out

result of the cross-validation procedure.

niter

the number of iterations in the estimation of the interactive fixed effect model.

factor

estimated time-varying factors.

lambda

estimated loadings.

lambda.tr

estimated loadings for treated units.

lambda.co

estimated loadings for control units.

mu

estimated grand mean.

xi

estimated time fixed effects.

alpha

estimated unit fixed effects.

alpha.tr

estimated unit fixed effects for treated units.

alpha.co

estimated unit fixed effects for control units.

validX

a logical value indicating whether multicollinearity exists.

validF

a logical value indicating whether factors exist.

id

a vector of unit IDs.

rawtime

a vector of time periods.

obs.missing

a matrix storing the status of each unit at each time point.

Y.ct

a T-by-N matrix storing the predicted Y(0).

eff

a T-by-N matrix storing the difference between actual outcome and predicted Y(0).

res

residuals for observed values.

eff.pre

difference between actual outcome and predicted Y(0) for observations of treated units under control.

eff.pre.equiv

difference between actual outcome and predicted Y(0) for observations of treated units under control based on baseline (two-way fixed effects) model.

pre.sd

by period residual standard deviation for estimated pre-treatment average treatment effects.

att.avg

average treatment effect on the treated.

att.avg.unit

by unit average treatment effect on the treated.

time

term for switch-on treatment effect.

count

count of each term for switch-on treatment effect.

att

switch-on treatment effect.

time.off

term for switch-off treatment effect.

att.off

switch-off treatment effect.

count.off

count of each term for switch-off treatment effect.

att.placebo

average treatment effect for placebo period.

att.carryover

average treatment effect for carryover period.

eff.calendar

average treatment effect for each calendar period.

eff.calendar.fit

loess fitted values of average treatment effect for each calendar period.

N.calandar

number of treated observations at each calendar period.

balance.avg.att

average treatment effect for the balance sample.

balance.att

switch-on treatment effect for the balance sample.

balance.time

term of switch-on treatment effect for the balance sample.

balance.count

count of each term for switch-on treatment effect for the balance sample.

balance.att.placebo

average treatment effect for placebo period of the balance sample.

group.att

average treatment effect for different groups.

group.output

a list saving the switch-on treatment effects for different groups.

est.att.avg

inference for att.avg.

est.att.avg.unit

inference for att.avg.unit.

est.att

inference for att.on.

est.att.off

inference for att.off.

est.placebo

inference for att.placebo.

est.carryover

inference for att.carryover.

est.eff.calendar

inference for eff.calendar.

est.eff.calendar.fit

inference for eff.calendar.fit.

est.balance.att

inference for balance.att.

est.balance.avg

inference for balance.avg.att.

est.balance.placebo

inference for balance.att.placebo.

est.beta

inference for beta.

est.group.att

inference for group.att.

est.group.output

inference for group.output.

att.avg.boot

bootstrap results for att.avg.

att.avg.unit.boot

bootstrap results for att.avg.unit.

att.count.boot

bootstrap results for count.

att.off.boot

bootstrap results for att.avg.off.

att.off.count.boot

bootstrap results for count.off.

att.placebo.boot

bootstrap results for att.placebo.

att.carryover.boot

bootstrap results for att.carryover.

balance.att.boot

bootstrap results for balance.att.

att.bound

equivalence confidence interval for equivalence test.

att.off.bound

equivalence confidence interval for equivalence test for switch-off effect.

beta.boot

bootstrap results for beta.

test.out

goodness-of-fit test and equivalence test results for the pre-treatment fitting check.

loo.test.out

leave-one-period-out goodness-of-fit test and equivalence test results for the pre-treatment fitting check.

permute

permutation test results for sharp null hypothesis.

Author(s)

Licheng Liu; Ye Wang; Yiqing Xu; Ziyi Liu

References

Jushan Bai. 2009. "Panel Data Models with Interactive Fixed Effects." Econometrica.

Yiqing Xu. 2017. "Generalized Synthetic Control Method: Causal Inference with Interactive Fixed Effects Models." Political Analysis.

Susan Athey, et al. 2021. "Matrix Completion Methods for Causal Panel Data Models." Journal of the American Statistical Association.

Licheng Liu, et al. 2022. "A Practical Guide to Counterfactual Estimators for Causal Inference with Time-Series Cross-Sectional Data." American Journal of Political Science.

For more details about the matrix completion method, see https://github.com/susanathey/MCPanel.

See Also

plot.fect and print.fect

Examples

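# Load the package and its bundled example data
library(fect)
data(fect)
# Fit with two-way fixed effects; se = FALSE skips uncertainty estimates
out <- fect(Y ~ D + X1 + X2, data = simdata1,
            index = c("id", "time"), force = "two-way",
            CV = TRUE, r = c(0, 5), se = FALSE, parallel = FALSE)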

fect documentation built on Oct. 14, 2022, 5:06 p.m.