fect: Fixed Effects Counterfactual Estimators

fect R Documentation

Fixed Effects Counterfactual Estimators

Description

Implements counterfactual estimators in TSCS data analysis and statistical tools to test their identification assumptions.

Usage

fect(formula = NULL, data, Y, D, X = NULL,
            W = NULL, W.est = NULL, W.agg = NULL,
            group = NULL,
            na.rm = FALSE,
            index, force = "two-way",
            time.component.from = "notyettreated", em = TRUE,
            r = 0, lambda = NULL, nlambda = 10,
            CV = NULL, k = 20, cv.prop = 0.1, cv.method = "rolling",
            cv.nobs = 3, cv.donut = 1, cv.buffer = 1, criterion = "mspe",
            binary = FALSE, QR = FALSE,
            method = "fe",  se = FALSE, vartype = "bootstrap", cl = NULL,
            quantile.CI = FALSE, nboots = 200, alpha = 0.05,
            parallel = TRUE, cores = NULL, tol = 1e-3,
            max.iteration = 1000, seed = NULL,
            min.T0 = NULL, max.missing = NULL,
            proportion = 0.3, pre.periods = NULL,
            f.threshold = 0.5, tost.threshold = NULL,
            knots = NULL, degree = 2,
            sfe = NULL, cfe = NULL,
            Z = NULL, gamma = NULL, Q = NULL, kappa = NULL, 
            Q.type = NULL,
            Q.bspline.degree = NULL,
            Z.param = NULL, Q.param = NULL,
            balance.period = NULL, fill.missing = FALSE,
            placeboTest = FALSE, placebo.period = NULL,
            carryoverTest = FALSE, carryover.period = NULL, carryover.rm = NULL,
            loo = FALSE, permute = FALSE, m = 2,
            normalize = FALSE, keep.sims = FALSE,
            cm = FALSE,
            loading.bound = "none", gamma.loading = NULL,
            gamma.loading.grid = NULL,
            cv.rule = "1se")

Arguments

formula

an object of class "formula": a symbolic description of the model to be fitted, e.g., Y ~ D + X1 + X2.

data

a data frame; can be a balanced or unbalanced panel.

Y

the outcome indicator.

D

the treatment indicator. The treatment should be binary (0 and 1).

X

time-varying covariates. Covariates that have perfect collinearity with specified fixed effects are dropped automatically.

W

a string giving the column name of a weight variable. Convenience default that populates both W.est and W.agg when those are left NULL. Suitable for survey or sample weights, where the same column applies to both the outcome-model fit and the across-treated-obs aggregation.

W.est

a string giving the column name of a weight variable that enters the outcome-model fit (the weighted least squares applied inside the IFE / MC / CFE solver). When NULL, falls back to W. Use this (with W.agg = NULL) when the weight reflects fit-side considerations and the estimand is the unweighted average ATT across treated cells.

W.agg

a string giving the column name of a weight variable that enters the across-treated-obs aggregation (att.on, est.avg, est.att). When NULL, falls back to W. Use this (with W.est = NULL) when the user's estimand differs from "ATT for the treated units in the analysis sample" — common cases include calibration weights to a target population or post-stratification weights that should adjust the summary but not the model fit.

A clean in-package solution for inverse-probability weights for confounding adjustment is under development for fect 3.0 (a cross-fit doubly-robust path); W.agg is not a substitute for that work and does not deliver the doubly-robust properties an IPW user expects.

In v2.3.1, W.est and W.agg (when both supplied) must point to the same column; truly distinct columns for fit vs. aggregation (e.g. combined survey x IPW designs) are scheduled for v2.4.0.
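
To make the aggregation side of this distinction concrete, the base-R sketch below shows only what an aggregation weight does to a summary of effects that have already been estimated. The vectors eff and w are made-up illustration values, not fect output, and the snippet does not call fect().

```r
# Hypothetical cell-level treatment effects for four treated observations
eff <- c(1, 2, 3, 6)
# Hypothetical aggregation weights (for example, post-stratification weights)
w <- c(1, 1, 1, 3)

# Unweighted summary: every treated cell counts equally
att_unweighted <- mean(eff)
# Weighted summary: cells count in proportion to w; the effects themselves
# (and hence the outcome-model fit behind them) are unchanged
att_weighted <- weighted.mean(eff, w)

print(c(unweighted = att_unweighted, weighted = att_weighted))
```

The fit-side weight (W.est) would instead change eff itself, because it enters the least-squares problem that produces the imputed counterfactuals.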

group

the group indicator. If specified, the group-wise ATT will be estimated.

na.rm

a logical flag indicating whether to list-wise delete missing observations. Defaults to FALSE. If na.rm = FALSE, observations where Y is missing but D is not are allowed. If na.rm = TRUE, observations with missing Y, D, or X are deleted list-wise.

index

a character vector specifying the unit (first element) and time (second element) indicators. For most methods, must be of length 2. For method = "cfe", additional elements (third, fourth, etc.) specify extra fixed-effect grouping variables. Every observation should be uniquely defined by the pair of the unit and time indicator.

force

a string indicating whether unit or time or both fixed effects will be imposed. Must be one of the following, "none", "unit", "time", or "two-way". The default is "two-way".

time.component.from

Controls which units provide the time-varying model components (time fixed effects, factor structure, temporal dynamics). Options are "notyettreated" (default) — all units contribute during their pre-treatment periods, or "nevertreated" — only never-treated units estimate the time components, which are then projected onto treated units.

em

a logical flag indicating whether to use the EM algorithm for missing data in the estimation sample. Default is TRUE. Setting em = FALSE requires a complete estimation sample and is only compatible with time.component.from = "nevertreated".

r

an integer specifying the number of factors. If CV = TRUE, the cross validation procedure will select the optimal number of factors from r to 5.

lambda

a single positive number or a sequence of positive numbers specifying the hyper-parameter sequence for the matrix completion method. If lambda is a sequence and CV = TRUE, cross-validation will be performed.

nlambda

an integer specifying the length of hyper-parameter sequence for matrix completion method. Default is nlambda = 10.

CV

a logical flag indicating whether cross-validation will be performed to select the optimal number of factors or hyper-parameter in matrix completion algorithm. If r is not specified, the procedure will search through r = 0 to 5.

k

an integer specifying the number of cross-validation rounds. Default is k = 20.

cv.prop

a numerical value specifying the proportion of testing set compared to sample size during the cross-validation procedure.

cv.method

a string specifying the cross-validation masking strategy. One of "rolling" (default; standard time-series rolling-window CV), "block" (random scattered anchors with contiguous-block masking), or "loo" (leave-one-out, available for fect_nevertreated). The legacy aliases "all_units" (= "block") and "treated_units" (block masking restricted to treated pre-treatment cells) are still accepted but emit a deprecation message; both will be replaced by the unified (cv.method, cv.units) API in v2.4.0.

cv.nobs

an integer specifying the length of continuous observations within a unit in the testing set. Default is cv.nobs = 3.

cv.donut

an integer specifying the length of removed observations at the head and tail of the continuous observations specified by cv.nobs. Used by block CV (cv.method = "block", or the legacy aliases "all_units"/"treated_units"). Default is 1 (matches cv.buffer for rolling CV).

cv.buffer

an integer specifying the length of past-side buffer cells masked from training (but not scored) immediately before each rolling-window holdout. Used only by cv.method = "rolling"; the future side is dropped from training by construction. Analogous to cv.donut for block CV but applied only on the past side. Default is 1.
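
The interaction of cv.nobs and cv.buffer in rolling CV can be pictured with a small base-R sketch. This is an illustration of the description above under stated assumptions, not the package's internal code: the function name rolling_roles and the argument holdout_start are hypothetical, and a single unit with a single holdout window is assumed.

```r
# Label each period of one unit's series for a single rolling-window holdout.
# rolling_roles() and holdout_start are hypothetical names for illustration.
rolling_roles <- function(n_periods, holdout_start, cv.nobs = 3, cv.buffer = 1) {
  holdout_end <- holdout_start + cv.nobs - 1
  stopifnot(holdout_start > cv.buffer, holdout_end <= n_periods)

  roles <- rep("train", n_periods)
  # Scored cells: cv.nobs consecutive periods
  roles[holdout_start:holdout_end] <- "test"
  # Past-side buffer: masked from training but not scored
  if (cv.buffer > 0) {
    roles[(holdout_start - cv.buffer):(holdout_start - 1)] <- "buffer"
  }
  # Future side: dropped from training by construction
  if (holdout_end < n_periods) {
    roles[(holdout_end + 1):n_periods] <- "dropped"
  }
  roles
}

roles <- rolling_roles(n_periods = 10, holdout_start = 6)
print(roles)
```

With the defaults, periods 1 to 4 train the model, period 5 is the buffer, periods 6 to 8 are scored, and periods 9 and 10 are left out.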

criterion

the criterion used for model selection. One of "mspe" (mean squared prediction error; the default), "gmspe" (geometric mean of squared prediction errors), "moment" (period-weighted residuals in test sets), or "pc" (an information criterion).

binary

This version doesn't support this option.

QR

This version doesn't support this option.

method

a string specifying which imputation algorithm will be used. One of "fe", "ife", "mc", "gsynth", or "cfe". Default is "fe".

se

a logical flag indicating whether uncertainty estimates will be produced.

vartype

a string specifying the type of variance estimator. Three values are supported: "bootstrap" (nonparametric cluster-bootstrap; the safe default), "jackknife" (leave-one-unit-out), and "parametric" (two-stage pseudo-treated parametric bootstrap). The "parametric" option is restricted to the gsynth-style regime: it requires time.component.from = "nevertreated", no treatment reversal, and method not in c("mc", "both"). These three conditions correspond to Gates A, B, and C in the three-gate defense system (see ARCHITECTURE.md). For all other settings, vartype = "bootstrap" is recommended.

cl

a string specifying the cluster for cluster bootstrapping.

quantile.CI

a logical flag indicating whether to use quantile confidence intervals when bootstrapping.

nboots

an integer specifying the number of bootstrap runs. Ignored if se=FALSE.

alpha

the significance level for hypothesis tests and confidence intervals. Default 0.05.

parallel

controls which operations run in parallel. Accepted values:

TRUE

Enable parallel computing for both CV and bootstrap (default in fect()).

FALSE

Disable all parallel computing.

"cv"

Enable parallel CV only; bootstrap runs serially.

"boot"

Enable parallel bootstrap only; CV runs serially.

c("cv","boot")

Explicit form of TRUE: parallel for both.

When parallel = TRUE, auto-enable thresholds apply: CV parallelism engages only when Nco * TT exceeds the per-method threshold (ife = 20000, mc = 20000, cfe = 60000). Explicit "cv" overrides the threshold.

Nested parallelism (calling fect() from within a future_lapply or foreach %dopar% block) should use parallel = FALSE to avoid deadlock.

When using parallel with method = "mc", parallel CV computes all candidate lambda values without early stopping (the serial path uses a break_check short-circuit to skip lambdas with diminishing MSPE returns). This guarantees numerical identity between serial and parallel results but may compute a few extra lambda values compared to the serial path. Use parallel = FALSE to preserve the short-circuit behavior.
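
The auto-enable rule for CV can be written out as a small decision function. This is a sketch of the rule as described above, not the package's internal logic: cv_runs_parallel is a hypothetical helper, and Nco and TT are assumed to be the number of control units and the number of time periods.

```r
# Per-method thresholds quoted in the text above
cv_thresholds <- c(ife = 20000, mc = 20000, cfe = 60000)

# Hypothetical helper: would CV run in parallel for this setting?
cv_runs_parallel <- function(parallel, method, Nco, TT) {
  # FALSE disables everything; "boot" parallelizes the bootstrap only
  if (identical(parallel, FALSE) || identical(parallel, "boot")) return(FALSE)
  # Explicit "cv" overrides the size threshold
  if (identical(parallel, "cv")) return(TRUE)
  # parallel = TRUE (or its explicit form): engage only for large problems
  Nco * TT > cv_thresholds[[method]]
}

print(cv_runs_parallel(TRUE, "ife", Nco = 100, TT = 50))   # 5000 cells
print(cv_runs_parallel(TRUE, "ife", Nco = 1000, TT = 50))  # 50000 cells
print(cv_runs_parallel("cv", "ife", Nco = 100, TT = 50))   # forced on
```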

cores

an integer indicating the number of cores for parallel computing.

tol

a positive number indicating the tolerance level for EM updates.

max.iteration

the maximal number of iterations for the EM algorithm.

seed

an integer seed for random number generation.

min.T0

an integer specifying the minimum number of pre-treatment periods for each treated unit.

max.missing

an integer specifying the maximum number of missing observations allowed per unit.

proportion

a numeric value specifying which pre-treatment periods are used for goodness-of-fit tests.

pre.periods

a vector specifying the range of pre-treatment periods used for the goodness-of-fit test.

f.threshold

a numeric threshold for an F-test in equivalence testing. Default 0.5.

tost.threshold

a numeric threshold for two-one-sided t-tests.
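
For readers unfamiliar with the two-one-sided-tests (TOST) idea behind this threshold, the generic procedure can be sketched in base R. The sketch uses a normal approximation and a hypothetical helper name, tost_pvalue; it illustrates the textbook procedure and is not claimed to match how fect computes its equivalence test.

```r
# Generic TOST p-value for H0: |effect| >= threshold (normal approximation).
# tost_pvalue() is a hypothetical helper, not part of fect.
tost_pvalue <- function(est, se, threshold) {
  stopifnot(se > 0, threshold > 0)
  # One-sided test against the lower bound: H0 is effect <= -threshold
  p_lower <- pnorm((est + threshold) / se, lower.tail = FALSE)
  # One-sided test against the upper bound: H0 is effect >= +threshold
  p_upper <- pnorm((est - threshold) / se)
  # Equivalence is declared only if both one-sided tests reject
  max(p_lower, p_upper)
}

p_centered <- tost_pvalue(est = 0.00, se = 0.1, threshold = 0.5)
p_near_bound <- tost_pvalue(est = 0.45, se = 0.1, threshold = 0.5)
print(c(centered = p_centered, near_bound = p_near_bound))
```

An estimate close to zero relative to the threshold yields a small p-value (equivalence supported), while an estimate near the bound does not.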

knots

a numeric vector specifying knots (currently unused; reserved for future use).

degree

an integer specifying the degree (currently unused; reserved for future use).

sfe

a vector specifying other fixed effects for method="cfe".

cfe

a vector of lists specifying interactive fixed effects for method="cfe".

Z

a vector specifying the time-invariant covariates for the Z matrix.

gamma

a vector specifying the time-varying covariates for the gamma matrix.

Q

a vector specifying the time-varying covariates for the Q matrix.

kappa

a vector specifying the time-invariant covariates for the kappa matrix.

Q.type

a vector specifying the type of Q matrix.

Q.bspline.degree

an integer specifying the degree used when Q.type includes "bspline" in method="cfe". If NULL, a default degree is chosen based on the number of distinct time values.

Z.param

a list specifying the parameters for the Z matrix.

Q.param

a list specifying the parameters for the Q matrix.

balance.period

a length-2 vector specifying a time range for a balanced sample.

fill.missing

a logical flag indicating whether to allow missing observations in a balanced sample.

placeboTest

a logical flag indicating whether to perform a placebo test.

placebo.period

an integer or 2-element numeric vector specifying pseudo-treatment periods.

carryoverTest

a logical flag for carryover tests.

carryover.period

an integer or 2-element numeric vector specifying pseudo-carryover periods.

carryover.rm

an integer specifying the range of post-treatment periods to treat as carryover.

loo

a logical flag for leave-one-period-out goodness-of-fit tests.

permute

a logical flag indicating whether to run a permutation test.

m

an integer specifying the block length for permutation tests. Default 2.

normalize

a logical flag indicating whether to scale outcome and covariates.

keep.sims

a logical flag indicating whether to save unit-time level bootstrap effects. Default keep.sims = FALSE. If se = FALSE, this argument is ignored.

cm

a logical flag indicating whether to enable causal moderation analysis. When TRUE, the estimator decomposes the treatment effect into effect modification and causal moderation components. Currently available for method = "fe" and method = "ife". Default is FALSE.

loading.bound

a string controlling whether treated-unit factor loadings are bounded inside the convex hull of control loadings. "none" (default) reproduces standard GSC behavior. "simplex" constrains each treated unit's loading to be a non-negative convex combination of control loadings via an entropy-regularized simplex projection, ensuring the imputed counterfactual lies pointwise in the convex hull of factor-implied control outcomes. Currently applies only to method = "ife" or method = "gsynth" (equivalent forms), and requires time.component.from = "nevertreated".

gamma.loading

scalar regularization strength for the "simplex" projection. NULL (default) triggers 5-fold cross-validation over gamma.loading.grid. A numeric value is used directly. Ignored when loading.bound = "none".

gamma.loading.grid

a numeric vector of candidate gamma.loading values for cross-validation. NULL (default) uses 10^seq(-2, 2, length.out = 9). Ignored when loading.bound = "none" or when gamma.loading is supplied.

cv.rule

a string selecting the cross-validation rule for choosing the number of factors r (or matrix-completion penalty lambda). One of:

"1se" (default)

The 1-SE rule (Breiman, Friedman, Olshen and Stone 1984; Hastie, Tibshirani and Friedman 2009, Section 7.10): pick the smallest r whose mean CV criterion is within one fold-SE of the minimum-CV-error r. Biases toward parsimony in a fold-aware way — when CV is precise, it allows larger r; when CV is noisy, it gravitates to simpler models.

"min"

Pick the r that minimizes the mean CV criterion (no tolerance).

"1pct"

Legacy pre-2.3.0 heuristic: pick the smallest r within 1% relative tolerance of the best mean CV criterion. Use this for byte-identical reproducibility of pre-2.3.0 fits.

Ignored when CV = FALSE.
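
The three rules can be compared on a toy CV curve. The sketch below is an illustration of the rules as described above, not the package's implementation: select_r is a hypothetical helper, the numbers are made up, and the candidate grid is assumed to be sorted from simplest to most complex.

```r
# Hypothetical helper applying each rule to per-candidate CV summaries.
# r_grid must be sorted in increasing order of model complexity.
select_r <- function(r_grid, cv_mean, cv_se, rule = c("1se", "min", "1pct")) {
  rule <- match.arg(rule)
  best <- which.min(cv_mean)
  threshold <- switch(rule,
    "min"  = cv_mean[best],                # no tolerance
    "1se"  = cv_mean[best] + cv_se[best],  # one fold-SE above the minimum
    "1pct" = cv_mean[best] * 1.01          # 1% relative tolerance
  )
  # Smallest candidate whose mean criterion is within the threshold
  r_grid[min(which(cv_mean <= threshold))]
}

r_grid  <- 0:5
cv_mean <- c(1.50, 1.10, 1.02, 1.00, 1.01, 1.05)  # made-up mean CV errors
cv_se   <- c(0.08, 0.06, 0.05, 0.05, 0.05, 0.06)  # made-up fold SEs

picks <- sapply(c("1se", "min", "1pct"), function(rule) {
  select_r(r_grid, cv_mean, cv_se, rule)
})
print(picks)
```

On this curve the 1-SE rule settles on a smaller r than the other two rules, because r = 2 is within one fold-SE of the minimum but not within 1% of it.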

Details

fect implements counterfactual estimators for TSCS data. It first imputes counterfactuals by fitting an outcome model using untreated observations, then estimates the individual treatment effect as the difference between observed and predicted outcomes. Finally, it computes average treatment effects on the treated (ATT) and period-specific ATTs. Placebo and equivalence tests help evaluate identification assumptions.
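
The three steps can be traced by hand for the simplest case, a two-way fixed effects outcome model. The base-R sketch below uses simulated data with a known effect of 2 and plain lm(); all object names are illustrative, and it is meant to show the logic of the estimator rather than reproduce fect's internal algorithm.

```r
set.seed(1)
N <- 30   # units
TT <- 10  # periods
panel <- expand.grid(id = 1:N, time = 1:TT)

# Hypothetical design: units 1 to 10 switch on treatment from period 6 onward
panel$D <- as.integer(panel$id <= 10 & panel$time >= 6)

alpha <- rnorm(N)   # unit fixed effects
xi <- rnorm(TT)     # time fixed effects
panel$Y <- alpha[panel$id] + xi[panel$time] + 2 * panel$D +
  rnorm(nrow(panel), sd = 0.1)

# Step 1: fit the outcome model using untreated observations only
fit <- lm(Y ~ factor(id) + factor(time), data = panel[panel$D == 0, ])

# Step 2: impute the untreated potential outcome for treated cells and take
# the difference between observed and predicted outcomes
treated <- panel[panel$D == 1, ]
treated$Y.ct <- predict(fit, newdata = treated)
treated$eff <- treated$Y - treated$Y.ct

# Step 3: average the individual effects overall and by period
att_avg <- mean(treated$eff)
att_by_period <- tapply(treated$eff, treated$time, mean)

print(att_avg)
print(att_by_period)
```

Because the treated cells never enter the fit, the estimate does not rely on a homogeneous treatment effect; here it should land close to the simulated value of 2.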

Value

Y.dat

T-by-N matrix of the outcome variable.

D.dat

T-by-N matrix of the treatment variable.

I.dat

T-by-N matrix of observation indicators (observed/missing).

Y

name of the outcome variable.

D

name of the treatment variable.

X

name of any time-varying covariates.

W

name of the weight variable.

index

name of the unit and time indicators.

force

specified fixed effects option.

T

number of time periods.

N

number of units.

p

number of time-varying observables.

r.cv

number of factors (selected by cross-validation if needed).

lambda.cv

optimal hyper-parameter for matrix completion, if applicable.

beta

coefficients for any covariates in an interactive fixed effects model.

sigma2

mean squared error.

IC

information criterion.

est

results of the fitted model.

MSPE

mean squared prediction error from cross-validation.

CV.out

results of the cross-validation procedure.

niter

number of iterations.

factor

estimated time-varying factors.

lambda

estimated loadings.

lambda.tr

estimated loadings for treated units.

lambda.co

estimated loadings for control units.

mu

estimated grand mean.

xi

estimated time fixed effects.

alpha

estimated unit fixed effects.

alpha.tr

estimated unit fixed effects for treated units.

alpha.co

estimated unit fixed effects for control units.

validX

logical indicating if valid covariates exist.

validF

logical indicating if factors exist.

id

vector of unit IDs.

rawtime

vector of time periods.

obs.missing

matrix indicating missingness patterns.

Y.ct

T-by-N matrix of predicted outcomes under no treatment.

eff

T-by-N matrix of estimated individual treatment effects.

res

residuals for observed values.

eff.pre

effects for treated units in pre-treatment periods.

eff.pre.equiv

pre-treatment effects under baseline (two-way FE) model.

pre.sd

by-period residual standard deviations for pre-treatment ATT.

att.avg

overall average treatment effect on the treated.

att.avg.W

weighted ATT.

att.avg.unit

by-unit average treatment effect on the treated.

time

time index for switch-on treatment effect.

count

count of observations for each switch-on effect time.

att

switch-on treatment effect.

att.on.W

weighted switch-on effect.

time.off

time index for switch-off treatment effect.

att.off

switch-off treatment effect.

att.off.W

weighted switch-off effect.

count.off

count for each switch-off period.

att.placebo

ATT for placebo periods.

att.carryover

ATT for carryover periods.

eff.calendar

ATT by calendar time.

eff.calendar.fit

loess-fitted ATT by calendar time.

N.calandar

number of treated observations each calendar period.

balance.avg.att

ATT for balanced sample.

balance.att

switch-on ATT for balanced sample.

balance.time

time index for balanced sample.

balance.count

count for each time in balanced sample.

balance.att.placebo

ATT for placebo period in balanced sample.

group.att

ATT for different groups.

group.output

list of switch-on treatment effects by group.

est.att.avg

inference for att.avg.

est.att.avg.unit

inference for att.avg.unit.

est.att

inference for att.

est.att.W

inference for weighted att.

est.att.off

inference for switch-off.

est.att.off.W

inference for weighted switch-off.

est.placebo

inference for placebo ATT.

est.carryover

inference for carryover ATT.

est.eff.calendar

inference for eff.calendar.

est.eff.calendar.fit

inference for eff.calendar.fit.

est.balance.att

inference for balanced sample switch-on.

est.balance.avg

inference for balanced sample average ATT.

est.balance.placebo

inference for balanced sample placebo.

est.avg.W

inference for att.avg.W.

est.beta

inference for beta.

est.group.att

inference for group-specific ATT.

est.group.output

inference for group output.

att.avg.boot

bootstrap draws for att.avg.

att.avg.unit.boot

bootstrap draws for att.avg.unit.

att.count.boot

bootstrap draws for count.

att.off.boot

bootstrap draws for att.off.

att.off.count.boot

bootstrap draws for count.off.

att.placebo.boot

bootstrap draws for att.placebo.

att.carryover.boot

bootstrap draws for att.carryover.

balance.att.boot

bootstrap draws for balance.att.

att.bound

equivalence confidence interval for pre-trend.

att.off.bound

equivalence confidence interval for switch-off.

beta.boot

bootstrap draws for beta.

test.out

F-test and equivalence test results for pre-treatment fit.

loo.test.out

leave-one-period-out test results.

permute

permutation test results.

Author(s)

Licheng Liu; Ye Wang; Yiqing Xu; Ziyi Liu

References

Athey, S., Bayati, M., Doudchenko, N., Imbens, G., and Khosravi, K. (2021). Matrix completion methods for causal panel data models. Journal of the American Statistical Association, 116(536), 1716-1730.

Bai, J. (2009). Panel data models with interactive fixed effects. Econometrica, 77(4), 1229-1279.

Liu, L., Wang, Y., and Xu, Y. (2022). A Practical Guide to Counterfactual Estimators for Causal Inference with Time-Series Cross-Sectional Data. American Journal of Political Science, 68(1), 160-176.

Xu, Y. (2017). Generalized Synthetic Control Method: Causal Inference with Interactive Fixed Effects Models. Political Analysis, 25(1), 57-76.

See Also

plot.fect, print.fect

Examples

library(fect)
data(simdata)
out <- fect(Y ~ D + X1 + X2, data = simdata,
            index = c("id","time"), force = "two-way",
            CV = TRUE, r = c(0, 5), se = FALSE, parallel = FALSE)

fect documentation built on April 30, 2026, 9:06 a.m.