simdata: Simulated panel data with two latent factors
In fect: Fixed Effects Counterfactual Estimators

simdata

R Documentation

Description

A simulated panel dataset with continuous outcomes used throughout the package vignettes to demonstrate factor-augmented counterfactual estimators. The data-generating process follows Liu, Wang, and Xu (2024) with one modification (see Format).

The panel has N = 200 units and T = 35 time periods. Treatment switches on and off over time (99 of 150 treated units experience at least one reversal), reflecting a general treatment pattern rather than simple staggered adoption. The outcome includes two latent factors (r = 2), so the parallel-trends assumption is violated and the standard fixed-effects estimator is biased. Treatment assignment loads on the same factors and fixed effects that enter the outcome (units with larger \lambda_i and \alpha_i are more likely to be treated), so the confounding is structural and cannot be removed by two-way fixed effects alone.
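To make the notion of a reversal concrete, the sketch below counts units whose treatment indicator drops from 1 to 0 at least once. It is a minimal base-R illustration on a toy panel, not code from the package: the helper name has_reversal and the toy data are hypothetical, and it assumes that a reversal means a 1-to-0 switch between consecutive periods.

```r
# Toy panel: 3 units, 5 periods (hypothetical data, not simdata)
toy <- data.frame(
  id   = rep(1:3, each = 5),
  time = rep(1:5, times = 3),
  D    = c(0, 1, 1, 0, 1,   # unit 1: switches on, off, then on again
           0, 0, 1, 1, 1,   # unit 2: staggered adoption, no reversal
           0, 0, 0, 0, 0)   # unit 3: never treated
)

# Hypothetical helper: TRUE if D ever drops from 1 to 0 over time
has_reversal <- function(d) any(diff(d) < 0)

# Sort by unit and time so that diff() compares consecutive periods
toy <- toy[order(toy$id, toy$time), ]

# Per-unit flags: ever treated, and treated with at least one reversal
ever_treated <- tapply(toy$D, toy$id, function(d) any(d == 1))
reversed     <- tapply(toy$D, toy$id, has_reversal)

n_treated  <- sum(ever_treated)
n_reversed <- sum(reversed)
```

Applied to the id, time, and D columns of simdata, the same two counts would be expected to match the 150 treated units and 99 reversals reported above, provided this definition of a reversal matches the one used to produce those figures.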

Format

A data frame with the following columns:

id: unit identifier (1–200)
time: time period (1–35)
Y: observed outcome
error: idiosyncratic error \varepsilon_{it} \sim N(0, 2)
eff: realized treatment effect \tau_{it}
tr_cum, tr_prob: treatment-probability constructions
D: treatment indicator
X1, X2: observed time-varying covariates, each \sim N(0, 1), entering the outcome with coefficients 1 and 3, respectively
alpha: unit fixed effect \alpha_i \sim N(0, 1)
xi: time fixed effect \xi_t (AR(1) with drift)
F1, F2: latent time factors f_t \in \mathbb{R}^2 (one trending, one white noise)
L1, L2: unit-specific factor loadings \lambda_i \sim N(0.5, 1)
FL1, FL2: per-cell factor-loading products \lambda_{i,k} \cdot f_{t,k} (k = 1, 2)

The DGP is

Y_{it} = \tau_{it} D_{it} + X_{1,it} + 3 X_{2,it} + \mu + 3\alpha_i + \xi_t + 2\, \lambda_i' f_t + \varepsilon_{it},

with grand mean \mu = 5 and treatment effect \tau_{it} \sim N(0.4 \cdot \mathrm{tr\_cum}_{it}/T,\; 0.2).
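The mapping from the columns listed above to the outcome equation can be sketched in base R on a tiny panel. This is an illustration under stated assumptions, not the script that generated simdata: the AR(1) and trend parameters are hypothetical, the second argument of N(., .) is treated as a variance, and treatment is switched off everywhere because the assignment rule is not specified in enough detail here to reproduce.

```r
set.seed(1)
N  <- 4   # units (simdata has 200)
TT <- 6   # periods (simdata has 35)
mu <- 5   # grand mean, as in the DGP

# One row per unit-period, with time varying fastest
panel <- expand.grid(time = 1:TT, id = 1:N)

# Unit and time effects; AR(1) drift 0.2 and coefficient 0.5 are hypothetical
alpha_i <- rnorm(N)
xi_t    <- as.numeric(stats::filter(0.2 + rnorm(TT), 0.5, method = "recursive"))

# Two factors (one trending, one white noise) and loadings ~ N(0.5, 1)
Fmat <- cbind((1:TT) / TT + rnorm(TT, sd = 0.1), rnorm(TT))
Lmat <- matrix(rnorm(2 * N, mean = 0.5), nrow = N, ncol = 2)

panel$alpha <- alpha_i[panel$id]
panel$xi    <- xi_t[panel$time]
panel$X1    <- rnorm(N * TT)
panel$X2    <- rnorm(N * TT)
panel$FL1   <- Lmat[panel$id, 1] * Fmat[panel$time, 1]  # lambda_{i,1} * f_{t,1}
panel$FL2   <- Lmat[panel$id, 2] * Fmat[panel$time, 2]  # lambda_{i,2} * f_{t,2}
panel$error <- rnorm(N * TT, sd = sqrt(2))              # assumes variance 2
panel$D     <- 0                                        # no treatment in this sketch
panel$eff   <- 0

# Outcome equation, term by term
panel$Y <- with(panel,
  eff * D + X1 + 3 * X2 + mu + 3 * alpha + xi + 2 * (FL1 + FL2) + error
)
```

The sum FL1 + FL2 plays the role of \lambda_i' f_t, so the factor contribution enters the outcome with the coefficient 2 shown in the equation.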

The 2\, \lambda_i' f_t term doubles the latent factor contribution relative to the original Liu, Wang, and Xu (2024) DGP. The doubling strengthens the factor signal-to-noise ratio (variance of the factor contribution to variance of the residual) from approximately 2.7 to 10.9, which makes the factor structure clearly recoverable by cross-validated rank-selection procedures on this dataset. The unmodified DGP is preserved in earlier package versions; see git log data/simdata.rda for the prior file.
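The two ratios differ by a factor of about four, which is what doubling a term implies for its variance. As a quick check, write c_{it} = \lambda_i' f_t for the factor contribution and u_{it} for the residual (both symbols are introduced here for illustration only), and assume the residual variance is the same in both versions of the DGP:

```latex
\frac{\operatorname{Var}(2\, c_{it})}{\operatorname{Var}(u_{it})}
  = 4 \cdot \frac{\operatorname{Var}(c_{it})}{\operatorname{Var}(u_{it})}
  \approx 4 \times 2.7 = 10.8
```

This agrees with the reported 10.9 up to rounding of the original ratio.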

References

Liu, L., Wang, Y., and Xu, Y. (2024). A Practical Guide to Counterfactual Estimators for Causal Inference with Time-Series Cross-Sectional Data. American Journal of Political Science, 68(1), 160–176.

fect documentation built on May 31, 2026, 1:06 a.m.