simulate_data_factor: Simulate data according to factor model
In pensynth: Penalized Synthetic Control Estimation

simulate_data_factor

R Documentation

Simulate data according to factor model

Description

This function simulates data according to a latent factor model:

Simulate time-varying latent factors, which are the same for all units
Simulate time-invariant factor loadings, separately for each donor unit
Create sparse unit weights for each treated unit
Compute the loadings for the treated units as donor-unit loadings * weights
Simulate observed outcome time-series as factors * loadings + error
Repeat this for each covariate, using the same loadings. If covar_means is TRUE, average each covariate across the pre-intervention timepoints.
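The steps above can be sketched in base R as follows. This is an illustrative re-implementation, not the pensynth source code. Several details are assumptions: the innovation SD is scaled so that the AR(1) factors have marginal SD sd_factors, the nonzero weights are positive and sum to 1, and the covariate step is left out for brevity.

```r
# Illustrative sketch only; details marked "assumption" are not taken from pensynth.
set.seed(1)
N_donor <- 50; N_treated <- 1; N_nonzero <- 4
N_pre <- 12; N_post <- 6; N_factors <- 3
sd_factors <- sqrt(2) / 2; ar1_factors <- 0.8
sd_loadings <- sqrt(2) / 2; sd_errors <- 0.5
treatment_effect <- 1
N_time <- N_pre + N_post

# 1. Time-varying AR(1) factors shared by all units (N_time x N_factors).
#    Assumption: innovations are scaled so the marginal SD equals sd_factors.
sd_innov <- sd_factors * sqrt(1 - ar1_factors^2)
factors <- matrix(0, N_time, N_factors)
factors[1, ] <- rnorm(N_factors, sd = sd_factors)
for (tt in 2:N_time) {
  factors[tt, ] <- ar1_factors * factors[tt - 1, ] + rnorm(N_factors, sd = sd_innov)
}

# 2. Time-invariant loadings, one column per donor unit (N_factors x N_donor).
loadings_donor <- matrix(rnorm(N_factors * N_donor, sd = sd_loadings),
                         N_factors, N_donor)

# 3. Sparse unit weights (N_donor x N_treated).
#    Assumption: N_nonzero positive weights per treated unit, summing to 1.
W <- matrix(0, N_donor, N_treated)
for (j in seq_len(N_treated)) {
  idx <- sample(N_donor, N_nonzero)
  w <- runif(N_nonzero)
  W[idx, j] <- w / sum(w)
}

# 4. Treated-unit loadings as weighted combinations of donor loadings.
loadings_treated <- loadings_donor %*% W

# 5. Observed outcomes: factors * loadings + independent error.
out_donor <- factors %*% loadings_donor +
  matrix(rnorm(N_time * N_donor, sd = sd_errors), N_time, N_donor)
out_treated <- factors %*% loadings_treated +
  matrix(rnorm(N_time * N_treated, sd = sd_errors), N_time, N_treated)

# Split into pre- and post-intervention blocks; add the effect after treatment.
pre <- seq_len(N_pre)
post <- N_pre + seq_len(N_post)
Z0 <- out_donor[pre, , drop = FALSE]
Y0 <- out_donor[post, , drop = FALSE]
Z1 <- out_treated[pre, , drop = FALSE]
Y1 <- out_treated[post, , drop = FALSE] + treatment_effect
```

Because the treated loadings are exact weighted combinations of the donor loadings, the weighted donor outcomes track the treated outcomes up to noise before the intervention, which is what makes W recoverable in principle.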

Usage

simulate_data_factor(
  N_donor = 50,
  N_treated = 1,
  N_nonzero = 4,
  N_covar = 5,
  N_pre = 12,
  N_post = 6,
  N_factors = 3,
  treatment_effect = 1,
  sd_factors = sqrt(2)/2,
  ar1_factors = 0.8,
  sd_loadings = sqrt(2)/2,
  sd_errors = 0.5,
  covar_means = TRUE
)

Arguments

`N_donor`	number of donors
`N_treated`	number of treated units
`N_nonzero`	number of true nonzero weights
`N_covar`	number of covariates
`N_pre`	number of pre-intervention timepoints
`N_post`	number of post-intervention timepoints
`N_factors`	number of latent factors to simulate
`treatment_effect`	the size of the true treatment effect
`sd_factors`	the standard deviation of the (unit-invariant, time-varying) factors
`ar1_factors`	the lag-1 autoregressive coefficient of the factors
`sd_loadings`	the standard deviation of the (time-invariant) factor loadings
`sd_errors`	the standard deviation of the independent errors
`covar_means`	whether to average the covariates across the pre-intervention times (experimental)

Details

Note that treatment_effect can be a single number, but it may also be a vector of length N_post, indicating the effect size at each post-intervention measurement occasion. It may also be a matrix of size N_post by N_treated.
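The three accepted shapes can be thought of as all expanding to one N_post by N_treated matrix. The helper below, expand_effect, is hypothetical and written only to illustrate that idea; it is not part of pensynth and may not match how the package handles the argument internally.

```r
# Hypothetical helper (not a pensynth function): expand a scalar, a vector of
# length N_post, or an N_post x N_treated matrix to a full effect matrix.
expand_effect <- function(treatment_effect, N_post, N_treated) {
  if (is.matrix(treatment_effect)) {
    stopifnot(nrow(treatment_effect) == N_post,
              ncol(treatment_effect) == N_treated)
    return(treatment_effect)
  }
  stopifnot(length(treatment_effect) %in% c(1, N_post))
  # matrix() fills column-wise, so a length-N_post vector is repeated per unit
  matrix(treatment_effect, nrow = N_post, ncol = N_treated)
}

constant_effect <- expand_effect(1, N_post = 6, N_treated = 3)
ramping_effect  <- expand_effect((1:6) / 5, N_post = 6, N_treated = 3)
```

Here constant_effect applies the same effect everywhere, while ramping_effect grows over the six post-intervention timepoints and is identical for every treated unit.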

The default values of sd_factors, sd_loadings, and sd_errors have been chosen such that the observed variables have an expected variance of 1.
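As a quick check of this statement, the arithmetic can be written out for a donor unit. The sketch assumes that factors and loadings are independent with mean zero, that sd_factors is the marginal standard deviation of each factor, and that the default N_factors = 3 is used; these assumptions are not spelled out in the documentation.

```r
# Variance of one observation = sum over factors of Var(factor) * Var(loading)
# plus the error variance (under the independence assumptions stated above).
N_factors   <- 3
sd_factors  <- sqrt(2) / 2
sd_loadings <- sqrt(2) / 2
sd_errors   <- 0.5

var_signal <- N_factors * sd_factors^2 * sd_loadings^2  # 3 * 0.5 * 0.5
var_total  <- var_signal + sd_errors^2                  # add 0.25 for the errors
variance_is_one <- isTRUE(all.equal(var_total, 1))
```

Under these assumptions the signal contributes 0.75 and the error 0.25. Changing N_factors or any of the SD arguments therefore changes the variance of the simulated outcomes.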

Value

A list with the following elements:

W the true unit weights
X0 the donor unit covariates
X1 the treated unit covariates
Z0 the donor unit pre-intervention outcomes
Z1 the treated unit pre-intervention outcomes
Y0 the donor unit post-intervention outcomes
Y1 the treated unit post-intervention outcomes
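To show how these elements relate, the sketch below builds small stand-in matrices and recovers the effect as the gap between treated outcomes and the weighted donor outcomes. The shapes (timepoints in rows, units in columns, W with one column per treated unit) are inferred from the Examples and should be treated as an assumption; the data are toy values, not output of simulate_data_factor.

```r
# Toy stand-ins with the assumed shapes: Y0 is N_post x N_donor,
# W is N_donor x N_treated, Y1 is N_post x N_treated.
set.seed(42)
N_donor <- 5; N_treated <- 2; N_post <- 3
Y0 <- matrix(rnorm(N_post * N_donor), N_post, N_donor)

W <- matrix(0, N_donor, N_treated)
W[c(1, 3), 1] <- c(0.6, 0.4)  # sparse weights for treated unit 1
W[c(2, 5), 2] <- c(0.5, 0.5)  # sparse weights for treated unit 2

true_effect <- 1
Y1 <- Y0 %*% W + true_effect  # noise-free treated outcomes, for illustration

# Synthetic control = weighted donor outcomes; effect = treated minus synthetic
synth <- Y0 %*% W
effect_hat <- Y1 - synth
```

With real simulated data the gap also contains noise, so effect_hat would scatter around the true effect rather than equal it exactly.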

Examples

# simulate data for three treated units (default treatment_effect = 1)
dat <- simulate_data_factor(N_treated = 3)

plot(
  NA,
  ylim = c(-5, 5),
  xlim = c(1, 18),
  main = "Simulated data",
  ylab = "Outcome value",
  xlab = "Timepoint"
)
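# donor units: pre- and post-intervention outcomes joined into one series
for (n in seq_len(ncol(dat$Z0)))
  lines(1:18, c(dat$Z0[, n], dat$Y0[, n]), col = "grey")
# treated units (solid) and their true synthetic controls (dashed)
for (n in seq_len(ncol(dat$Z1))) {
  lines(1:18, c(dat$Z1[, n], dat$Y1[, n]), lwd = 2, col = n)
  lines(1:18, (rbind(dat$Z0, dat$Y0) %*% dat$W)[, n], lty = 2, lwd = 2, col = n)
}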

abline(v = nrow(dat$Z1) + 0.5, lty = 3)
legend(
  x = "bottomleft",
  legend = c(
    "Donor units",
    "Treated unit",
    "Synth. control"
  ),
  lty = c(1, 1, 2),
  lwd = c(1, 2, 2),
  col = c("grey", "black", "black")
)
text(nrow(dat$Z1) + 0.5, -5, "Intervention\ntimepoint", pos = 4, font = 3)

pensynth documentation built on May 7, 2026, 9:06 a.m.