simulate_data_factor: Simulate data according to factor model

View source: R/simulate_data.R

simulate_data_factor    R Documentation

Simulate data according to factor model

Description

This function simulates data according to a latent factor model:

  1. Simulate time-varying latent factors, which are the same for all units

  2. Simulate time-invariant factor loadings, separately for each donor unit

  3. Create sparse unit weights for each treated unit

  4. Compute the loadings for the treated units as donor-unit loadings * weights

  5. Simulate observed outcome time-series as factors * loadings + error

  6. Do the same for each covariate, holding the loadings equal to those used for the outcome. If covar_means is TRUE, average each covariate across the pre-intervention timepoints.
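As a rough illustration of steps 1 to 5, the following base-R sketch builds a small data set by hand. It is not the package's implementation: the sizes are made up, the weights are assumed to be nonnegative and to sum to one, the factors are assumed to follow a stationary AR(1) process whose marginal standard deviation is sd_factors, and the covariate step is left out.

```r
set.seed(1)

# Hypothetical small sizes, named after the function's arguments
N_donor <- 10; N_treated <- 2; N_nonzero <- 3
N_pre <- 12; N_post <- 6; N_factors <- 3
N_time <- N_pre + N_post

# 1. Time-varying factors shared by all units: stationary AR(1) series
#    (assumption: sd_f is the marginal SD, so innovations are scaled down)
ar1 <- 0.8; sd_f <- sqrt(2) / 2
factors <- matrix(0, N_time, N_factors)
factors[1, ] <- rnorm(N_factors, sd = sd_f)
for (t in 2:N_time) {
  factors[t, ] <- ar1 * factors[t - 1, ] +
    rnorm(N_factors, sd = sd_f * sqrt(1 - ar1^2))
}

# 2. Time-invariant loadings, one column per donor unit
loadings_donor <- matrix(rnorm(N_factors * N_donor, sd = sqrt(2) / 2),
                         N_factors, N_donor)

# 3. Sparse weights: N_nonzero positive entries per treated unit
#    (assumption: each column sums to 1)
W <- matrix(0, N_donor, N_treated)
for (j in seq_len(N_treated)) {
  idx <- sample(N_donor, N_nonzero)
  w <- runif(N_nonzero)
  W[idx, j] <- w / sum(w)
}

# 4. Treated-unit loadings are weighted combinations of donor loadings
loadings_treated <- loadings_donor %*% W

# 5. Outcomes = factors %*% loadings + independent errors
sd_e <- 0.5
out_donor <- factors %*% loadings_donor +
  matrix(rnorm(N_time * N_donor, sd = sd_e), N_time, N_donor)
out_treated <- factors %*% loadings_treated +
  matrix(rnorm(N_time * N_treated, sd = sd_e), N_time, N_treated)

# Add a constant treatment effect of 1 to the post-intervention rows
post <- (N_pre + 1):N_time
out_treated[post, ] <- out_treated[post, ] + 1

# Split into pre- and post-intervention blocks (timepoints in rows)
Z0 <- out_donor[1:N_pre, ]
Y0 <- out_donor[post, ]
Z1 <- out_treated[1:N_pre, , drop = FALSE]
Y1 <- out_treated[post, , drop = FALSE]
```

Because the treated loadings are exact weighted combinations of the donor loadings, the treated series differ from the weighted donor series only through the error terms and the added effect.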

Usage

simulate_data_factor(
  N_donor = 50,
  N_treated = 1,
  N_nonzero = 4,
  N_covar = 5,
  N_pre = 12,
  N_post = 6,
  N_factors = 3,
  treatment_effect = 1,
  sd_factors = sqrt(2)/2,
  ar1_factors = 0.8,
  sd_loadings = sqrt(2)/2,
  sd_errors = 0.5,
  covar_means = TRUE
)

Arguments

N_donor

number of donors

N_treated

number of treated units

N_nonzero

number of true nonzero weights

N_covar

number of covariates

N_pre

number of pre-intervention timepoints

N_post

number of post-intervention timepoints

N_factors

number of latent factors to simulate

treatment_effect

the size of the true treatment effect

sd_factors

the standard deviation of the (unit-invariant, time-varying) factors

ar1_factors

the lag-1 autoregressive (AR(1)) coefficient of the factors

sd_loadings

the standard deviation of the (time-invariant) factor loadings

sd_errors

the standard deviation of the independent errors

covar_means

whether to average the covariates across the pre-intervention times (experimental)

Details

Note that treatment_effect can be a single number, a vector of length N_post giving the effect size at each post-intervention measurement occasion, or a matrix of size N_post by N_treated giving a separate effect for each occasion and treated unit.
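To make the three accepted shapes concrete, the sketch below expands each of them into a full N_post by N_treated matrix. The helper expand_effect() is hypothetical and only illustrates the shapes; it is not a pensynth function and may not match how the package handles the argument internally.

```r
# Hypothetical helper (not part of pensynth): expand a treatment effect
# given as a scalar, a length-N_post vector, or an N_post x N_treated
# matrix into a full N_post x N_treated matrix.
expand_effect <- function(treatment_effect, N_post, N_treated) {
  if (is.matrix(treatment_effect)) {
    stopifnot(all(dim(treatment_effect) == c(N_post, N_treated)))
    return(treatment_effect)
  }
  stopifnot(length(treatment_effect) %in% c(1, N_post))
  # matrix() fills column-wise, so a length-N_post vector is repeated per unit
  matrix(treatment_effect, nrow = N_post, ncol = N_treated)
}

# Constant effect of 1 at every post-intervention timepoint
constant <- expand_effect(1, N_post = 6, N_treated = 3)

# Effect that grows linearly from 0.2 to 1.2 over the six timepoints
ramp <- expand_effect(seq(0.2, 1.2, length.out = 6), N_post = 6, N_treated = 3)

# Unit-specific effects: pass a matrix with one column per treated unit
per_unit <- expand_effect(cbind(ramp[, 1], 0, -ramp[, 1]),
                          N_post = 6, N_treated = 3)
```

With pensynth attached, the scalar, vector, or matrix would be passed directly as the treatment_effect argument of simulate_data_factor().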

The default values of sd_factors, sd_loadings, and sd_errors have been chosen so that the observed variables have an expected variance of 1.
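The arithmetic behind the unit variance can be checked in a few lines. This is a sketch for a donor unit under assumptions not spelled out on this page: factors, loadings, and errors are taken to be independent with mean zero, and sd_factors is read as the marginal standard deviation of each factor.

```r
N_factors   <- 3
sd_factors  <- sqrt(2) / 2
sd_loadings <- sqrt(2) / 2
sd_errors   <- 0.5

# For independent zero-mean f and l, Var(f * l) = Var(f) * Var(l),
# and the N_factors products add up
var_signal <- N_factors * sd_factors^2 * sd_loadings^2  # 3 * 0.5 * 0.5 = 0.75
var_noise  <- sd_errors^2                               # 0.25
var_total  <- var_signal + var_noise

all.equal(var_total, 1)
```

Changing N_factors or any of the standard deviations away from the defaults changes this total, so the simulated outcomes would no longer be on a unit-variance scale.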

Value

A list with the following elements:

  • W the true unit weights

  • X0 the donor unit covariates

  • X1 the treated unit covariates

  • Z0 the donor unit pre-intervention outcomes

  • Z1 the treated unit pre-intervention outcomes

  • Y0 the donor unit post-intervention outcomes

  • Y1 the treated unit post-intervention outcomes
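Judging from how the elements are indexed in the Examples section, the outcome matrices appear to store timepoints in rows and units in columns, and W appears to have one row per donor and one column per treated unit. The toy matrices below are stand-ins with that assumed layout, not output of the function; they show how the weights combine donor outcomes into a synthetic control path.

```r
set.seed(42)
N_donor <- 5; N_treated <- 2; N_pre <- 4; N_post <- 3

# Toy stand-ins with the assumed layout: timepoints in rows, units in columns
Z0 <- matrix(rnorm(N_pre * N_donor), N_pre, N_donor)
Y0 <- matrix(rnorm(N_post * N_donor), N_post, N_donor)

# One weight column per treated unit; each column sums to 1
W <- matrix(0, N_donor, N_treated)
W[c(1, 3), 1] <- c(0.6, 0.4)
W[c(2, 5), 2] <- c(0.5, 0.5)

# Synthetic control path over all timepoints, one column per treated unit
synth <- rbind(Z0, Y0) %*% W
dim(synth)
```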

See Also

pensynth(), cv_pensynth(), placebo_test(), simulate_data_synth()

Examples

# simulate data for three treated units with the default treatment effect of 1
dat <- simulate_data_factor(N_treated = 3)

plot(
  NA,
  ylim = c(-5, 5),
  xlim = c(1, 18),
  main = "Simulated data",
  ylab = "Outcome value",
  xlab = "Timepoint"
)
# donor units in grey
for (n in seq_len(ncol(dat$Z0)))
  lines(1:18, c(dat$Z0[, n], dat$Y0[, n]), col = "grey")
# treated units (solid) and their true synthetic controls (dashed)
for (n in seq_len(ncol(dat$Z1))) {
  lines(1:18, c(dat$Z1[, n], dat$Y1[, n]), lwd = 2, col = n)
  lines(1:18, (rbind(dat$Z0, dat$Y0) %*% dat$W)[, n], lty = 2, lwd = 2, col = n)
}

abline(v = nrow(dat$Z1) + 0.5, lty = 3)
legend(
  x = "bottomleft",
  legend = c(
    "Donor units",
    "Treated unit",
    "Synth. control"
  ),
  lty = c(1, 1, 2),
  lwd = c(1, 2, 2),
  col = c("grey", "black", "black")
)
text(nrow(dat$Z1) + 0.5, -5, "Intervention\ntimepoint", pos = 4, font = 3)

pensynth documentation built on May 7, 2026, 9:06 a.m.