generate_data: Generate simulated data that meets the parallel trends...

View source: R/generate_data.R

generate_data    R Documentation

Generate simulated data that meets the parallel trends assumption

Description

Generate simulated data that meets the parallel trends assumption

Usage

generate_data(
  N,
  Tt,
  Beta,
  potential_outcomes = FALSE,
  ylink = "rnorm_identity",
  binomial_n = 1,
  long = FALSE
)

Arguments

N

int. Number of independent observations

Tt

int. Number of periods minus 1, i.e. there are Tt + 1 periods in total.

Beta

list of length 4. Output of generate_parameters().

potential_outcomes

logical. Should outcomes and covariates be generated with exposure set to 0 at all times?

ylink

chr. One of "rnorm_identity", "rbinom_logit", or "rbinom_logit_hazard".

binomial_n

int vector of length N. Defaults to all 1s. If ylink is "rbinom_logit", you can optionally pass a vector of group sizes to generate aggregate binomial data. In this case, treatments and covariates will be constant at the group level for a given time period.
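As a rough illustration of what "aggregate binomial data" means here, the base R sketch below draws one binomial count per group, with the group size as the number of trials and a logit-scale linear predictor shared by everyone in the group. It is not the package's internal code: the linear predictor eta is a hypothetical stand-in, and reading "rbinom_logit" as rbinom() with an inverse-logit probability is an assumption based on the option's name.

```r
set.seed(1)

N <- 5                                 # number of groups (rows)
binomial_n <- c(10, 20, 5, 1, 50)      # group sizes, one per row

# Hypothetical group-level linear predictor on the logit scale.
# In the package this would depend on covariates and exposure; here it is made up.
eta <- rnorm(N)

# Inverse logit gives the per-group success probability.
p <- plogis(eta)

# One aggregate outcome per group: successes out of binomial_n trials.
Y <- rbinom(N, size = binomial_n, prob = p)

data.frame(binomial_n = binomial_n, p = round(p, 3), Y = Y)
```

With binomial_n equal to 1 for every row, the same call reduces to an ordinary 0/1 outcome per observation.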

long

lgl. Should the returned dataset be wide (FALSE: one row per participant) or long (TRUE: Tt + 1 rows per participant)?
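To make the wide versus long distinction concrete, here is a small base R sketch that reshapes a hand-made wide data frame into long form. The column names (L0, L1, A0, ...), the time variable name t, and the toy values are assumptions for illustration only; the layout that generate_data() actually returns with long = TRUE may differ.

```r
Tt <- 1                 # two periods: t = 0 and t = 1
times <- 0:Tt

# Toy wide data: one row per participant (values are made up).
wide <- data.frame(
  uid = 1:3,
  U0  = c(0.2, -1.1, 0.5),
  L0  = c(0.1, 0.4, -0.3), L1 = c(0.0, 0.6, -0.2),
  A0  = c(0, 0, 0),        A1 = c(1, 0, 1),
  Y0  = c(1.2, 0.8, 1.0),  Y1 = c(1.9, 0.9, 1.7)
)

# Stack the time-varying columns so each participant has Tt + 1 rows.
long <- reshape(
  wide,
  direction = "long",
  varying   = list(paste0("L", times), paste0("A", times), paste0("Y", times)),
  v.names   = c("L", "A", "Y"),
  timevar   = "t",
  times     = times,
  idvar     = "uid"
)

# Sort by participant, then by period, for readability.
long <- long[order(long$uid, long$t), ]
long
```

The long form carries the same information, with time-invariant columns such as uid and U0 repeated on each of a participant's rows.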

Value

In the default wide format (long = FALSE), a data frame with N rows and (Tt+1)*3 + 2 columns: 'uid' is a unique identifier, 'U0' is an 'unmeasured' baseline covariate, and Lt, At, Yt are the covariates, exposures, and outcomes, respectively, one of each for each of the Tt + 1 periods. If binomial_n != 1, an additional column binomial_n is also included.
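The column count can be checked with a short base R sketch that lists the wide-format columns one would expect for a given Tt. The exact names (L0, A0, Y0, ...), the zero-based time index, and the column ordering are assumptions inferred from the description above, not confirmed package output.

```r
Tt <- 3
times <- 0:Tt           # assumed zero-based indexing, giving Tt + 1 periods

# Two fixed columns plus one L, one A and one Y column per period.
expected_cols <- c(
  "uid", "U0",
  paste0("L", times),
  paste0("A", times),
  paste0("Y", times)
)

# Compare against the documented formula (Tt + 1) * 3 + 2.
n_cols <- length(expected_cols)
n_cols == (Tt + 1) * 3 + 2
```

The two fixed columns (uid and U0) account for the "+ 2", and the three time-varying series account for the "(Tt+1)*3".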


audreyrenson/didgformula documentation built on Oct. 9, 2022, 11:45 a.m.