In audreyrenson/didgformula: The difference-in-differences g-formula

generate_data

R Documentation

Generate simulated data that meets the parallel trends assumption

Description

Generate simulated data that meets the parallel trends assumption

Usage

generate_data(
  N,
  Tt,
  Beta,
  potential_outcomes = FALSE,
  ylink = "rnorm_identity",
  binomial_n = 1,
  long = FALSE
)

Arguments

`N`	int. Number of independent observations
`Tt`	int. Number of periods, minus 1. I.e. there are Tt + 1 periods.
`Beta`	list of length 4. Output of generate_parameters().
`potential_outcomes`	logical. Should outcomes and covariates be generated with exposure set to 0 at all times?
`ylink`	chr. One of "rnorm_identity", "rbinom_logit", or "rbinom_logit_hazard".
`binomial_n`	int vector of length N. Defaults to all 1's. If ylink is "rbinom_logit", you can optionally pass a vector of group sizes to generate aggregate binomial data. In this case, treatments and covariates are constant at the group level within each time period.
`long`	lgl. Should the returned dataset be wide (FALSE, one row per participant) or long (TRUE, Tt+1 rows per participant)?
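
The difference between the wide and long layouts selected by `long` can be illustrated with a toy data frame and base R's reshape(). This is only a sketch: it does not call generate_data(), the values are made up, and the column names (L0, A0, Y0, ..., and the time column t) are assumptions chosen for illustration rather than the package's confirmed output.

```r
# Toy wide data: N = 2 participants, Tt = 1 (two periods, t = 0 and t = 1).
# Column names and values are hypothetical, for illustration only.
wide <- data.frame(
  uid = 1:2,
  U0  = c(0.3, -1.2),
  L0  = c(0.5, 0.1), L1 = c(0.7, 0.2),
  A0  = c(0, 0),     A1 = c(1, 0),
  Y0  = c(1.1, 0.9), Y1 = c(1.8, 1.0)
)

# Stack the time-varying columns so each participant has one row per period.
long <- reshape(
  wide,
  direction = "long",
  varying   = list(c("L0", "L1"), c("A0", "A1"), c("Y0", "Y1")),
  v.names   = c("L", "A", "Y"),
  timevar   = "t",
  times     = 0:1,
  idvar     = "uid"
)
long <- long[order(long$uid, long$t), ]
rownames(long) <- NULL

nrow(wide)  # one row per participant
nrow(long)  # Tt + 1 rows per participant
```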

Value

When long = FALSE, a data frame with N rows and (Tt+1)*3 + 2 columns: 'uid' is a unique identifier, 'U0' is an 'unmeasured' baseline covariate, and Lt, At, Yt are the covariates, exposures, and outcomes, respectively, at each time t. If binomial_n != 1, an additional column binomial_n is also included.
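
The column count can be checked by hand. The sketch below builds the set of column names one would expect in the wide layout for a given Tt, assuming periods are indexed 0 through Tt and the columns are named L0, A0, Y0, and so on. The helper expected_wide_columns() is hypothetical and not part of the package, and the actual column order may differ.

```r
# Hypothetical helper: expected wide-format column names for a given Tt,
# assuming periods are indexed 0..Tt and no binomial_n column is added.
expected_wide_columns <- function(Tt) {
  periods <- 0:Tt
  c(
    "uid", "U0",            # identifier and baseline covariate
    paste0("L", periods),   # covariates
    paste0("A", periods),   # exposures
    paste0("Y", periods)    # outcomes
  )
}

cols <- expected_wide_columns(Tt = 3)
length(cols) == (3 + 1) * 3 + 2  # 4 periods x 3 variables + uid + U0
```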

audreyrenson/didgformula documentation built on Oct. 9, 2022, 11:45 a.m.