tw_data: Generate One-way and Two-way Fixed Effects Panel Data

View source: R/twsim2.R

tw_data    R Documentation

Generate One-way and Two-way Fixed Effects Panel Data

Description

This function will produce panel data where variation can exist in the cross-section, over time or in both dimensions simultaneously. Furthermore, effect heterogeneity by case or cross section is also allowed.

Usage

tw_data(
  N = 30,
  T = 30,
  case.int.mean = 0,
  case.int.sd = 1,
  cross.int.mean = 0,
  cross.int.sd = 1,
  cross.eff.mean = 0,
  did.eff.mean = 0,
  did.eff.sd = 0,
  wid.eff.mean = 0,
  wid.eff.sd = 0,
  cross.eff.sd = 0.5,
  case.eff.mean = 0.5,
  case.eff.sd = 0.5,
  noise.sd = 1,
  omm.x.case = 0,
  omm.x.cross = 0,
  omm.y.case = 0,
  omm.y.cross = 0,
  treat_effect = FALSE,
  binary_outcome = FALSE,
  unbalance = FALSE,
  gsynth = FALSE,
  prop_treated_gsynth = 0.5,
  time.ac = 0,
  spatial.ac = 0
)

Arguments

N

The number of observations for each case/unit.

T

The number of time points per case/unit.

case.int.mean

The mean of the case/unit intercepts/fixed effects

case.int.sd

The SD of the case/unit intercepts/fixed effects

cross.int.mean

The mean of the cross-sectional intercepts/fixed effects

cross.int.sd

The SD of the cross-sectional intercepts/fixed effects

cross.eff.mean

The mean of the cross-sectional effect of X on Y

did.eff.mean

The mean of the difference-in-difference effect of X on Y

did.eff.sd

The SD of the difference-in-difference effect of X on Y

wid.eff.mean

The mean of the difference-in-cases effect of X on Y

wid.eff.sd

The SD of the difference-in-cases effect of X on Y

cross.eff.sd

The SD of the cross-sectional effect of X on Y

case.eff.mean

The mean of the case (over-time) effect of X on Y

case.eff.sd

The SD of the case (over-time) effect of X on Y

noise.sd

The SD of the residual noise in the data

omm.x.case

The value of an omitted variable correlated with X that varies across cases/units

omm.x.cross

The value of an omitted variable correlated with X that varies cross-sectionally

omm.y.case

The value of an omitted variable correlated with Y that varies across cases/units

omm.y.cross

The value of an omitted variable correlated with Y that varies cross-sectionally

treat_effect

Whether to generate an X variable that is 0/1 (dichotomous treatment). If so, all effects (case/time/omitted) should lie in the interval [0, 0.9999], as they will be interpreted as probabilities.

unbalance

Whether to simulate varying numbers of observations by cases or time points.

time.ac

A value between 0 and 1 giving the over-time autocorrelation in effect of X on Y

spatial.ac

A value between 0 and 1 giving the cross-sectional (spatial) autocorrelation in the effect of X on Y

binary_outcome

Whether the Y (outcome) variable should also be simulated as a binary 0/1 variable.

Details

The tw_data function is the workhorse of the twowaysim package. It accepts as input the dimensions of the panel/TSCS data to be generated, along with parameters that determine the extent of variance and heterogeneity in the cross-sectional and over-time effects. The parameter N determines how many observations exist for each case or unit in the panel, while T determines how many time points exist per case or unit.

To create a model with a within-unit over-time (case) effect and no effect heterogeneity, set case.eff.mean to a non-zero number and case.eff.sd to zero. Similarly, setting cross.eff.mean to a non-zero number and cross.eff.sd to zero will produce a panel dataset with a cross-sectional effect of X on Y that does not vary across countries (no effect heterogeneity). Increasing cross.eff.sd and case.eff.sd will result in more effect heterogeneity across countries and time points.

If both case.eff.mean and cross.eff.mean are non-zero, then Y will vary along both dimensions. A 1-way fixed effects model with intercepts on cases will return the case.eff.mean coefficient, and a model with intercepts on time points will return the cross.eff.mean estimate, whereas a 2-way model (intercepts on cases and time points) will return a difficult-to-characterize weighted average.
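The one-way fixed effects claim above can be illustrated with a small base-R sketch. It does not call tw_data and does not reproduce the package's data-generating process; the names n_cases, n_time, case_eff and the toy model are assumptions made only for this illustration. X is built to be correlated with the case intercepts, so a pooled regression is biased while a model with intercepts on cases recovers the within-case effect.

```r
# Toy panel, NOT the tw_data generating process (illustrative assumption).
set.seed(42)
n_cases  <- 30   # hypothetical number of cases
n_time   <- 30   # hypothetical number of time points
case_eff <- -1   # true within-case (over-time) effect of x on y

d <- expand.grid(case = factor(seq_len(n_cases)),
                 time = factor(seq_len(n_time)))

# Case intercepts; x is shifted by them, so x and the intercepts are correlated.
case_int <- rnorm(n_cases)
d$x <- rnorm(nrow(d)) + case_int[as.integer(d$case)]
d$y <- case_int[as.integer(d$case)] + case_eff * d$x + rnorm(nrow(d), sd = 1)

# Pooled OLS ignores the case intercepts and is pulled toward zero here.
fit_pooled <- lm(y ~ x, data = d)

# One-way fixed effects: one intercept per case.
fit_case <- lm(y ~ x + case, data = d)

coef(fit_pooled)["x"]
coef(fit_case)["x"]
```

The coefficient on x from fit_case should sit close to case_eff, while the pooled estimate should be noticeably closer to zero.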

We refer you to Kropko and Kubinec (2018) for more information on the difference between these models: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3062619.

The parameters case.int.mean, case.int.sd, cross.int.mean and cross.int.sd determine the distribution of the intercepts for the cases and time points. Changing these parameters will increase or decrease the amount of unexplained variance (random noise) in the dataset.

The additional parameters in the function allow the user to create unbalanced panels (varying numbers of observations per case or time point if unbalance=TRUE), autocorrelation in the effects, and omitted variables. Autocorrelation can exist in either the over-time dimension or the cross-sectional dimension. To increase time autocorrelation, set time.ac to a value between 0 and 1, where values closer to 1 signal higher autocorrelation. To increase spatial (cross-sectional) autocorrelation, set spatial.ac to a value between 0 and 1.
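As a sketch of the options just described, the call below requests an unbalanced panel with moderate over-time autocorrelation. It uses only arguments listed under Usage; the value 0.5 is an arbitrary choice, and the code assumes the package is installed under the name twowaysim and exports tw_data, so the call is guarded.

```r
# Arguments taken from the Usage section; the specific values are illustrative.
ac_args <- list(
  case.eff.mean = -1,
  case.eff.sd   = 0,
  unbalance     = TRUE,  # varying numbers of observations per case/time point
  time.ac       = 0.5    # over-time autocorrelation, must lie between 0 and 1
)

# Only run if the package is available (assumed package name: twowaysim).
if (requireNamespace("twowaysim", quietly = TRUE)) {
  ac_panel <- do.call(twowaysim::tw_data, ac_args)
}
```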

Finally, to include omitted variables, set one of the omm parameters to a non-zero value. Omitted variables that vary within cases (over time) can be included by setting an omm parameter subscripted with case to a non-zero value, and the same is possible for variables that vary in the cross-section by using the subscript cross. The analyst can also decide whether the omitted variable is correlated with the independent variable of interest (x) or the dependent variable (y) by choosing the corresponding subscript of omm.
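For instance, an omitted variable that varies by case and is related to both X and Y could be requested as in the sketch below. The argument names come from the Usage section; the value 0.5 is an arbitrary illustration, and the guarded call assumes the package is installed as twowaysim.

```r
# Omitted variable varying across cases, tied to both x and y (values illustrative).
omm_args <- list(
  case.eff.mean = -1,
  omm.x.case    = 0.5,  # omitted variable correlated with X, varies by case
  omm.y.case    = 0.5   # omitted variable correlated with Y, varies by case
)

# Only run if the package is available (assumed package name: twowaysim).
if (requireNamespace("twowaysim", quietly = TRUE)) {
  omm_panel <- do.call(twowaysim::tw_data, omm_args)
}
```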

Value

The function returns a named list where object$data is a data.frame and object$pars contains the original parameters used to generate the data.
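A minimal sketch of inspecting the returned list follows. It relies only on the two elements named above (data and pars) and makes no assumption about the column names inside the data.frame; the guarded call assumes the package is installed as twowaysim.

```r
# Elements the documentation says the returned list contains.
expected_elements <- c("data", "pars")

# Only run if the package is available (assumed package name: twowaysim).
if (requireNamespace("twowaysim", quietly = TRUE)) {
  sim <- twowaysim::tw_data(case.eff.mean = -1, case.eff.sd = 0)

  # Check the documented structure before using it.
  stopifnot(all(expected_elements %in% names(sim)),
            is.data.frame(sim$data))

  str(sim$data)  # look at the simulated panel's columns
  sim$pars       # parameters used to generate the data
}
```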

See Also

tw_model for running linear models on the data and the tw_sim function for running Monte Carlo simulations on panel data.

Examples


# case (over-time) effect with no effect heterogeneity

case1 <- tw_data(case.eff.mean=-1,case.eff.sd=0)

# case (over-time) effect with substantial effect heterogeneity across countries

case2 <- tw_data(case.eff.mean=-1,case.eff.sd=1)

# cross-section effect with no effect heterogeneity

cross1 <- tw_data(cross.eff.mean=-1,cross.eff.sd=0)

# cross-section effect with substantial effect heterogeneity across countries

cross2 <- tw_data(cross.eff.mean=-1,cross.eff.sd=1)

# panel data with a cross-sectional effect of 3 and a case (over-time) effect of -1

both_case_cross <- tw_data(cross.eff.mean=3,
                             case.eff.mean=-1)


saudiwin/twofe_sim documentation built on Feb. 6, 2024, 11:31 a.m.