tw_data | R Documentation |
This function will produce panel data where variation can exist in the cross-section, over time or in both dimensions simultaneously. Furthermore, effect heterogeneity by case or cross section is also allowed.
tw_data(
N = 30,
T = 30,
case.int.mean = 0,
case.int.sd = 1,
cross.int.mean = 0,
cross.int.sd = 1,
cross.eff.mean = 0,
did.eff.mean = 0,
did.eff.sd = 0,
wid.eff.mean = 0,
wid.eff.sd = 0,
cross.eff.sd = 0.5,
case.eff.mean = 0.5,
case.eff.sd = 0.5,
noise.sd = 1,
omm.x.case = 0,
omm.x.cross = 0,
omm.y.case = 0,
omm.y.cross = 0,
treat_effect = FALSE,
binary_outcome = FALSE,
unbalance = FALSE,
gsynth = FALSE,
prop_treated_gsynth = 0.5,
time.ac = 0,
spatial.ac = 0
)
N |
The number of observations for each case/unit. |
T |
The number of time points per observation. |
case.int.mean |
The mean of the case/unit intercepts/fixed effects |
case.int.sd |
The SD of the case/unit intercepts/fixed effects |
cross.int.mean |
The mean of the cross-sectional intercepts/fixed effects |
cross.int.sd |
The SD of the cross-sectional intercepts/fixed effects |
cross.eff.mean |
The mean of the cross-sectional effect of X on Y |
did.eff.mean |
The mean of the difference-in-difference effect of X on Y |
did.eff.sd |
The SD of the difference-in-difference effect of X on Y |
wid.eff.mean |
The mean of the difference-in-cases effect of X on Y |
wid.eff.sd |
The SD of the difference-in-cases effect of X on Y |
cross.eff.sd |
The SD of the cross-sectional effect of X on Y |
case.eff.mean |
The mean of the case (over-time) effect of X on Y |
case.eff.sd |
The SD of the case (over-time) effect of X on Y |
noise.sd |
The SD of the residual noise in the data |
omm.x.case |
The value of an omitted variable correlated with X that varies across cases/units |
omm.x.cross |
The value of an omitted variable correlated with X that varies cross-sectionally |
omm.y.case |
The value of an omitted variable correlated with Y that varies across cases/units |
omm.y.cross |
The value of an omitted variable correlated with Y that varies cross-sectionally |
treat_effect |
Whether to generate an X variable that is 0/1 (dichotomous treatment). If so,
all effects (case/time/omitted) should be strictly between |
unbalance |
Whether to simulate varying numbers of observations by cases or time points. |
time.ac |
A value between 0 and 1 giving the over-time autocorrelation in effect of X on Y |
spatial.ac |
A value between 0 and 1 giving the cross-sectional (spatial) autocorrelation in the effect of X on Y |
binary_outcome |
Whether the Y (outcome) variable should also be simulated as a binary 0/1 variable. |
The tw_data function is the workhorse of the twowaysim package. It accepts as input the dimensions of the panel/TSCS data to be generated, and also parameters that determine the extent of variance and heterogeneity in either the cross-sectional or over-time effects in the data. The parameter N determines how many observations exist for each case or unit in the panel, while T determines how many time points exist per case or unit.
To create a model with a within-unit over-time (case) effect, simply set case.eff.mean to a non-zero number and set case.eff.sd to zero. Similarly, setting cross.eff.mean to a non-zero number and cross.eff.sd to zero will produce a panel dataset with a cross-sectional effect of X on Y where the effect of X does not vary across countries (no effect heterogeneity). Increasing cross.eff.sd and case.eff.sd will result in more effect heterogeneity across countries and time points. If both case.eff.mean and cross.eff.mean are non-zero, then Y will have both dimensions of variance. A 1-way fixed effects model with intercepts on cases will return the case.eff.mean coefficient and a model with intercepts on time points will return the cross.eff.mean estimate, whereas a 2-way model (intercepts on cases and time points) will return a difficult-to-characterize weighted average.
We refer you to Kropko and Kubinec (2018) for more information on the difference between these models: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3062619.
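The claim that a 1-way model with intercepts on cases recovers the within-case effect can be illustrated with a minimal base R sketch. This is not the data-generating process used inside tw_data; the variable names (n_cases, case_slope, and so on) are hypothetical, and the simulation only assumes case-specific intercepts and slopes drawn from normal distributions, loosely mirroring case.int.sd, case.eff.mean and case.eff.sd.

```r
# Illustrative only: a hand-rolled panel, not the internals of tw_data.
set.seed(42)
n_cases <- 30   # hypothetical name: number of cases/units
n_times <- 30   # hypothetical name: time points per case

# One row per case-time combination
panel <- expand.grid(time = seq_len(n_times), case = seq_len(n_cases))

# Case-specific intercepts and heterogeneous within-case slopes
case_intercept <- rnorm(n_cases, mean = 0, sd = 1)
case_slope <- rnorm(n_cases, mean = -1, sd = 0.5)

panel$x <- rnorm(nrow(panel))
panel$y <- case_intercept[panel$case] +
  case_slope[panel$case] * panel$x +
  rnorm(nrow(panel), sd = 1)

# 1-way fixed effects model: a separate intercept for every case
fe_fit <- lm(y ~ x + factor(case), data = panel)
fe_estimate <- unname(coef(fe_fit)["x"])
avg_slope <- mean(case_slope)

# The two numbers should be close to each other
print(c(fe_estimate = fe_estimate, avg_slope = avg_slope))
```

With heterogeneous slopes the estimate is a weighted average of the case-specific effects, so it tracks the average slope rather than matching it exactly.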
The case.int and cross.int parameters (case.int.mean, case.int.sd, cross.int.mean and cross.int.sd) set the mean and SD of the intercepts for the cases or time points. Changing these parameters will increase or decrease the amount of unexplained variance (random noise) in the dataset.
The additional parameters in the function allow the user to create unbalanced panels (varying numbers of observations per case or time point if unbalance=TRUE), auto-correlation in the effects and omitted variables. Autocorrelation can exist either in the over-time dimension or the cross-sectional dimension. To increase time autocorrelation, set time.ac to a value between 0 and 1 where values closer to one signal higher autocorrelation. To increase spatial (cross-sectional) autocorrelation, set spatial.ac to a value between 0 and 1.
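To give a feel for what an autocorrelation value between 0 and 1 implies, the following base R sketch simulates a generic AR(1) series and checks its lag-1 autocorrelation. It is an assumption-laden illustration: rho merely plays the role of a value such as time.ac, and how tw_data applies that value to the effect of X on Y is not shown here.

```r
# Illustrative only: a generic AR(1) series, not the internals of tw_data.
set.seed(1)
rho <- 0.8  # hypothetical autocorrelation value, analogous to time.ac

# Simulate a long AR(1) series with coefficient rho
series <- arima.sim(model = list(ar = rho), n = 5000)

# Element 1 of $acf is lag 0, element 2 is lag 1
lag1 <- as.numeric(acf(series, lag.max = 1, plot = FALSE)$acf[2])

# The sample lag-1 autocorrelation should be close to rho
print(round(lag1, 2))
```

Values of rho closer to one make neighbouring time points more alike, which is the sense in which higher values signal higher autocorrelation.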
Finally, to include omitted variables, set one of the omm parameters to a non-zero value. Omitted variables that vary within cases (over time) can be included by setting an omm parameter subscripted with case to a non-zero value, and the same is possible for variables that vary in the cross-section (subscript cross). The analyst can also decide whether the omitted variable is correlated with the independent variable of interest (x) or the dependent variable (y) by choosing the corresponding subscript of omm.
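Why omitted variables matter can be shown with a short base R sketch of omitted variable bias. It is a generic textbook illustration under assumed coefficients (0.8, 0.5 and 1.0 are arbitrary), with a single variable z related to both x and y; it does not reproduce how the omm parameters enter the simulation in tw_data.

```r
# Illustrative only: generic omitted variable bias, not the internals of tw_data.
set.seed(7)
n <- 2000

z <- rnorm(n)                       # the variable that will be omitted
x <- 0.8 * z + rnorm(n)             # z is correlated with x
y <- 0.5 * x + 1.0 * z + rnorm(n)   # z also drives y; true effect of x is 0.5

short_fit <- lm(y ~ x)       # omits z, so the x coefficient absorbs part of z
long_fit <- lm(y ~ x + z)    # controls for z

b_short <- unname(coef(short_fit)["x"])
b_long <- unname(coef(long_fit)["x"])

# b_long should be near 0.5, while b_short is noticeably larger
print(c(omitting_z = b_short, controlling_z = b_long))
```

Simulating data with a non-zero omitted variable and then fitting a model that cannot see it is a direct way to study how badly an estimator is biased.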
The function returns a named list where object$data is a data.frame and object$pars are the original parameters used to generate the data.
tw_model for running linear models on the data and tw_sim for running Monte Carlo simulations on panel data.
# case (over-time) effect with no effect heterogeneity
case1 <- tw_data(case.eff.mean=-1,case.eff.sd=0)
# case (over-time) effect with substantial effect heterogeneity across countries
case2 <- tw_data(case.eff.mean=-1,case.eff.sd=1)
# cross-section effect with no effect heterogeneity
cross1 <- tw_data(cross.eff.mean=-1,cross.eff.sd=0)
# cross-section effect with substantial effect heterogeneity across countries
cross2 <- tw_data(cross.eff.mean=-1,cross.eff.sd=1)
# panel data with a cross-sectional effect of 3 and a case (over-time) effect of -1
both_case_cross <- tw_data(cross.eff.mean=3,
case.eff.mean=-1)