make_ssm_data | R Documentation |
Generates data from a sample selection model. The data generating process is specified by the equations given below.
make_ssm_data(
  n_obs = 8000,
  dim_x = 100,
  theta = 1,
  mar = TRUE,
  return_type = "DoubleMLData"
)
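n_obs | Number of observations to simulate (default 8000).
dim_x | Number of covariates, i.e. the length of x_i and \beta (default 100).
theta | Value of the coefficient \theta on d_i in the outcome equation (default 1).
mar | Logical flag indicating whether missingness at random (MAR) holds (default TRUE).
return_type | Type of the returned object (default "DoubleMLData").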
y_i = \theta d_i + x_i' \beta + u_i,
s_i = 1\lbrace d_i + \gamma z_i + x_i' \beta + v_i > 0 \rbrace,
d_i = 1\lbrace x_i' \beta + w_i > 0 \rbrace,
with y_i being observed if s_i = 1 and covariates x_i \sim \mathcal{N}(0, \Sigma^2_x), where \Sigma^2_x is a matrix with entries \Sigma_{kj} = 0.5^{|j-k|}. \beta is a dim_x-vector with entries \beta_j = \frac{0.4}{j^2}, z_i \sim \mathcal{N}(0, 1), (u_i, v_i) \sim \mathcal{N}(0, \Sigma^2_{u,v}), and w_i \sim \mathcal{N}(0, 1).
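The equations above can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: the function name simulate_ssm is hypothetical, and because the text does not give values for \gamma or \Sigma^2_{u,v}, the sketch assumes \gamma = 0 and an identity matrix for \Sigma^2_{u,v}. Marking unobserved outcomes as NA is also a choice of this sketch.

```r
# Base-R sketch of the data generating process described above.
# ASSUMPTIONS (not stated in the text): gamma = 0, Sigma_{u,v} = identity,
# and unobserved outcomes are coded as NA. simulate_ssm() is a hypothetical name.
simulate_ssm <- function(n_obs = 8000, dim_x = 100, theta = 1,
                         gamma = 0, sigma_uv = diag(2)) {
  idx <- seq_len(dim_x)
  # Covariate covariance with entries 0.5^|j - k|
  sigma_x <- 0.5^abs(outer(idx, idx, "-"))
  # Coefficients beta_j = 0.4 / j^2
  beta <- 0.4 / idx^2

  # chol() returns an upper-triangular R with t(R) %*% R = sigma_x, so
  # rows of (standard normal matrix) %*% R are draws from N(0, sigma_x).
  x <- matrix(rnorm(n_obs * dim_x), n_obs, dim_x) %*% chol(sigma_x)
  # (u_i, v_i) ~ N(0, sigma_uv), drawn the same way
  uv <- matrix(rnorm(n_obs * 2), n_obs, 2) %*% chol(sigma_uv)
  z <- rnorm(n_obs)
  w <- rnorm(n_obs)

  xb <- as.vector(x %*% beta)
  d <- as.numeric(xb + w > 0)                        # treatment equation
  s <- as.numeric(d + gamma * z + xb + uv[, 2] > 0)  # selection equation
  y <- theta * d + xb + uv[, 1]                      # outcome equation
  y[s == 0] <- NA                                    # y_i observed only if s_i = 1

  data.frame(y = y, d = d, s = s, z = z, x)
}

set.seed(1)
sim <- simulate_ssm(n_obs = 500, dim_x = 5)
```

The returned data frame has one row per observation, with columns y, d, s, z and one column per covariate. Passing a non-zero gamma or a sigma_uv with correlated errors lets the selection depend on z_i or on the outcome error, again only as an illustration.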
The data generating process is inspired by a process used in the simulation study (see Appendix E) of Bia, Huber and Lafférs (2023).
Depending on the return_type, returns an object or set of objects as specified.
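A minimal call might look like the sketch below. It assumes the function is exported by an installed DoubleML package (the guard skips the call otherwise) and uses only the arguments and the default return_type shown in the usage; the reduced n_obs and dim_x values and the name ssm_obj are arbitrary choices for illustration.

```r
# Guarded example: runs only if DoubleML is installed and exports make_ssm_data().
if (requireNamespace("DoubleML", quietly = TRUE) &&
    "make_ssm_data" %in% getNamespaceExports("DoubleML")) {
  set.seed(3141)
  # Default return_type = "DoubleMLData"
  ssm_obj <- DoubleML::make_ssm_data(n_obs = 2000, dim_x = 20,
                                     theta = 1, mar = TRUE)
  print(class(ssm_obj))
}
```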
Michela Bia, Martin Huber & Lukáš Lafférs (2023) Double Machine Learning for Sample Selection Models, Journal of Business & Economic Statistics, DOI: 10.1080/07350015.2023.2271071