omitted_var_dgp | R Documentation |
Takes a data-generating process (DGP) and induces bias due to omitted variable(s). In other words, this function generates a design matrix X and response vector y according to the supplied DGP function, but returns a partial design matrix from which some variable/feature columns have been omitted.
omitted_var_dgp(dgp_fun, omitted_vars = 1, ...)
dgp_fun |
A function that generates data according to some known
data-generating process. This function should return an object of the same
format as the output of |
omitted_vars |
A vector of indices or column names corresponding to columns in X that should be omitted. |
... |
Additional arguments to pass to |
The returned object has the same format as the output of dgp_fun(), except that the variables specified by omitted_vars have been omitted from the X component and the support (if applicable).
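The omission described above can be illustrated with a minimal, self-contained sketch. It does not use the package's own code: `toy_dgp()` and `omit_vars_sketch()` are hypothetical names introduced only for illustration, and the list layout (`X`, `y`, `support`) and the re-indexing of `support` to the reduced column positions are assumptions, not documented behavior of `omitted_var_dgp()`.

```r
# Hypothetical toy DGP (not part of the package): returns a list with a
# design matrix X, a response y, and the indices of the support columns.
toy_dgp <- function(n, p, s) {
  X <- matrix(rnorm(n * p), nrow = n, ncol = p,
              dimnames = list(NULL, paste0("X", seq_len(p))))
  support <- seq_len(s)  # assumption: the first s columns drive y
  y <- drop(X[, support, drop = FALSE] %*% rep(1, s)) + rnorm(n)
  list(X = X, y = y, support = support)
}

# Hypothetical helper sketching the omission step: y is generated from the
# full X, then the requested columns are dropped from X and from the support.
omit_vars_sketch <- function(dgp_fun, omitted_vars = 1, ...) {
  out <- dgp_fun(...)
  # Accept either column names or column indices.
  if (is.character(omitted_vars)) {
    omitted_idx <- match(omitted_vars, colnames(out$X))
  } else {
    omitted_idx <- omitted_vars
  }
  keep <- setdiff(seq_len(ncol(out$X)), omitted_idx)
  if (!is.null(out$support)) {
    # Assumption: surviving support indices are re-expressed as positions
    # in the reduced design matrix.
    out$support <- which(keep %in% out$support)
  }
  out$X <- out$X[, keep, drop = FALSE]
  out
}

set.seed(1)
res <- omit_vars_sketch(toy_dgp, omitted_vars = 1, n = 100, p = 10, s = 2)
dim(res$X)    # 100 rows, 9 columns
res$support   # 1: the remaining support variable, X2, is now column 1
```

Because y is still generated from the full design matrix, a model fitted to the returned X is missing a true predictor, which is the source of the omitted-variable bias.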
# generate data from a linear gaussian DGP with the first variable missing
dgp_out <- omitted_var_dgp(dgp_fun = linear_gaussian_dgp,
                           n = 100, p_obs = 10, s_obs = 2,
                           omitted_vars = 1)

# or equivalently (apart from the difference in column names)
dgp_out <- linear_gaussian_dgp(n = 100, p_obs = 9, p_unobs = 1,
                               s_obs = 1, s_unobs = 1)