causal_XTY_multiple: Simulating a causal data set S = (X,Y_i, T, Y_{obs}) with...
In lamke07/stat545lamke07: Collection of functions to create quick data sets


View source: R/generate_causal.R

Creates a causal data set S = (X, Y_i, T, Y_{obs}) for causal inference. The p columns of X are sampled independently from Gaussian distributions, where column i has mean μ_i and standard deviation σ_i, i.e. N(μ_i, σ_i^2). A treatment T is then sampled; more than two treatments are possible. The potential outcome Y_i is the outcome obtained if treatment i is applied. The base outcome Y = X^T β is assumed to depend linearly on X, and the effect of treatment T = i is added to it. See Causality (Pearl 2009) for further details and a general introduction to causal inference.
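Written out, the model described above might look as follows. This is a reading of the description, not a statement taken from the package source: the symbol τ_i (standing for the i-th entry of `treatment_effect`), the number of treatments k, the absence of a noise term, and the identification of Y_{obs} with the potential outcome of the assigned treatment are all assumptions. The column index is written j here to keep it apart from the treatment index i.

```latex
\begin{aligned}
X_j &\sim \mathcal{N}(\mu_j, \sigma_j^2), \quad j = 1, \dots, p \\
Y_i &= X^\top \beta + \tau_i, \quad i = 1, \dots, k \\
Y_{obs} &= Y_T
\end{aligned}
```

Under this reading, with the default arguments, a unit with covariates (x_1, x_2, x_3) would have potential outcome Y_2 = x_1 + 2 x_2 + 3 x_3 + 20 under treatment 2.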

causal_XTY_multiple(
  n = 100,
  mu = rep(0, 3),
  sigma = rep(1, 3),
  beta_coefficients = 1:3,
  treatment_prob = rep(0.25, 4),
  treatment_effect = c(10, 20, 30, 40)
)

`n`	desired number of data points in the data set.
`mu`	a p-dimensional vector of means μ, one per column of X.
`sigma`	a p-dimensional vector of non-negative standard deviations σ, one per column of X.
`beta_coefficients`	a p-dimensional vector of coefficients β.
`treatment_prob`	a probability vector with weights summing to 1, giving the probability of each treatment.
`treatment_effect`	a vector with one entry per treatment, giving the additive effect of that treatment on the outcome Y.

A causal data set S = (X, Y_i, T, Y_{obs}) with multiple potential outcomes. In the default case, n = 100 and p = 3, where p is the number of columns of X. Each column of X is sampled from N(0, 1), the base outcome Y uses the beta coefficients 1, 2, 3, and the four treatments are equally likely (probability 0.25 each).
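The generative process described on this page can be sketched in a few lines of base R. This is a minimal illustration, not the package's implementation: the function name `simulate_causal_sketch`, the list structure it returns, the noise-free outcome, and the definition of the observed outcome as the potential outcome of the sampled treatment are all assumptions made for the example. Only the argument names and defaults are taken from the usage above.

```r
# Hypothetical sketch; NOT the source of causal_XTY_multiple().
# Assumptions: no noise term, and Y_obs is the potential outcome of the
# treatment that was actually sampled.
simulate_causal_sketch <- function(n = 100, mu = rep(0, 3), sigma = rep(1, 3),
                                   beta_coefficients = 1:3,
                                   treatment_prob = rep(0.25, 4),
                                   treatment_effect = c(10, 20, 30, 40)) {
  p <- length(mu)
  k <- length(treatment_effect)
  stopifnot(length(sigma) == p,
            length(beta_coefficients) == p,
            length(treatment_prob) == k)

  # Covariates: column j is drawn independently from N(mu[j], sigma[j]^2).
  # The matrix is filled column by column, so rep(..., each = n) lines up
  # each mean and standard deviation with its column.
  X <- matrix(rnorm(n * p, mean = rep(mu, each = n), sd = rep(sigma, each = n)),
              nrow = n, ncol = p)

  # Base outcome, linear in X
  y_base <- as.vector(X %*% beta_coefficients)

  # Potential outcomes: column i holds the outcome under treatment i
  Y <- outer(y_base, treatment_effect, "+")

  # Sampled treatment, one per row
  treatment <- sample(seq_len(k), size = n, replace = TRUE,
                      prob = treatment_prob)

  # Observed outcome: pick, for each row, the column of its treatment
  y_obs <- Y[cbind(seq_len(n), treatment)]

  list(X = X, Y = Y, treatment = treatment, y_obs = y_obs)
}

set.seed(1)
s <- simulate_causal_sketch(n = 5)
str(s)
```

In this sketch the difference between any two columns of `Y` is constant across rows, which is what an additive treatment effect means: the true effect of switching from treatment 1 to treatment 2 is `treatment_effect[2] - treatment_effect[1]` for every unit.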

causal_XTY_multiple()

causal_XTY_multiple(n = 40, mu = rep(2, 7), sigma = 1:7,
                    beta_coefficients = 1:7,
                    treatment_prob = c(0.4, 0.1, 0.1, 0.2, 0.2),
                    treatment_effect = 1:5)