causalExp: Simulate a Causal Experiment


Description

simulate_correlation_matrix uses the C-vine method for simulating correlation matrices. (Refer to the referenced paper for details.)

simulate_causal_experiment simulates an RCT or observational data for causal effect estimation. It is mainly used to test different heterogeneous treatment effect estimation strategies.

Usage

simulate_correlation_matrix(dim, alpha)

simulate_causal_experiment(ntrain = nrow(given_features),
  ntest = nrow(given_features), dim = ncol(given_features),
  alpha = 0.1, feat_distribution = "normal", given_features = NULL,
  pscore = "rct5", mu0 = "sparseLinearStrong",
  tau = "sparseLinearWeak", testseed = NULL, trainseed = NULL)

pscores.simulate_causal_experiment

mu0.simulate_causal_experiment

Arguments

dim

Dimension of the data set.

alpha

Only used if given_features is not set and feat_distribution is "normal". It specifies how correlated the features can be. If alpha = 0, the features are independent; the larger alpha is, the more strongly correlated the features can be. Use the simulate_correlation_matrix function to get a better understanding of the impact of alpha.

ntrain

Number of training examples.

ntest

Number of test examples.

feat_distribution

Only used if given_features is not specified. Either "normal" or "unif". It specifies the distribution of the features.

given_features

This is used if we already have features and want to test the performance of different estimators for a particular set of features.

pscore, mu0, tau

Names of the setups that determine the propensity score, the response function for the control units, and the treatment effect function tau, respectively. The available options can be listed with names(pscores.simulate_causal_experiment), names(mu0.simulate_causal_experiment), and names(tau.simulate_causal_experiment). The options are exposed as named objects so that the user can easily loop through the different setups.

testseed

The seed used to generate the test data. If NULL, then the seed of the main session is used.

trainseed

The seed used to generate the training data. If NULL, then the seed of the main session is used.
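
To get a feel for alpha, one can simulate a few correlation matrices and compare their off-diagonal entries. The following is a minimal sketch, assuming causalToolbox is installed and that simulate_correlation_matrix can be called with the signature shown under Usage. The helper max_offdiag and the grid of alpha values are hypothetical and chosen only for illustration.

```r
# Largest absolute off-diagonal entry of a square matrix
# (hypothetical helper, not part of causalToolbox).
max_offdiag <- function(m) max(abs(m[upper.tri(m)]))

# Hypothetical grid of alpha values to compare.
alphas <- c(0.1, 1, 10)

# Only run the simulation if the package is available.
if (requireNamespace("causalToolbox", quietly = TRUE)) {
  set.seed(1)  # fixed only so that this sketch is repeatable
  for (a in alphas) {
    cor_mat <- causalToolbox::simulate_correlation_matrix(dim = 5, alpha = a)
    cat("alpha =", a, " max |off-diagonal| =", max_offdiag(cor_mat), "\n")
  }
}
```

According to the description of alpha above, the printed values should tend to grow with alpha, although any single draw is random.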

Details

The function simulates causal experiments by generating the features, treatment assignment, observed Y values, and CATE for a test set and a training set. pscore, mu0, and tau define the response functions and the propensity score. For example, pscore = "osSparse1Linear" specifies that

e(x) = max(0.05, min(.95, x1 / 2 + 1 / 4))

and mu0 = "sparseLinearWeak" specifies that the response function for the control units is given by the simple linear function

mu0(x) = 3 x1 + 5 x2.
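
The two formulas above can be written as plain R functions to see what they do on concrete inputs. This is only a sketch in base R based on the formulas as stated here, not the package's internal code; the names propensity, mu0_fun, and feat are hypothetical.

```r
# Propensity score from the Details section:
# e(x) = max(0.05, min(0.95, x1 / 2 + 1 / 4)), clipped element-wise.
propensity <- function(feat) pmax(0.05, pmin(0.95, feat[, 1] / 2 + 1 / 4))

# Control response from the Details section: mu0(x) = 3 x1 + 5 x2.
mu0_fun <- function(feat) 3 * feat[, 1] + 5 * feat[, 2]

# Hypothetical feature matrix with columns x1 and x2, one row per unit.
feat <- rbind(c(-2, 0), c(0.5, 1), c(3, -1))

propensity(feat)  # 0.05, 0.50, 0.95 (first and last rows are clipped)
mu0_fun(feat)     # -6, 6.5, 4
```

The clipping keeps every unit's treatment probability between 0.05 and 0.95, so both treated and control units can occur at any feature value.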

Value

simulate_correlation_matrix returns a correlation matrix.

simulate_causal_experiment returns a list with the following elements:

setup_name

Name of the setup.

m_t_truth

Function containing the response function of the treated units.

m_c_truth

Function containing the response function of the control units.

propscore

Propensity score function.

alpha

Chosen alpha.

feat_te

Data.frame containing the features of the test samples.

W_te

Numeric vector containing the treatment assignment of the test samples.

tau_te

Numeric vector containing the true conditional average treatment effects of the test samples.

Yobs_te

Numeric vector containing the observed Y values of the test samples.

feat_tr

Data.frame containing the features of the training samples.

W_tr

Numeric vector containing the treatment assignment of the training samples.

tau_tr

Numeric vector containing the true conditional average treatment effects of the training samples.

Yobs_tr

Numeric vector containing the observed Y values of the training samples.
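
As an illustration of how the returned elements fit together, the sketch below fits a constant-effect baseline on the training elements and scores it against tau_te. It assumes causalToolbox is installed and that W_tr codes treated units as 1 and control units as 0, which is not stated above. The helper cate_mse and the seed values are hypothetical.

```r
# Mean squared error between estimated and true CATE (hypothetical helper,
# same formula as the MSE used in the Examples section).
cate_mse <- function(estimate, truth) mean((estimate - truth)^2)

# Only run the simulation if the package is available.
if (requireNamespace("causalToolbox", quietly = TRUE)) {
  ce <- causalToolbox::simulate_causal_experiment(
    ntrain = 200, ntest = 200, dim = 7,
    trainseed = 1, testseed = 2  # arbitrary seeds for this sketch
  )

  # Assumption: W_tr is 1 for treated units and 0 for control units.
  ate_hat <- mean(ce$Yobs_tr[ce$W_tr == 1]) - mean(ce$Yobs_tr[ce$W_tr == 0])

  # Constant-effect baseline: predict the same effect for every test unit.
  baseline <- rep(ate_hat, length(ce$tau_te))
  print(cate_mse(baseline, ce$tau_te))
}
```

A CATE estimator that captures heterogeneity would be expected to improve on this baseline whenever the true effect varies with the features.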

References

See Also

X-Learner

Examples

library(causalToolbox)

ce_sim <- simulate_causal_experiment(
  ntrain = 20,
  ntest = 20,
  dim = 7
)

ce_sim

## Not run: 
estimators <- list(
  S_RF = S_RF, 
  T_RF = T_RF, 
  X_RF = X_RF, 
  S_BART = S_BART,
  T_BART = T_BART, 
  X_BART = X_BART)

performance <- data.frame()
for(tau_n in names(tau.simulate_causal_experiment)){
  for(mu0_n in names(mu0.simulate_causal_experiment)) {
    ce <- simulate_causal_experiment(
      given_features = iris,
      pscore = "rct5",
      mu0 = mu0_n,
      tau = tau_n)
    
    for(estimator_n in names(estimators)) {
      print(paste(tau_n, mu0_n, estimator_n))
    
      trained_e <- estimators[[estimator_n]](ce$feat_tr, ce$W_tr, ce$Yobs_tr)
      performance <- 
        rbind(performance, 
              data.frame(
                mu0 = mu0_n,
                tau = tau_n,
                estimator = estimator_n,
                MSE = mean((EstimateCate(trained_e, ce$feat_te) - 
                            ce$tau_te)^2)))
    }
  }
}

# One row per (mu0, tau) setup, one column of MSE values per estimator.
reshape2::dcast(data = performance, mu0 + tau ~ estimator, value.var = "MSE")

## End(Not run)

soerenkuenzel/causalToolbox documentation built on April 28, 2021, 5:19 a.m.