wpp: The Witness Protection Program for Causal Effect Estimation
In rbas2015/CausalFX: Methods for Estimating Causal Effects from Observational Data


Search for bounds on the average causal effect (ACE) of a given treatment variable X on a given outcome Y. The bounds are obtained by finding conditional instrumental variables under a faithfulness assumption that is relaxed to allow for a moderate degree of unfaithfulness. Candidate models are generated by the method described in covsearch.


wpp(problem, epsilons, max_set = 12, prior_ind = 0.5, prior_table = 10,
  cred_calc = TRUE, M = 1000, analytical_bounds = TRUE,
  pop_solve = FALSE, verbose = FALSE)

`problem`	a `cfx` problem instance for the ACE of a given treatment X on a given outcome Y.
`epsilons`	an array of six positions corresponding to the relaxation parameters. In order: (1) the maximum difference in the conditional probability of the outcome given everything else, as the witness changes levels; (2) the maximum difference in the conditional probability of the outcome given everything else, and the conditional distribution excluding latent variables for the witness set at 0; (3) the maximum difference in the conditional probability of the outcome given everything else, and the conditional distribution excluding latent variables for the witness set at 1; (4) the maximum difference in the conditional probability of the treatment given its causes, and the conditional distribution excluding latent variables; (5) the maximum ratio between the conditional distribution of the latent variable given the witness and the marginal distribution of the latent variable. This has to be greater than or equal to 1; (6) the minimum ratio between the conditional distribution of the latent variable given the witness and the marginal distribution of the latent variable. This has to be in the interval (0, 1].
`max_set`	maximum size of conditioning set. The cost of the procedure grows exponentially as a function of this, so be careful when increasing the default value.
`prior_ind`	prior probability of an independence.
`prior_table`	effective sample size hyperparameter of a Dirichlet prior for testing independence with contingency tables.
`cred_calc`	if `TRUE`, compute conditional credible intervals for the ACE of the highest scoring model.
`M`	number of Monte Carlo samples to use when (conditional) credible intervals are computed.
`analytical_bounds`	if `cred_calc` is `TRUE`, use the analytical method for computing bounds if this is also `TRUE`.
`pop_solve`	if `TRUE`, assume we know the population graph in `problem` instead of data. Notice that data is still used when computing posteriors over bounds.
`verbose`	if `TRUE`, print out more detailed information while running the procedure.
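To make the role of `prior_table` more concrete, here is a minimal base R sketch of a posterior expected contingency table under a Dirichlet prior. The counts are made up, and spreading the effective sample size evenly across cells is an assumption for illustration; the package may allocate the prior mass differently.

```r
# Hypothetical 2x2 table of observed counts (rows: X = 0/1, columns: Y = 0/1).
counts <- matrix(c(30, 10, 20, 40), nrow = 2, byrow = TRUE,
                 dimnames = list(X = c("0", "1"), Y = c("0", "1")))

# Assumption: the effective sample size is spread evenly over all cells.
prior_table <- 10
alpha <- matrix(prior_table / length(counts),
                nrow = nrow(counts), ncol = ncol(counts),
                dimnames = dimnames(counts))

# Posterior mean of the joint cell probabilities under the Dirichlet prior.
post_mean <- (counts + alpha) / (sum(counts) + prior_table)
print(round(post_mean, 3))
```

A larger `prior_table` pulls the estimated cell probabilities further toward the uniform table, which matters most when some cells have few observations.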

Each witness/admissible set pair found by covsearch generates a corresponding lower bound and upper bound. The bounds reported in `bounds` are computed from the posterior expected contingency table implied by `prior_table`, using a numerical method to optimize the bounds. Besides these point estimates, posterior distributions on the lower and upper bound for the highest scoring witness/admissible set can also be computed by setting the flag `cred_calc` to `TRUE`; they are reported in `bounds_post`. If the option `analytical_bounds` is set to `FALSE`, the posterior distribution calculation uses the numerical method instead of the analytical one. The numerical method gives tighter bounds, but at a much higher computational cost. Note that these posteriors are for the bounds conditional on the given choice of witness and admissible set: uncertainty about this choice is not taken into account.
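The Monte Carlo step can be pictured with a small base R sketch: draw `M` contingency tables from a Dirichlet posterior and map each draw to a (lower, upper) pair. As a stand-in for the WPP bounds, which are not reproduced here, the sketch uses the classical no-assumptions (Manski) bounds for binary X and Y. The counts and the even split of `prior_table` across cells are assumptions made for illustration.

```r
set.seed(1)

# Hypothetical cell counts for binary treatment X and outcome Y.
counts <- c(x0y0 = 30, x0y1 = 10, x1y0 = 20, x1y1 = 40)
prior_table <- 10   # effective sample size, split evenly (assumption)
M <- 1000           # number of Monte Carlo samples

alpha_post <- counts + prior_table / length(counts)

# Dirichlet draws via normalized Gamma variates: one row per sample.
g <- matrix(rgamma(M * length(alpha_post),
                   shape = rep(alpha_post, each = M)),
            nrow = M)
theta <- g / rowSums(g)
colnames(theta) <- names(counts)

# Stand-in bound function: no-assumptions bounds on the ACE,
# NOT the relaxed-faithfulness bounds computed by wpp.
bounds_sketch <- cbind(
  lower = -(theta[, "x0y1"] + theta[, "x1y0"]),
  upper = theta[, "x0y0"] + theta[, "x1y1"]
)
print(round(colMeans(bounds_sketch), 3))
```

The result has the same shape as the documented `bounds_post` field (one row per sample, two columns), although the no-assumptions bounds always have width 1 and are therefore much looser than what a witness/admissible set pair would typically yield.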

A complete explanation of the method is given by Silva and Evans (2014, "Causal inference through a witness protection program", Advances in Neural Information Processing Systems, 27, 298–306).

Note: messages about numerical problems when calling the bound optimizer are not uncommon and are accounted for within the procedure.

An object of class wpp containing copies of the inputs problem, epsilons, prior_ind, prior_table, analytical_bounds, plus the following fields:

`w_list`	a list of arrays/lists, where each `w_list$witness[i]` is a witness, each `w_list$Z[[i]]` is the corresponding admissible set, and each `w_list$witness_score[i]` is the corresponding score for the witness/admissible set.
`hw`	witness corresponding to the highest scoring pair.
`hZ`	array containing admissible set corresponding to the highest scoring pair.
`bounds`	a two-column matrix where each row corresponds to a different witness/admissible set combination, and the two columns correspond to an estimate of the lower bound and upper bound as given by the posterior expected value given an inferred causal structure.
`bounds_post`	a two-column matrix, where rows correspond to different Monte Carlo samples, and the two columns correspond to lower and upper bounds on the ACE as implied by `epsilons` with witness `hw` and admissible set `hZ`.
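One possible way to summarize a matrix shaped like `bounds_post` is sketched below in base R. The matrix is simulated as a hypothetical stand-in for `sol_wpp$bounds_post`, and the choice of quantiles is an assumption; it is not confirmed to match what `summary` reports for a wpp object.

```r
set.seed(42)
M <- 100

# Hypothetical stand-in for sol_wpp$bounds_post: M rows, lower <= upper.
lower <- rnorm(M, mean = -0.10, sd = 0.03)
bounds_post <- cbind(lower = lower,
                     upper = lower + abs(rnorm(M, mean = 0.35, sd = 0.03)))

# A conservative 95% summary: 2.5% quantile of the lower bound and
# 97.5% quantile of the upper bound.
interval <- c(lower = unname(quantile(bounds_post[, "lower"], 0.025)),
              upper = unname(quantile(bounds_post[, "upper"], 0.975)))
print(round(interval, 3))
```

Because the samples are conditional on the chosen witness `hw` and admissible set `hZ`, an interval built this way does not reflect uncertainty about that choice.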

http://papers.nips.cc/paper/5602-causal-inference-through-a-witness-protection-program

## Generate a synthetic problem
problem <- simulateWitnessModel(p = 4, q = 4, par_max = 3, M = 200)

## Calculate true effect for evaluation purposes
sol_pop <- covsearch(problem, pop_solve = TRUE)
effect_pop <- synthesizeCausalEffect(problem)
cat(sprintf("ACE (true) = %1.2f\n", effect_pop$effect_real))

## WPP search (with a small number of Monte Carlo samples)
epsilons <- c(0.2, 0.2, 0.2, 0.2, 0.95, 1.05)
sol_wpp <- wpp(problem, epsilons, M = 100)
summary(sol_wpp)