simulateWitnessModel: Generates Synthetic CausalFX Problems


View source: R/synthetize.R


This function generates simple synthetic problems that can be used to test the methods in the CausalFX package. CausalFX problems are objects of class cfx. Each one specifies a causal inference task, namely estimating the effect of a given treatment X on a given outcome Y, together with a corresponding dataset. This function generates only binary data from a multinomial distribution.


simulateWitnessModel(p, q, par_max, M, no_sol = FALSE)



p: number of background variables (besides X and Y).

q: number of sink variables.

par_max: maximum number of parents in the background set.

M: sample size.

no_sol: if TRUE, then latent variables are parents of both X and Y, meaning no adjustment set will theoretically be found (barring sampling variability) if a method such as covsearch is applied.


The function first generates a directed acyclic graph with a given number of variables which have no latent common parents with treatment X and outcome Y, which we call "background variables". Conditioning on a subset of the background variables will block all measured confounding in this problem.

The function then generates a set of "sink" variables K which have one common latent parent with either X or Y, but are otherwise not adjacent to any observed variable. Conditioning on the sink variables will generate confounding paths between treatment and outcome.

Latent variables are a pool of independent variables with no parents. If no_sol is FALSE, they are parents of either X or Y but not both. If no_sol is TRUE, then all latent variables are parents of both X and Y and as such no adjustment set with observed variables will remove unmeasured confounding between treatment and outcome.

Remaining parents for observed variables are sampled uniformly at random from the pool of background variables obeying the constraint on the maximum number of parents given by par_max.
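The structural rules above can be illustrated with a small base R sketch. It is not the CausalFX source: the helper names sketch_latent_parents and sketch_background_parents are hypothetical, and the coin flip that sends a latent variable to X or Y, the use of an ordering to keep the graph acyclic, and the uniform draw of the parent count are all assumptions made for illustration.

```r
# Hypothetical helper (not part of CausalFX): decide which of X and Y each
# latent variable points into.
sketch_latent_parents <- function(n_latent, no_sol = FALSE) {
  if (no_sol) {
    # Every latent variable is a parent of both X and Y.
    to_X <- rep(TRUE, n_latent)
    to_Y <- rep(TRUE, n_latent)
  } else {
    # Each latent variable is a parent of exactly one of X or Y
    # (a fair coin flip is assumed here).
    to_X <- runif(n_latent) < 0.5
    to_Y <- !to_X
  }
  data.frame(latent = seq_len(n_latent), parent_of_X = to_X, parent_of_Y = to_Y)
}

# Hypothetical helper (not part of CausalFX): sample parents for p background
# variables, with at most par_max parents each.
sketch_background_parents <- function(p, par_max) {
  lapply(seq_len(p), function(i) {
    # Only earlier variables are eligible, which keeps the graph acyclic.
    pool <- seq_len(i - 1)
    # Number of parents drawn uniformly from 0..min(par_max, |pool|) (assumed).
    n_par <- sample.int(min(par_max, length(pool)) + 1, 1) - 1
    # Index via sample.int to avoid the length-one pitfall of sample().
    pool[sample.int(length(pool), n_par)]
  })
}

set.seed(1)
latent_solvable   <- sketch_latent_parents(5, no_sol = FALSE)
latent_unsolvable <- sketch_latent_parents(5, no_sol = TRUE)
background        <- sketch_background_parents(p = 6, par_max = 3)
```

With no_sol = FALSE, no row of latent_solvable has both flags set, so no latent variable confounds X and Y directly; with no_sol = TRUE every row does.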

Given a graph structure, each variable i is given a binary conditional distribution, defining the probability of i being equal to 1 given its parents in the graph. This conditional distribution is generated randomly by a logistic regression model with pairwise interactions, where coefficients are generated by samples from independent Gaussians with zero mean and standard deviation 10 / number of parents.
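As a rough illustration of this step, the base R sketch below builds a conditional probability table for one binary variable with k binary parents. It is not the package's implementation: the helper name sketch_cpt is hypothetical, and the presence of an intercept and of main effects alongside the pairwise interactions is an assumption. Variables without parents are not covered.

```r
# Hypothetical helper (not part of CausalFX): P(V = 1 | parents) for every
# configuration of k binary parents, using a logistic model with pairwise
# interactions and Gaussian coefficients with standard deviation 10 / k.
sketch_cpt <- function(k) {
  stopifnot(k >= 1)
  # All 2^k parent configurations, one per row.
  configs <- as.matrix(expand.grid(rep(list(0:1), k)))
  # Design matrix: intercept and main effects (assumed layout).
  design <- cbind(1, configs)
  if (k >= 2) {
    # One product column per unordered pair of parents.
    pairs <- combn(k, 2)
    inter <- apply(pairs, 2, function(ij) configs[, ij[1]] * configs[, ij[2]])
    design <- cbind(design, inter)
  }
  # Independent zero-mean Gaussian coefficients.
  coefs <- rnorm(ncol(design), mean = 0, sd = 10 / k)
  # The inverse logit maps the linear predictor to a probability.
  prob <- plogis(drop(design %*% coefs))
  cbind(configs, prob = prob)
}

set.seed(1)
cpt <- sketch_cpt(3)
# Draw one value of the variable for the first parent configuration.
value <- rbinom(1, size = 1, prob = cpt[1, "prob"])
```

Because the standard deviation shrinks as the number of parents grows, the linear predictor does not become more extreme simply because a variable has more parents.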


An object of class cfx.
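
A minimal usage sketch follows, assuming the CausalFX package is installed. The argument values are illustrative only, and the call is skipped when the package is not available.

```r
# Assumes CausalFX is installed; otherwise nothing is simulated.
has_pkg <- requireNamespace("CausalFX", quietly = TRUE)

if (has_pkg) {
  set.seed(1)

  # A problem in which some set of background variables is a valid adjustment set.
  problem <- CausalFX::simulateWitnessModel(p = 4, q = 4, par_max = 3, M = 1000)
  print(inherits(problem, "cfx"))

  # A problem with latent common parents of X and Y (no valid adjustment set).
  unsolvable <- CausalFX::simulateWitnessModel(p = 4, q = 4, par_max = 3,
                                               M = 1000, no_sol = TRUE)
}
```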

CausalFX documentation built on May 29, 2017, 6:34 p.m.