causalSamp | R Documentation |
Obtain samples from a causal model using the rejection sampling approach of Evans and Didelez (2024).
causalSamp(
n,
formulas = list(list(z ~ 1), list(x ~ z), list(y ~ x), list(~1)),
pars,
family,
link = NULL,
dat = NULL,
method = "rejection",
control = list(),
seed
)
n | number of samples required
formulas | list of lists of formulas
pars | list of lists of parameters
family | families for Z, X, Y and copula
link | list of link functions
dat | data frame of covariates
method | only "rejection" is valid
control | list of options for the algorithm
seed | random seed used for replication
Samples from a given causal model using rejection sampling (or, if everything is discrete, direct sampling).
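For readers unfamiliar with rejection sampling, the following is a minimal, generic accept/reject sketch in base R. It is a textbook illustration only and is not the algorithm of Evans and Didelez (2024) or the internals of causalSamp; the target density (a Beta(2, 5)) and the batch size are arbitrary choices made for this example.

```r
set.seed(1)
n <- 200                                   # number of samples wanted

# Target density to sample from (illustrative choice: Beta(2, 5))
target <- function(x) dbeta(x, 2, 5)

# Upper bound on the target density; the mode of Beta(2, 5) is 0.2
M <- target(0.2)

out <- numeric(0)
while (length(out) < n) {
  cand <- runif(10 * n)                    # proposals from Uniform(0, 1)
  keep <- runif(length(cand)) < target(cand) / M   # accept with prob. target / M
  out <- c(out, cand[keep])
}
out <- out[seq_len(n)]                     # keep exactly n accepted draws
```

Proposals are drawn in batches larger than n because only a fraction of them are accepted.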
The entries for formulas and family should each be a list with four entries, corresponding to Z, X, Y and the copula. formulas determines the model, so it is crucial that every variable to be simulated is represented there exactly once. Each entry of that list can be either a single formula or a list of formulae. Each corresponding entry in family should be the same length as the matching list in formulas, or of length 1 (in which case it is repeated for all the variables therein).
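As an illustration of the structure described above, the sketch below builds formulas and family lists with two covariates in the Z slot and checks the two stated requirements in base R. The variable names z1, z2, x and y are hypothetical, and the checks are written for this example only; they are not part of causalSamp.

```r
# Four entries: Z, X, Y and the copula (variable names are hypothetical)
formulas <- list(
  list(z1 ~ 1, z2 ~ z1),   # Z: two covariates
  list(x ~ z1 + z2),       # X: treatment
  list(y ~ x),             # Y: outcome
  list(~ 1)                # copula
)

# One family entry per slot: same length as the formula list, or length 1
family <- list(c(1, 5), 5, 1, 1)

# Each simulated variable should appear exactly once on a left-hand side
lhs <- unlist(lapply(formulas[1:3], function(fs)
  vapply(fs, function(f) all.vars(f[[2]]), character(1))))
no_duplicates <- !anyDuplicated(lhs)

# Family lengths should match the formula lists, or be 1
n_form <- lengths(formulas)
n_fam <- lengths(family)
lengths_ok <- n_fam == n_form | n_fam == 1
```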
We use the following codes for different families of distributions: 0 or 5 = binary; 1 = normal; 2 = t-distribution; 3 = gamma; 4 = beta; 6 = log-normal.
The family variables for the copula are also numeric and taken from VineCopula. Use, for example, 1 for Gaussian, 2 for t, 3 for Clayton, 4 for Gumbel, 5 for Frank, 6 for Joe and 11 for FGM copulas.
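Because the numeric codes are easy to mix up, one option is to keep them in named vectors and build the family argument by name. The sketch below uses only the codes listed above; the vector names family_codes and copula_codes are hypothetical helpers for this example, not objects provided by the package.

```r
# Codes for the Z, X and Y families, as listed above (binary may be 0 or 5)
family_codes <- c(binary = 5, normal = 1, t = 2, gamma = 3,
                  beta = 4, lognormal = 6)

# Codes for the copula family, as listed above
copula_codes <- c(gaussian = 1, t = 2, clayton = 3, gumbel = 4,
                  frank = 5, joe = 6, fgm = 11)

# Normal Z, binary X, normal Y, Clayton copula
family <- list(family_codes[["normal"]],
               family_codes[["binary"]],
               family_codes[["normal"]],
               copula_codes[["clayton"]])
```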
pars should be a named list containing either entries z, x, y and cop, or variable names that correspond to the LHS of formulae in formulas. Each of these should itself be a list containing beta (a vector of regression parameters) and (possibly) phi, a dispersion parameter. For any discrete variable that is a treatment, you can also specify p, an initial proportion to simulate from (otherwise this defaults to 0.5).
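A sketch of a pars list for a discrete treatment is given below, using the default model structure from the usage. It assumes that x is coded as binary in family, and that the length of each beta matches the number of terms (including the intercept) in the corresponding formula; the latter is inferred from the example on this page rather than stated explicitly, and the check is written for this illustration only.

```r
# Default model structure: Z, X, Y and copula
formulas <- list(z ~ 1, x ~ z, y ~ x, ~ 1)

pars <- list(
  z   = list(beta = 0, phi = 1),
  x   = list(beta = c(0, 0.5), p = 0.3),   # discrete treatment: initial proportion p
  y   = list(beta = c(0, 0.5), phi = 0.5),
  cop = list(beta = 1)
)

# Number of regression terms implied by each formula (intercept included)
n_coef <- vapply(formulas, function(f) {
  tt <- terms(f)
  length(attr(tt, "term.labels")) + attr(tt, "intercept")
}, numeric(1))

# Length of each beta vector supplied in pars
n_beta <- vapply(pars, function(p) length(p$beta), integer(1))

beta_ok <- all(n_coef == n_beta)
```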
Link functions for the Gaussian, t and Gamma distributions can be the identity, inverse or log functions. Gaussian and t-distributions default to the identity, and Gamma to the log link. For the Bernoulli the logit and probit links are available.
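To show what the choice of link changes, the sketch below maps the same linear predictor to a mean under the log, logit and probit links, using make.link from base R's stats package. It assumes the usual generalized linear model convention that the link relates the mean to a linear predictor built from beta; it does not show the format expected by the link argument.

```r
eta <- c(-1, 0, 0.5)                 # example linear predictor values

log_link <- make.link("log")         # default for the Gamma family
mu_gamma <- log_link$linkinv(eta)    # mean = exp(eta)

logit_link <- make.link("logit")     # Bernoulli option
p_logit <- logit_link$linkinv(eta)   # probability = plogis(eta)

probit_link <- make.link("probit")   # Bernoulli option
p_probit <- probit_link$linkinv(eta) # probability = pnorm(eta)
```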
Control parameters are oversamp (default value 10), trace (default value 0; increasing it to 1 increases the verbosity of output), max_oversamp (default value 1000), warn (which currently does nothing) and max_wt, which is set to 1 and increases each time the function is recalled. Control parameters also include cop, which gives a keyword for the copula and defaults to "cop".
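A control list can be assembled as an ordinary named list, as in the sketch below. The option names are those listed above; the values are illustrative, and the call is guarded so that the snippet only invokes causalSamp when the function is actually available in the session.

```r
# Options named above; trace = 1 requests more verbose output
ctrl <- list(oversamp = 10, trace = 1, max_oversamp = 1000)

pars <- list(z = list(beta = 0, phi = 1),
             x = list(beta = c(0, 0.5), phi = 1),
             y = list(beta = c(0, 0.5), phi = 0.5),
             cop = list(beta = 1))

# Only run the sampler if causalSamp has been loaded
if (exists("causalSamp", mode = "function")) {
  dat <- causalSamp(100, pars = pars, control = ctrl, seed = 123)
}
```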
This function is kept largely for the replication of simulations from Evans and Didelez (2024).
A data frame containing the simulated data.
Evans, R.J. and Didelez, V. Parameterizing and simulating from causal models (with discussion). Journal of the Royal Statistical Society, Series B, 2024.
pars <- list(z = list(beta = 0, phi = 1),
             x = list(beta = c(0, 0.5), phi = 1),
             y = list(beta = c(0, 0.5), phi = 0.5),
             cop = list(beta = 1))
causalSamp(100, pars = pars)