rfrugal: Sample from a causal model

rfrugal    R Documentation

Sample from a causal model

Description

Obtain samples from a causal model parameterized as in Evans and Didelez (2024).

Usage

rfrugal(n, causl_model, control = list())

rfrugalParam(
  n,
  formulas = list(list(z ~ 1), list(x ~ z), list(y ~ x), list(~1)),
  family = c(1, 1, 1, 1),
  pars,
  link = NULL,
  dat = NULL,
  method = "inversion",
  control = list(),
  ...
)

Arguments

n

number of samples required

causl_model

object of class causl_model

control

list of options for the algorithm

formulas

list of lists of formulas

family

families for variables and copula

pars

list of lists of parameters

link

list of link functions

dat

optional data frame of covariates

method

either "inversion" (the default), "inversion_mv", or "rejection"

...

other arguments, such as custom families

estimand

quantity to control, default is "ATE"

Details

Samples from a given causal model under the frugal parameterization.

The formulas and family arguments should each be a list with four entries, corresponding to Z, X, Y and the copula. formulas determines the model, so it is crucial that every variable to be simulated appears there exactly once. Each entry of formulas can be either a single formula or a list of formulae. The corresponding entry in family should either have the same length as that entry of formulas, or have length 1 (in which case it is repeated for all the variables therein).
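As a rough illustration of the length rule above, the following base R sketch checks that each family entry has either length 1 or the same length as the matching entry of formulas. The helper check_family_lengths and the variable names z1 and z2 are hypothetical and not part of causl.

```r
# Hypothetical helper (not part of causl): checks the length rule for
# the formulas and family arguments.
check_family_lengths <- function(formulas, family) {
  stopifnot(length(formulas) == 4, length(family) == 4)
  ok <- logical(4)
  for (i in seq_len(4)) {
    # a single formula counts as one variable; a list counts its formulae
    n_form <- if (inherits(formulas[[i]], "formula")) 1L else length(formulas[[i]])
    n_fam <- length(family[[i]])
    ok[i] <- n_fam == 1L || n_fam == n_form
  }
  ok
}

# Two covariates (z1, z2), one treatment, one outcome, and the copula
forms <- list(list(z1 ~ 1, z2 ~ 1), x ~ z1 + z2, y ~ x, ~ 1)
fams  <- list(c(1, 5), 1, 1, 1)   # z1 gaussian (1), z2 binomial (5)
length_ok <- check_family_lengths(forms, fams)
```

Here the first family entry has length 2 to match the two Z formulas, while the remaining entries have length 1.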

We use the following codes for different families of distributions:

val  family
  0  binomial
  1  gaussian
  2  t
  3  Gamma
  4  beta
  5  binomial
  6  lognormal
 10  categorical
 11  ordinal

The family variables for the copula are also numeric and taken from VineCopula. Use, for example, 1 for Gaussian, 2 for t, 3 for Clayton, 4 for Gumbel, 5 for Frank, 6 for Joe and 11 for FGM copulas.

pars should be a named list whose names correspond to the variables on the LHS of the formulae in formulas. Each entry should itself be a list containing beta (a vector of regression parameters) and (possibly) phi, a dispersion parameter. For any discrete variable that is a treatment, you can also specify p, an initial proportion to simulate from (otherwise this defaults to 0.5).
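A minimal base R sketch of this naming rule is given here, using the default formulas from Usage and the pars list from the Examples. The helper lhs_names is hypothetical and not part of causl. The final check assumes that beta needs one entry per column of the design matrix, which this page does not state explicitly.

```r
# Hypothetical helper (not part of causl): collect the left-hand-side
# variable names from a nested list of formulas.
lhs_names <- function(formulas) {
  out <- character(0)
  for (f in formulas) {
    if (inherits(f, "formula")) {
      # two-sided formulas have length 3; one-sided ones (e.g. ~ 1) have no LHS
      if (length(f) == 3L) out <- c(out, all.vars(f[[2]]))
    } else {
      out <- c(out, lhs_names(f))
    }
  }
  out
}

forms <- list(list(z ~ 1), list(x ~ z), list(y ~ x), list(~ 1))
pars <- list(z = list(beta = 0, phi = 1),
             x = list(beta = c(0, 0.5), phi = 1),
             y = list(beta = c(0, 0.5), phi = 0.5),
             cop = list(beta = 1))

vars <- lhs_names(forms)                       # variables that need an entry
missing_pars <- setdiff(vars, names(pars))     # empty if pars is complete

# Assumption: beta for x has one entry per design-matrix column of x ~ z
n_beta_x <- ncol(model.matrix(~ z, data = data.frame(z = 0)))
```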

Link functions for the Gaussian, t and Gamma distributions can be the identity, inverse or log functions. Gaussian and t-distributions default to the identity, and Gamma to the log link. For the Bernoulli the logit, probit, and log links are available.
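To make the role of the link concrete, the sketch below maps one linear predictor to a mean (or probability) under each of the links named above. It uses make.link() from base R's stats package purely for illustration; whether causl uses that function internally is not stated here, and the values of beta and x are arbitrary.

```r
# Linear predictor for a model such as y ~ x with beta = c(0, 0.5), at x = 2
beta <- c(0, 0.5)
eta <- sum(beta * c(1, 2))            # 0 + 0.5 * 2 = 1

# stats::make.link() returns a link together with its inverse (linkinv)
mu_identity <- make.link("identity")$linkinv(eta)  # mean equals eta
mu_log      <- make.link("log")$linkinv(eta)       # mean equals exp(eta)
mu_inverse  <- make.link("inverse")$linkinv(eta)   # mean equals 1 / eta
p_logit     <- make.link("logit")$linkinv(eta)     # probability plogis(eta)
p_probit    <- make.link("probit")$linkinv(eta)    # probability pnorm(eta)
```

The same beta therefore implies quite different means depending on the link, which is why the link should be chosen before fixing the regression parameters in pars.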

A variety of sampling methods are implemented. The inversion method with pair-copulas is the default (method="inversion"), but we can also use a multivariate copula (method="inversion_mv") or even rejection sampling (method="rejection"). Note that the inversion_mv method simulates the entire copula at once, so the copula cannot depend upon intermediate variables.
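The general idea behind inversion sampling with a copula can be sketched in base R as follows. This is a generic two-variable illustration with a Gaussian copula, not the algorithm implemented in causl, which also handles the treatment and the frugal parameterization. The correlation rho and the Gamma margin are arbitrary choices.

```r
set.seed(1)
n <- 1000
rho <- 0.5   # Gaussian copula correlation (illustrative value)

# Step 1: draw from a bivariate Gaussian copula via correlated normals
e1 <- rnorm(n)
e2 <- rho * e1 + sqrt(1 - rho^2) * rnorm(n)
u1 <- pnorm(e1)   # uniform on (0, 1)
u2 <- pnorm(e2)   # uniform on (0, 1), dependent on u1

# Step 2: invert the marginal distribution functions (quantile transform)
z <- qnorm(u1, mean = 0, sd = 1)        # Gaussian margin
y <- qgamma(u2, shape = 2, rate = 1)    # Gamma margin

dat <- data.frame(z = z, y = y)
```

The dependence between z and y comes entirely from the copula draw in step 1, while step 2 fixes the margins.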

The only control parameters are: cop, a keyword for the copula that defaults to "cop"; quiet, which defaults to FALSE but reduces output if set to TRUE; and (if rejection sampling is selected) careful, a logical that enables the full rejection sampling method, which yields exact samples. The full method is generally very slow, especially if there is an outlying value, so careful defaults to FALSE.
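A control list might be passed as in the sketch below, which reuses the pars list from the Examples and the default formulas and families. The call is guarded so that it only runs when causl is installed. Only quiet is set here; careful would only matter with method = "rejection".

```r
# Suppress progress output; other options keep their defaults
ctrl <- list(quiet = TRUE)

pars <- list(z = list(beta = 0, phi = 1),
             x = list(beta = c(0, 0.5), phi = 1),
             y = list(beta = c(0, 0.5), phi = 0.5),
             cop = list(beta = 1))

# Only run the simulation if the causl package is available
if (requireNamespace("causl", quietly = TRUE)) {
  dat <- causl::rfrugalParam(100, pars = pars, control = ctrl)
}
```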

Value

A data frame containing the simulated data.

Functions

  • rfrugalParam(): old function for simulation

Examples

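library(causl)

# z, x and y use the default formulas and are Gaussian (family code 1);
# cop holds the parameter for the Gaussian copula
pars <- list(z=list(beta=0, phi=1),
             x=list(beta=c(0,0.5), phi=1),
             y=list(beta=c(0,0.5), phi=0.5),
             cop=list(beta=1))
rfrugalParam(100, pars = pars)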



rje42/causl documentation built on June 1, 2025, 2:50 p.m.