sgf: Sequential g-formula for continuous multiple time point...

View source: R/sgf.R

sgf R Documentation

Sequential g-formula for continuous multiple time point interventions

Description

Estimation of counterfactual outcomes for multiple values of continuous interventions at different time points using the sequential (weighted) g-formula.

Usage

sgf(X, Anodes, Ynodes, Lnodes = NULL, Cnodes = NULL,
    abar = NULL, survivalY = FALSE, 
    SL.library = "SL.glm", SL.export = NULL,
    Yweights = NULL, calc.support = FALSE, B = 0,
    ncores = 1, verbose = TRUE, seed = NULL, prog = NULL, 
    cilevel = 0.95, ...)

Arguments

X

A data frame whose columns follow the time-ordering of the variables.

Anodes

A character vector of column names in X of the intervention variable(s).

Ynodes

A character vector of column names in X of the outcome variable(s).

Lnodes

A character vector of column names in X of the time-dependent variable(s), i.e. those measured after the first treatment.

Cnodes

A character vector of column names in X of the censoring variable(s).

abar

Numeric vector or matrix of intervention values. See Details.

survivalY

Logical. If TRUE, then Y nodes are indicators of an event.

SL.library

Either a character vector of prediction algorithms or a list containing character vectors. See Details.

SL.export

A character vector naming user-written learning and screening algorithms that are not part of SuperLearner, but are part of the learning library. Only required if ncores>1. See Details.

Yweights

A list of the same length as Ynodes, typically generated with calc.weights.

calc.support

Logical. If TRUE, both crude and conditional support are estimated.

B

An integer specifying the number of bootstrap samples to be used, if any.

ncores

An integer giving the number of threads/cores to be used. If greater than 1, computations are parallelized.

verbose

Logical. If TRUE, notes and warnings are printed.

seed

An integer specifying the seed to be used to create reproducible results for parallel computing (i.e. when ncores>1).

prog

A character string specifying a path where progress should be saved (typically useful when ncores>1).

cilevel

Numeric value between 0 and 1 specifying the confidence level. Defaults to 0.95, i.e. 95% confidence intervals.

...

Further arguments to be passed on.

Details

The function calculates the expected counterfactual outcomes (specified under Ynodes) under the intervention abar.

If abar is a vector, then each component is used as the intervention value at every time point; that is, each component defines an intervention that is constant over time. If abar is a matrix (of size 'number of interventions' x 'number of time points'), then each row, which has the same length as Anodes, defines one particular time-varying intervention strategy.
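The two forms of abar can be sketched in base R as follows. This is only an illustration: it assumes five intervention nodes (as in the Examples section), and the object names abar_constant and abar_varying as well as the intervention values are made up.

```r
# Assumed setting: five intervention nodes, one per time point
n_timepoints <- 5

# Vector form: each value defines one intervention held constant over time,
# here six strategies (always 0, always 1, ..., always 5)
abar_constant <- seq(0, 5, 1)

# Matrix form: one row per strategy, one column per intervention node
abar_varying <- rbind(
  c(0, 1, 2, 3, 4),  # strategy 1: increasing over time
  c(4, 3, 2, 1, 0),  # strategy 2: decreasing over time
  c(2, 2, 2, 2, 2)   # strategy 3: held constant at 2
)

# Each row must supply exactly one value per intervention node
stopifnot(ncol(abar_varying) == n_timepoints)
```

Either object would then be passed as the abar argument of sgf.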

The nested iterated outcome models are fitted using super learning. The specified prediction algorithms (possibly coupled with algorithms for prior variable screening) are passed on to the SuperLearner package. See ?SuperLearner for examples of permitted structures. Note: if parallelization is used, user-written prediction algorithms, their corresponding S3 prediction functions, and user-written screening algorithms need to be specified under SL.export.
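As a rough sketch of what such a library specification might look like: SL.glm, SL.mean, screen.corP and "All" are wrappers that ship with SuperLearner, whereas SL.mymean, its predict method, and the object names sl_simple, sl_screened and sl_export are hypothetical names introduced here for illustration.

```r
# Character vector: each learner is fitted on all covariates
sl_simple <- c("SL.glm", "SL.mean")

# List form: each element pairs a learner with a screening algorithm
sl_screened <- list(
  c("SL.glm", "screen.corP"),  # GLM after correlation-based screening
  c("SL.mean", "All")          # marginal mean, no screening
)

# Hypothetical user-written learner following the SuperLearner wrapper
# convention: return a list with predictions and a classed fit object
SL.mymean <- function(Y, X, newX, family, obsWeights, ...) {
  mean_y <- weighted.mean(Y, w = obsWeights)
  fit <- list(object = mean_y)
  class(fit) <- "SL.mymean"
  list(pred = rep(mean_y, nrow(newX)), fit = fit)
}

# Matching S3 prediction method for the hypothetical learner
predict.SL.mymean <- function(object, newdata, ...) {
  rep(object$object, nrow(newdata))
}

# Names that would be listed under SL.export when ncores > 1
sl_export <- c("SL.mymean", "predict.SL.mymean")
```

With parallelization, a call would then combine something like SL.library = c("SL.glm", "SL.mymean") with SL.export = sl_export.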

For survival settings, it is required that (i) survivalY=TRUE and (ii) once a Cnode or Ynode equals 1, every variable thereafter is set to NA. See the package manual for an example. The package intervenes on Cnodes, i.e. it calculates counterfactual outcomes under no censoring.
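The required data structure for survival settings can be illustrated with a small base R sketch. The helper na_after_event and the toy data frame (including its column names and ordering) are hypothetical and not part of the package; the sketch only shows the rule that everything after an indicator equal to 1 is NA.

```r
# Hypothetical helper: once an event or censoring indicator equals 1,
# set all later columns (in time order) to NA for that row
na_after_event <- function(X, indicator_nodes) {
  stopifnot(all(indicator_nodes %in% names(X)))
  for (node in indicator_nodes) {
    pos <- match(node, names(X))
    if (pos == ncol(X)) next                      # nothing comes after the last column
    later <- seq(pos + 1, ncol(X))
    hit <- !is.na(X[[node]]) & X[[node]] == 1     # rows where the indicator fired
    X[hit, later] <- NA
  }
  X
}

# Toy data in time order: treatment, censoring, outcome at two time points
toy <- data.frame(
  A.0 = c(1.0, 2.0, 1.5),
  C.0 = c(0, 1, 0),
  Y.0 = c(0, 0, 1),
  A.1 = c(1.2, 2.1, 1.4),
  C.1 = c(0, 0, 0),
  Y.1 = c(1, 0, 0)
)

toy_clean <- na_after_event(toy, indicator_nodes = c("C.0", "Y.0", "C.1", "Y.1"))
toy_clean
```

In the toy data, the second row is censored at C.0 and the third row has an event at Y.0, so all later entries in those rows become NA, while the first row stays complete.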

If calc.support=TRUE, conditional and crude support measures (i.e., diagnostics) are calculated as described in Section 3.3.2 of Schomaker et al. (2024).

To parallelize computations automatically, it is sufficient to set ncores>1, as appropriate. No further customization or setup is needed; everything is handled by the package. To make estimates under parallelization reproducible, use the seed argument. To monitor the progress of parallelized computations, set a path in the prog argument; a text file then reports on the progress, which is particularly useful if lengthy bootstrapping computations are required.
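A parallelized, reproducible call might be set up along the following lines. This is a sketch under assumptions: the values of ncores, seed and B are arbitrary, passing a directory such as tempdir() to prog is an assumption about what kind of path is expected, and the run_example flag is introduced here only so that the lengthy computation is not started unintentionally.

```r
# Settings for a parallelized, reproducible run (values are illustrative)
parallel_args <- list(
  ncores = 2,         # more than one core switches on parallelization
  seed   = 1234,      # makes estimates under parallelization reproducible
  B      = 10,        # small number of bootstrap samples, for illustration
  prog   = tempdir()  # assumption: a directory where the progress file is written
)

# Set to TRUE to actually run; requires the CICI package and may take a while
run_example <- FALSE

if (run_example && requireNamespace("CICI", quietly = TRUE)) {
  library(CICI)
  data(EFV)
  est <- do.call(sgf, c(
    list(X = EFV,
         Lnodes = c("adherence.1", "weight.1", "adherence.2", "weight.2",
                    "adherence.3", "weight.3", "adherence.4", "weight.4"),
         Ynodes = c("VL.0", "VL.1", "VL.2", "VL.3", "VL.4"),
         Anodes = c("efv.0", "efv.1", "efv.2", "efv.3", "efv.4"),
         abar   = seq(0, 5, 1)),
    parallel_args
  ))
  print(est)
}
```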

Value

Returns an object of class ‘gformula’:

results

matrix of results

diagnostics

list of diagnostics and weights based on the estimated support (if calc.support=TRUE)

SL.weights

matrix of average super learner weights, at each time point

boot.results

matrix of bootstrap results

setup

list of chosen setup parameters

Author(s)

Michael Schomaker

References

Schomaker M, McIlleron H, Denti P, Diaz I. (2024). Causal Inference for Continuous Multiple Time Point Interventions. Statistics in Medicine, 43:5380-5400. See also https://arxiv.org/abs/2305.06645.

See Also

See gformula for parametric g-computation and calc.weights for generating outcome weights.

Examples



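library(CICI)

# Load the EFV example data
data(EFV)

# Counterfactual outcomes under interventions held constant at 0, 1, ..., 5
est <- sgf(X = EFV,
           Lnodes = c("adherence.1", "weight.1",
                      "adherence.2", "weight.2",
                      "adherence.3", "weight.3",
                      "adherence.4", "weight.4"),
           Ynodes = c("VL.0", "VL.1", "VL.2", "VL.3", "VL.4"),
           Anodes = c("efv.0", "efv.1", "efv.2", "efv.3", "efv.4"),
           abar   = seq(0, 5, 1))

# Print the estimated counterfactual outcomes
est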



CICI documentation built on April 7, 2026, 5:08 p.m.