
By default, the \verb|gfoRmula| package uses a pooled logistic regression model for survival outcomes, a logistic regression model for binary end-of-follow-up outcomes, and a linear regression model for continuous end-of-follow-up outcomes. Starting from version 1.1.0, the package allows users to supply their own outcome models. This document describes how to specify such custom outcome models and assumes that readers have read the long-form package documentation of McGrath et al. (2020).

Specifying custom outcome models

To specify custom outcome models, users must provide functions that fit the outcome model and obtain estimates from the fitted model through the parameters \verb|ymodel_fit_custom| and \verb|ymodel_predict_custom|, respectively, in the \verb|gformula| function.

The function for fitting the outcome model must take the parameters \verb|ymodel| and \verb|obs_data|. Below, we illustrate a function for fitting an outcome model using a random forest. This code uses the \verb|randomForest| package.

ymodel_fit_custom <- function(ymodel, obs_data){
  return(randomForest::randomForest(formula = ymodel, data = obs_data))
}
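
Because the fitting function receives only \verb|ymodel| and \verb|obs_data|, any tuning parameters have to be fixed inside it. One way to avoid hard-coding them is to build the fitting function with a closure. The sketch below is only an illustration of that pattern: the helper \verb|make_ymodel_fit| is a hypothetical name that is not part of \verb|gfoRmula|, and the sketch assumes the underlying fitting routine accepts \verb|formula| and \verb|data| arguments (as \verb|randomForest| and \verb|glm| do).

```r
# Hypothetical helper (not part of gfoRmula): returns a function with the
# (ymodel, obs_data) signature, with extra arguments fixed in advance.
make_ymodel_fit <- function(fit_fun, ...){
  force(fit_fun)
  function(ymodel, obs_data){
    # The extra arguments captured in `...` are forwarded to the fitting routine
    fit_fun(formula = ymodel, data = obs_data, ...)
  }
}

# Possible use with a random forest (requires the randomForest package):
# ymodel_fit_custom <- make_ymodel_fit(randomForest::randomForest, ntree = 200)

# Self-contained illustration with a base R routine and simulated toy data
set.seed(1)
toy <- data.frame(Y = rnorm(50), A = rbinom(50, 1, 0.5), L1 = rnorm(50))
fit_glm <- make_ymodel_fit(stats::glm, family = stats::gaussian())
toy_fit <- fit_glm(ymodel = Y ~ A + L1, obs_data = toy)
```

The returned function can then be passed as \verb|ymodel_fit_custom| in the same way as a function written by hand.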

The function for obtaining estimates from the model must take the parameters \verb|fit| (the fitted outcome model) and \verb|newdf| (a \verb|data.table| containing the simulated dataset at time $t$). For the records in \verb|newdf|, this function must return the estimated probability of the outcome for survival and binary end-of-follow-up outcomes, or the estimated mean of the outcome for continuous end-of-follow-up outcomes. Continuing with the random forest example, the code below obtains the estimated outcome mean for a continuous end-of-follow-up outcome. This code relies on the \verb|predict| method for \verb|randomForest| objects (\verb|predict.randomForest|) in the \verb|randomForest| package.

ymodel_predict_custom <- function(fit, newdf){
  return(as.numeric(predict(object = fit, newdata = newdf)))
}
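
For survival and binary end-of-follow-up outcomes, the prediction function must return probabilities rather than means. As a minimal sketch of what such a pair of functions might look like, the code below uses a probit regression fitted with base R's \verb|glm|. The function names \verb|ymodel_fit_probit| and \verb|ymodel_predict_probit| and the simulated toy data are assumptions made for illustration and do not come from the package.

```r
# Fit a probit regression for a binary outcome (illustrative alternative to
# the default logistic model)
ymodel_fit_probit <- function(ymodel, obs_data){
  return(stats::glm(formula = ymodel,
                    family = stats::binomial(link = "probit"),
                    data = obs_data))
}

# Return the estimated outcome probability for each row of newdf;
# type = "response" gives predictions on the probability scale
ymodel_predict_probit <- function(fit, newdf){
  return(as.numeric(stats::predict(object = fit, newdata = newdf,
                                   type = "response")))
}

# Quick check on simulated toy data before passing the functions to gformula
set.seed(1)
toy <- data.frame(A = rbinom(200, 1, 0.5), L1 = rnorm(200))
toy$Y <- rbinom(200, 1, pnorm(-0.5 + 0.8 * toy$A + 0.3 * toy$L1))
fit <- ymodel_fit_probit(ymodel = Y ~ A + L1, obs_data = toy)
p_hat <- ymodel_predict_probit(fit = fit, newdf = toy)
```

Checking that the returned vector is numeric, has one entry per row of \verb|newdf|, and lies between 0 and 1 is a cheap way to catch a prediction function that returns class labels or values on the linear-predictor scale.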

Example

We perform an analysis similar to that of Example 3 in McGrath et al. (2020), except that we use the custom outcome model from the previous section.

library('gfoRmula')
library('data.table')
library('Hmisc')  # provides rcspline.eval, used in the covariate model for L1

# Variable names and types
id <- 'id'
time_name <- 't0'
covnames <- c('L1', 'L2', 'A')
outcome_name <- 'Y'
outcome_type <- 'continuous_eof'
covtypes <- c('categorical', 'normal', 'binary')

# Covariate histories: lagged values of A, L1, and L2
histories <- c(lagged)
histvars <- list(c('A', 'L1', 'L2'))

# Covariate models, listed in the same order as covnames
covparams <- list(covmodels = c(L1 ~ lag1_A + lag1_L1 + L3 + t0 +
                                  rcspline.eval(lag1_L2, knots = c(-1, 0, 1)),
                                L2 ~ lag1_A + L1 + lag1_L1 + lag1_L2 + L3 + t0,
                                A ~ lag1_A + L1 + L2 + lag1_L1 + lag1_L2 + L3 + t0))

# Outcome model formula, passed to the custom fitting function
ymodel <- Y ~ A + L1 + L2 + lag1_A + lag1_L1 + lag1_L2 + L3

# Static interventions on A: one value for each of the 7 time points
intervention1.A <- list(static, rep(0, 7))
intervention2.A <- list(static, rep(1, 7))
int_descript <- c('Never treat', 'Always treat')
nsimul <- 10000

gform_cont_eof <- gformula(obs_data = continuous_eofdata,
                           id = id, time_name = time_name,
                           covnames = covnames, outcome_name = outcome_name,
                           outcome_type = outcome_type, covtypes = covtypes,
                           covparams = covparams, ymodel = ymodel,
                           ymodel_fit_custom = ymodel_fit_custom, 
                           ymodel_predict_custom = ymodel_predict_custom,
                           intervention1.A = intervention1.A,
                           intervention2.A = intervention2.A,
                           int_descript = int_descript,
                           histories = histories, histvars = histvars,
                           basecovs = c("L3"), nsimul = nsimul, seed = 1234)
gform_cont_eof

References

McGrath S, Lin V, Zhang Z, Petito LC, Logan RW, Hernán MA, Young JG. gfoRmula: an R package for estimating the effects of sustained treatment strategies via the parametric g-formula. Patterns. 2020 Jun 12;1(3).


