Gsurv R Documentation

Estimate (conditional) survival probabilities with G-computation transformation

Description

Estimate P(T > t | T > truncation time, covariates available at truncation time) for given t, where T is the time to event, using the G-computation transformation. A user-specified flexible method fits survival curves of the time to event/censoring at each stage, and SuperLearner::SuperLearner then regresses the resulting pseudo-outcome on covariates to estimate P(T > t | T > truncation time, covariates available at truncation time).

Usage

Gsurv(
  covariates,
  follow.up.time,
  visit.times,
  tvals = NULL,
  truncation.index = 1,
  id.var,
  time.var,
  event.var,
  event.formula = NULL,
  Q.formula = ~.,
  event.method = c("survSuperLearner", "rfsrc", "ctree", "rpart", "cforest", "coxph",
    "coxtime", "deepsurv", "dnnsurv", "akritas", "survival_forest"),
  event.control = if (event.method != "survSuperLearner") {
    fit_surv_option()
  } else {
    fit_surv_option(option = list(
      event.SL.library = c("survSL.coxph", "survSL.weibreg", "survSL.gam", "survSL.rfsrc"),
      cens.SL.library = c("survSL.coxph", "survSL.weibreg", "survSL.gam", "survSL.rfsrc")
    ))
  },
  Q.SuperLearner.control = list(family = gaussian(), SL.library = "SL.lm")
)

Arguments

covariates

a list of data frames of covariates in the order of visit times. Each data frame contains the covariates collected at one visit time. Data frames may have different numbers of variables (different variables may be collected at different visit times) and different numbers of individuals (some individuals may have an event or be censored before a later visit time). All data frames must have a common character variable (see id.var) that identifies each individual, but no other variables with common names. Missing data are not allowed.

follow.up.time

data frame of follow up times, i.e., times to event/censoring. Contains the variable that identifies each individual, the follow up times and an indicator of event/(right-)censoring. Follow up times must be numeric. Indicator of event/censoring should be binary with 0=censored, 1=event.

visit.times

numeric/integer vector of visit times in ascending order. The first visit time is typically the baseline.

tvals

times t for which P(T > t) given covariates is computed (T is the time to event). Default is all unique event times in follow.up.time. Values are sorted in ascending order.

truncation.index

index of the visit time to which left-truncation is applied. The truncation time is visit.times[truncation.index]. Covariates available up to (inclusive) visit.times[truncation.index] are of interest. Default is 1, corresponding to no truncation.

id.var

(character) name of the variable that identifies each individual.

time.var

(character) name of the variable containing follow up times in the data frame follow.up.time.

event.var

(character) name of the variable containing indicator of event/censoring in the data frame follow.up.time.

event.formula

a list of formulas to specify covariates being used when estimating the conditional survival probabilities of time to event at each visit time. The length should be the number of visit times after truncation.index (inclusive). Default is ~ . for all visit times, which includes main effects of all covariates available at each visit time.

Q.formula

formula specifying the covariates used to estimate P(T > t | T > visit.times[truncation.index], covariates available at visit.times[truncation.index]). Set to include intercept only (~ 0 or ~ -1) for marginal survival probability. Default is ~ ., which includes main effects of all covariates available up to (inclusive) visit.times[truncation.index].

event.method

one of "survSuperLearner", "rfsrc", "ctree", "rpart", "cforest", "coxph", "coxtime", "deepsurv", "dnnsurv", "akritas", "survival_forest". The machine learning method used to fit survival curves of the time to event in each time window. See the underlying wrappers fit_survSuperLearner, fit_rfsrc, fit_ctree, fit_rpart, fit_cforest, fit_coxph, fit_coxtime, fit_deepsurv, fit_dnnsurv, fit_akritas, fit_survival_forest for more details and the available options. Default is "survSuperLearner", which may perform well with a decent number of events and censored observations but may fail if there are too few events or too little censoring in one time window.

event.control

a value returned by fit_surv_option. For event.method="survSuperLearner", the default sets the library for both event and censoring to c("survSL.coxph","survSL.weibreg","survSL.gam","survSL.rfsrc").

Q.SuperLearner.control

a list containing optional arguments passed to SuperLearner::SuperLearner. We encourage using a named list. It is passed to SuperLearner::SuperLearner by running a command like do.call(SuperLearner, Q.SuperLearner.control). Default is list(family = gaussian(), SL.library = "SL.lm"), which uses linear regression. The user should not specify Y and X, and must specify SL.library if the default is not used. family is gaussian by default if unspecified, and must be gaussian if specified, with a possibly non-identity link. When Q.formula only includes an intercept, SuperLearner::SuperLearner will not be called and the default setting can be used.
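The rules stated for Q.SuperLearner.control can be sketched as a small check in base R. build_Q_args is a hypothetical helper written for this illustration and is not a function of the package; the toy X and Y values are made up. The sketch only assembles the argument list that a do.call-style invocation would receive and does not call SuperLearner itself.

```r
# Hypothetical helper (not part of the package): validates a control list
# against the documented rules and returns the full argument list.
build_Q_args <- function(Y, X,
                         control = list(family = gaussian(), SL.library = "SL.lm")) {
  stopifnot(is.list(control))
  if (any(c("Y", "X") %in% names(control))) {
    stop("Y and X are supplied internally; leave them out of the control list")
  }
  if (is.null(control$SL.library)) {
    stop("SL.library must be specified when the default is not used")
  }
  if (is.null(control$family)) {
    control$family <- gaussian()                 # gaussian when unspecified
  }
  fam <- control$family
  if (is.character(fam)) fam <- get(fam, mode = "function")()  # e.g. "gaussian"
  if (is.function(fam)) fam <- fam()                           # e.g. gaussian
  if (!identical(fam$family, "gaussian")) {
    stop("family must be gaussian (a non-identity link is allowed)")
  }
  control$family <- fam
  c(list(Y = Y, X = X), control)
}

# Made-up covariates and pseudo-outcomes, for illustration only
X <- data.frame(age = c(50, 61, 47, 58), bmi = c(24.1, 30.2, 27.5, 22.8))
Y <- c(0.91, 0.62, 0.85, 0.74)

q_args <- build_Q_args(
  Y, X,
  control = list(family = gaussian(link = "log"),
                 SL.library = c("SL.lm", "SL.glm"))
)
print(names(q_args))
# "Y" "X" "family" "SL.library"

# With the SuperLearner package attached, the fit would then be roughly:
# fit <- do.call(SuperLearner::SuperLearner, q_args)
```

Keeping Y and X out of the user-facing list means the same control list can be reused for every t in tvals, with only the pseudo-outcome changing between calls.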

Value

a list of fitted SuperLearner models, one for each t in tvals.

Formula arguments

All formulas should have covariates on the right-hand side and no terms on the left-hand side, e.g., ~ V1 + V2 + V3. At each visit time, the corresponding formula may (and usually should) contain covariates from previous visit times, and must include only covariates available up to (inclusive) that visit time. Interactions, polynomials and splines may be treated differently by the different machine learning methods used to estimate conditional survival curves.
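As an illustration of the input layout and the formula rules, the sketch below builds a tiny made-up data set in base R and checks that each formula is one-sided and uses only covariates available up to its visit time. All variable names (id, age, sex, bp6, time, event) and values are hypothetical, and the final call is left as a comment because three individuals are far too few to fit anything.

```r
# Made-up data: three individuals, two visits (illustrative only)
visit.times <- c(0, 6)                  # ascending; first visit is baseline
covariates <- list(
  data.frame(id = c("a", "b", "c"), age = c(50, 61, 47), sex = c(1, 0, 1),
             stringsAsFactors = FALSE),
  # individual "b" had an event before the second visit, so is absent here
  data.frame(id = c("a", "c"), bp6 = c(128, 135), stringsAsFactors = FALSE)
)
follow.up.time <- data.frame(
  id = c("a", "b", "c"),
  time = c(14.2, 3.5, 9.0),
  event = c(1, 1, 0),                   # 0 = censored, 1 = event
  stringsAsFactors = FALSE
)
id.var <- "id"
truncation.index <- 1                   # truncation time is visit.times[1]

# Basic layout checks taken from the argument descriptions
covariate_names <- unlist(lapply(covariates, function(d) setdiff(names(d), id.var)))
stopifnot(
  !is.unsorted(visit.times, strictly = TRUE),
  all(vapply(covariates, function(d) id.var %in% names(d), logical(1))),
  anyDuplicated(covariate_names) == 0,  # no shared names other than id.var
  all(vapply(covariates, function(d) !any(is.na(d)), logical(1)))
)

# Covariates available up to (inclusive) each visit
available <- lapply(seq_along(covariates), function(k) {
  setdiff(unlist(lapply(covariates[seq_len(k)], names)), id.var)
})

# One formula per visit time from truncation.index onward
stages <- seq(truncation.index, length(visit.times))
event.formula <- list(~ age + sex, ~ age + sex + bp6)
Q.formula <- ~ age + sex

one_sided <- vapply(event.formula, function(f) length(f) == 2, logical(1))
uses_available <- mapply(function(f, k) all(all.vars(f) %in% available[[k]]),
                         event.formula, stages)
stopifnot(length(event.formula) == length(stages), all(one_sided), all(uses_available))
stopifnot(all(all.vars(Q.formula) %in% available[[truncation.index]]))

# With real data of adequate size, the call would look roughly like:
# fit <- Gsurv(covariates, follow.up.time, visit.times,
#              truncation.index = truncation.index, id.var = "id",
#              time.var = "time", event.var = "event",
#              event.formula = event.formula, Q.formula = Q.formula,
#              event.method = "coxph")
```

The second formula reuses age and sex from the baseline visit alongside bp6, which matches the recommendation that later-stage formulas usually carry forward earlier covariates.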


QIU-Hongxiang-David/SDRsurv documentation built on March 29, 2024, 8:41 a.m.