estimateHazards: Estimation for the Method of Cause-Specific Hazards

View source: R/hazards_estimate.R

estimateHazards R Documentation

Estimation for the Method of Cause-Specific Hazards

Description

This function estimates the cause-specific hazard functions over all times using either glm or SuperLearner. Its structure is specific to how it is called within hazard_tmle. In particular, dataList must have a very specific structure for the function to run properly.

dataList should be a list of data.frame objects. In the first, each observation contributes a number of rows equal to its ftime. In each subsequent entry, every observation contributes t0 rows, and the trt column is set to each value of trtOfInterest in turn.

The function uses the first entry in dataList to iteratively fit hazard regression models for each cause of failure, so this data.frame must have a column called Nj for each value of j in J. The first fit estimates the hazard of min(J), while subsequent fits estimate the pseudo-hazard of each remaining value of j. Here, pseudo-hazard means the probability of a failure due to type j at a particular timepoint given no failure of any type at any previous timepoint and no failure due to any type k < j at that same timepoint. The hazard estimates for causes j' < j can then be used to map this pseudo-hazard back into the hazard at a particular time. This is nothing more than re-framing a conditional multinomial probability as a series of conditional binomial probabilities. The structure ensures that, within every stratum, the estimated hazards summed over all possible causes of failure at a particular timepoint never exceed one.
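The mapping from pseudo-hazards back to hazards can be illustrated with a small base R sketch. It is not the package's internal code; the function name pseudo_to_hazard and the numeric values are hypothetical and chosen only to show the arithmetic: the hazard for cause j is the pseudo-hazard for j multiplied by one minus the total hazard of all causes k < j.

```r
# Hypothetical pseudo-hazards at a single timepoint for three causes,
# ordered by increasing cause label (illustrative values only).
pseudo_haz <- c(0.10, 0.20, 0.50)

# Illustrative helper (not part of survtmle): convert pseudo-hazards to hazards.
# hazard_j = pseudo_hazard_j * (1 - sum of hazards over causes k < j)
pseudo_to_hazard <- function(pseudo) {
  haz <- numeric(length(pseudo))
  cum_haz <- 0  # total hazard over the causes already processed
  for (j in seq_along(pseudo)) {
    haz[j] <- pseudo[j] * (1 - cum_haz)
    cum_haz <- cum_haz + haz[j]
  }
  haz
}

haz <- pseudo_to_hazard(pseudo_haz)
```

Because each pseudo-hazard lies in [0, 1] and is scaled by the probability mass that remains, the resulting hazards cannot sum to more than one.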

Usage

estimateHazards(
  dataList,
  J,
  adjustVars,
  SL.ftime = NULL,
  glm.ftime = NULL,
  glm.family,
  cvControl,
  returnModels,
  bounds,
  verbose,
  trtOfInterest,
  stratify,
  ...
)

Arguments

dataList

A list of data.frame objects.
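As a rough sketch of the layout described in the Description, the following base R code builds a toy list by hand. It is an assumption-laden illustration, not how hazard_tmle constructs dataList: the column names id, t, and W, the toy values, and the helper make_counterfactual are hypothetical, and the real objects may carry additional columns.

```r
# Toy wide-format data: one row per observation (illustrative values only).
wide <- data.frame(
  id    = 1:3,
  ftime = c(2, 3, 1),
  ftype = c(1, 0, 2),   # 0 is used here to denote censoring (assumption)
  trt   = c(1, 0, 1),
  W     = c(0.5, -1.2, 0.3)
)
J  <- c(1, 2)
t0 <- 3

# First entry: each observation contributes ftime rows, one per timepoint.
long <- wide[rep(seq_len(nrow(wide)), times = wide$ftime), ]
long$t <- sequence(wide$ftime)
for (j in J) {
  # Nj is 1 only on the row where a failure of type j occurs.
  long[[paste0("N", j)]] <- as.numeric(long$t == long$ftime & long$ftype == j)
}
rownames(long) <- NULL

# Subsequent entries: t0 rows per observation, with trt set to a fixed value.
make_counterfactual <- function(a) {
  cf <- wide[rep(seq_len(nrow(wide)), each = t0), ]
  cf$t <- rep(seq_len(t0), times = nrow(wide))
  cf$trt <- a
  rownames(cf) <- NULL
  cf
}
dataList_sketch <- c(list(long), lapply(c(0, 1), make_counterfactual))
```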

J

Numeric vector indicating the labels of all causes of failure.

adjustVars

Object of class data.frame containing the variables to adjust for in the regression.

SL.ftime

A character vector or list specification to be passed to the SL.library argument of SuperLearner for the outcome regression (either cause-specific hazards or conditional mean). See the documentation of SuperLearner for more information on how to specify valid SuperLearner libraries. It is expected that the wrappers used in the library will play nicely with the input variables, which will be called "trt" and names(adjustVars).

glm.ftime

A character specification of the right-hand side of the equation passed to the formula option of a call to glm for the outcome regression (either using cause-specific hazards or conditional mean). Ignored if SL.ftime is not NULL. Use "trt" to specify the treatment in this formula (see examples). The formula can additionally include any variables found in names(adjustVars).
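To make the role of the right-hand-side string concrete, here is a minimal sketch of turning such a string into a formula and fitting a logistic hazard regression for one cause. The simulated data, the time column name t, and the covariate W1 are assumptions for illustration; the package's own fitting code may differ.

```r
set.seed(1)

# Simulated long-format data (hypothetical columns and values).
long <- data.frame(
  N1  = rbinom(200, 1, 0.2),              # failure indicator for cause 1
  trt = rbinom(200, 1, 0.5),
  t   = sample(1:5, 200, replace = TRUE),
  W1  = rnorm(200)
)

# Right-hand side given as a character string, as glm.ftime expects.
glm.ftime <- "trt + t + W1"

# Prepend the outcome column for the cause being modeled, then fit.
form <- as.formula(paste("N1 ~", glm.ftime))
fit  <- glm(form, data = long, family = "binomial")

# Fitted hazards on the probability scale.
haz_hat <- predict(fit, type = "response")
```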

glm.family

The type of regression to be performed if fitting GLMs in the estimation and fluctuation procedures. The default is "binomial" for logistic regression. Only change this from the default if there are justifications that are well understood. This is inherited from the calling function (either mean_tmle or hazard_tmle).

cvControl

A list providing control options to be fed directly into calls to SuperLearner. This should match the contents of SuperLearner.CV.control exactly. For details, consult the documentation of the SuperLearner package. This is passed in from mean_tmle or hazard_tmle via survtmle.
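For orientation, a list with the same element names as the arguments of SuperLearner.CV.control might look like the sketch below. The choice of 5 folds is arbitrary and purely illustrative.

```r
# Element names mirror the arguments of SuperLearner.CV.control;
# the values here are illustrative, not recommendations.
cvControl <- list(
  V          = 5L,     # number of cross-validation folds
  stratifyCV = FALSE,  # whether to stratify folds by the outcome
  shuffle    = TRUE,   # whether to shuffle rows before splitting
  validRows  = NULL    # optional user-supplied validation rows
)
```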

returnModels

A logical indicating whether to return the glm or SuperLearner objects used to estimate the nuisance parameters. Must be set to TRUE to make downstream calls to timepoints for obtaining estimates at times other than t0. See documentation of timepoints for more information.

bounds

A list of bounds... TODO: Add more description here.

verbose

A logical indicating whether the function should print messages to indicate progress.

trtOfInterest

An input specifying which levels of trt are of interest. The default value computes estimates for all values in unique(trt). Can alternatively be set to a vector of values found in trt. Ignored unless stratify == TRUE.

stratify

If TRUE, then the hazard model is estimated using only the observations with trt == trtOfInterest. Only works if length(trtOfInterest) == 1. If stratify = TRUE, then glm.ftime cannot include trt in the model formula, and any learners in SL.ftime should not assume that a variable named trt will be included in the candidate super learner estimators.
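The idea behind stratification can be sketched as follows: subset to the treatment arm of interest, then fit a hazard model whose formula omits trt, since it is constant within the subset. The simulated data and column names are hypothetical, and this is only an illustration of the concept rather than the package's implementation.

```r
set.seed(2)

# Simulated long-format data (hypothetical columns and values).
long <- data.frame(
  N1  = rbinom(300, 1, 0.2),
  trt = rbinom(300, 1, 0.5),
  t   = sample(1:5, 300, replace = TRUE),
  W1  = rnorm(300)
)

trtOfInterest <- 1  # a single treatment level, as stratification requires

# Keep only rows in the arm of interest.
strat <- long[long$trt == trtOfInterest, ]

# trt is constant in this subset, so it is left out of the formula.
fit_strat <- glm(N1 ~ t + W1, data = strat, family = "binomial")
```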

...

Other arguments. Not currently used.

Value

The function returns a list that is exactly the same as the input dataList, but with additional columns corresponding to the hazard, the pseudo-hazard, and the total hazard summed over all causes k < j.


benkeser/survtmle documentation built on Nov. 23, 2023, 4:45 a.m.