fit_iTMLE: Fit the sequential double robust (SDR) procedure, either with...


View source: R/tmle_SDR_iTMLE.R

Description

Interventions on up to 3 nodes are allowed: CENS, TRT and MONITOR. Adjustment will be based on the inverse of the propensity score fits for the observed likelihood (g0.C, g0.A, g0.N), multiplied by the indicator of not being censored and the probability of each intervention in intervened_TRT and intervened_MONITOR. Requires column name(s) that specify the counterfactual node values or the counterfactual probabilities of each node being 1 (for stochastic interventions).
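The weighting described above can be illustrated on toy data. The following is a minimal base R sketch, not the package's internal code: every column name (g0.C, g0.A, g0.N, uncens, TRT, TRT.set) and value is made up for illustration, and taking the cumulative product of the time-specific weights within subject is an assumption based on standard longitudinal inverse probability weighting.

```r
# Toy long-format data: one row per subject (ID) and time-point (t).
# All column names and values are illustrative assumptions.
dat <- data.frame(
  ID      = c(1, 1, 1, 2, 2, 2),
  t       = c(0, 1, 2, 0, 1, 2),
  g0.C    = c(0.9, 0.9, 0.8, 0.9, 0.8, 0.8), # fitted prob. of observed censoring status
  g0.A    = c(0.5, 0.6, 0.7, 0.5, 0.4, 0.6), # fitted prob. of observed treatment
  g0.N    = c(1, 1, 1, 1, 1, 1),             # fitted prob. of observed monitoring
  uncens  = c(1, 1, 1, 1, 1, 0),             # indicator of not being censored
  TRT     = c(1, 1, 1, 1, 0, 1),             # observed treatment
  TRT.set = c(1, 1, 1, 1, 1, 1)              # counterfactual P(TRT = 1), here "always treat"
)

# Probability of the observed treatment value under the intervention.
gstar.A <- dat$TRT * dat$TRT.set + (1 - dat$TRT) * (1 - dat$TRT.set)

# Time-specific weight: intervention probability over observed-likelihood
# probability, zeroed out for censored observations (no monitoring intervention here).
dat$wt.t <- dat$uncens * gstar.A / (dat$g0.C * dat$g0.A * dat$g0.N)

# Cumulative weight within subject (rows must be sorted by t within ID).
dat$cum.wt <- ave(dat$wt.t, dat$ID, FUN = cumprod)
```

Subject 2 deviates from the "always treat" regimen at t = 1, so the cumulative weight is zero from that row onward.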

Usage

fit_iTMLE(
  OData,
  tvals,
  Qforms,
  intervened_TRT = NULL,
  intervened_MONITOR = NULL,
  rule_name = paste0(c(intervened_TRT, intervened_MONITOR), collapse = ""),
  models = NULL,
  fit_method = stremrOptions("fit_method"),
  fold_column = stremrOptions("fold_column"),
  stratifyQ_by_rule = FALSE,
  stratify_by_last = TRUE,
  useonly_t_TRT = NULL,
  useonly_t_MONITOR = NULL,
  CVTMLE = FALSE,
  trunc_weights = 10^6,
  weights = NULL,
  parallel = FALSE,
  return_fW = FALSE,
  use_DR_transform = FALSE,
  stabilize = FALSE,
  reg_Q = NULL,
  SDR_model = NULL,
  verbose = getOption("stremr.verbose"),
  ...
)

Arguments

OData

Input data object created by importData function.

tvals

Vector of time-points in the data for which the survival function (and risk) should be estimated.

Qforms

Regression formulas, one formula per Q. Only main-terms are allowed.

intervened_TRT

Column name in the input data with the probabilities (or indicators) of counterfactual treatment nodes being equal to 1 at each time point. Leave the argument unspecified (NULL) when not intervening on treatment node(s).

intervened_MONITOR

Column name in the input data with probabilities (or indicators) of counterfactual monitoring nodes being equal to 1 at each time point. Leave the argument unspecified (NULL) when not intervening on the monitoring node(s).

rule_name

Optional name for the treatment/monitoring regimen.

models

Optional parameters specifying the models for fitting the iterative (sequential) G-Computation formula. Must be an object of class ModelStack specified with gridisl::defModel function.

fit_method

Model selection approach. Can be either "none" (no model selection) or "cv" (V-fold cross-validation that selects the best model according to the lowest cross-validated MSE). When using "cv", the column name that contains the fold IDs must be specified.

fold_column

The column name in the input data (ordered factor) that contains the fold IDs to be used as part of the validation sample. Use the provided function define_CVfolds to define such folds or define the folds using your own method.
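If you define the folds yourself, one approach is to assign folds at the subject level so that all rows of a subject share the same fold. The following is a minimal base R sketch under that assumption; the column name fold_ID, the number of folds, and the toy data are illustrative only and this is not the implementation of define_CVfolds.

```r
set.seed(1)

# Toy long-format data: 6 subjects, 3 time-points each (illustrative only).
dat <- data.frame(ID = rep(1:6, each = 3), t = rep(0:2, times = 6))

V <- 3                               # number of folds (assumed)
ids <- unique(dat$ID)

# Randomly assign each subject (not each row) to one of V folds.
fold_of_id <- sample(rep(seq_len(V), length.out = length(ids)))
names(fold_of_id) <- ids

# Store the fold IDs as an ordered factor column in the long data.
dat$fold_ID <- factor(fold_of_id[as.character(dat$ID)],
                      levels = seq_len(V), ordered = TRUE)
```

The resulting column name would then be the value passed as fold_column.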

stratifyQ_by_rule

Set to TRUE for stratifying the fit of Q (the outcome model) by rule-followers only. There are two ways to do this stratification. The first option is to use stratify_by_last=TRUE (default), which would fit the outcome model only among the observations that were receiving their supposed counterfactual treatment at the current time-point (ignoring the past history of treatments leading up to time-point t). The second option is to set stratify_by_last=FALSE in which case the outcome model will be fit only among the observations who followed their counterfactual treatment regimen throughout the entire treatment history up to current time-point t (rule followers). For the latter option, the observation would be considered a non-follower if the person's treatment did not match their supposed counterfactual treatment at any time-point up to and including current time-point t.

stratify_by_last

Only used when stratifyQ_by_rule is TRUE. Set to TRUE for stratification by last time-point, set to FALSE for stratification by all time-points (rule-followers). See stratifyQ_by_rule for more details.
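The difference between the two stratification options can be seen on a small example. This is a base R sketch with made-up data and column names (TRT, TRT.set); it only illustrates which rows each option would keep and is not the package's internal logic.

```r
# Toy data: subject 2 deviates from the regimen at t = 0 only.
dat <- data.frame(
  ID      = c(1, 1, 1, 2, 2, 2),
  t       = c(0, 1, 2, 0, 1, 2),
  TRT     = c(1, 1, 1, 0, 1, 1),  # observed treatment
  TRT.set = c(1, 1, 1, 1, 1, 1)   # counterfactual treatment ("always treat")
)

# Does the observed treatment match the counterfactual one at time t?
dat$match_t <- as.integer(dat$TRT == dat$TRT.set)

# stratify_by_last = TRUE: only the current time-point matters.
dat$follow_last <- dat$match_t == 1

# stratify_by_last = FALSE: must have matched at every time-point up to and
# including t (rows must be sorted by t within ID).
dat$follow_all <- ave(dat$match_t, dat$ID, FUN = cumprod) == 1
```

Here subject 2 counts as a follower at t = 1 and t = 2 under the first option, but at no time-point under the second.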

useonly_t_TRT

Use for intervening only on some subset of observation and time-specific treatment nodes. Should be a character string with a logical expression that defines the subset of intervention observations. For example, using TRT==0 will intervene only at observations with the value of TRT being equal to zero. The expression can contain any variable name that was defined in the input dataset. Leave as NULL when intervening on all observations/time-points.
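To preview which rows a given expression selects, the character string can be evaluated against the data by hand. The sketch below uses base R and toy data; evaluating the string with eval(parse(...)) is an assumption made for illustration and may differ from how the package processes the expression internally.

```r
# Toy data (illustrative only).
dat <- data.frame(
  ID  = c(1, 1, 1, 2, 2, 2),
  t   = c(0, 1, 2, 0, 1, 2),
  TRT = c(0, 0, 1, 0, 1, 1)
)

# The subset is given as a character string holding a logical expression.
useonly_t_TRT <- "TRT == 0"

# Evaluate the expression using the columns of dat as variables.
subset_idx <- eval(parse(text = useonly_t_TRT), envir = dat)

# Rows at which the intervention on TRT would apply.
dat[subset_idx, ]
```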

useonly_t_MONITOR

Same as useonly_t_TRT, but for monitoring nodes.

CVTMLE

Set to TRUE to run the CV-TMLE algorithm instead of the usual TMLE algorithm. Must set either TMLE=TRUE or iterTMLE=TRUE for this argument to have any effect.

trunc_weights

Specify the numeric weight truncation value. All final weights exceeding the value in trunc_weights will be truncated.
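Truncation of this kind amounts to capping each weight at the chosen bound. A minimal base R sketch with made-up weights and a deliberately small bound:

```r
# Illustrative weights; the last one is extreme.
wts <- c(0.5, 2, 40, 1500)

# Truncation bound (the function's default is 10^6; 100 is used here only
# to make the effect visible).
trunc_weights <- 100

# Every weight above the bound is replaced by the bound itself.
wts_trunc <- pmin(wts, trunc_weights)
```

Only the weight of 1500 is changed; weights at or below the bound are left untouched.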

weights

Optional data.table with additional observation- and time-specific weights. Must contain columns ID, t and weight. The column named weight is merged back into the original data according to (ID, t). Not implemented yet.

parallel

Set to TRUE to run the sequential G-COMP or TMLE in parallel (uses foreach with dopar and requires a previously defined parallel back-end cluster).

return_fW

When TRUE, will return the object fit for the last Q regression as part of the output table. Can be used for obtaining subject-specific predictions of the counterfactual functional E(Y_d|W_i).

use_DR_transform

Apply the DR transform estimator instead of the iTMLE.

stabilize

Only applies when use_DR_transform=TRUE. Set this argument to TRUE to stabilize the weights by the empirical conditional probability of having followed the rule at time-point t, given the subject has followed the rule all the way up to time-point t.

reg_Q

(ADVANCED USE ONLY) Directly specify the Q regressions, separately for each time-point.

SDR_model

The xgboost parameter settings for iTMLE non-parametric regression targeting. If missing/NULL, the default parameter settings will be used.

verbose

Set to TRUE to print auxiliary messages during model fitting.

...

When the models argument is NOT specified, these additional arguments will be passed on directly to all GridSL modeling functions that are called from this routine, e.g., family = "binomial" can be used to specify the model family. Note that all such arguments must be named.

Value

An output list containing the data.table with survival estimates over time saved as "estimates".


osofr/stremr documentation built on Jan. 25, 2022, 8:07 a.m.