fitPropensity: Define and fit propensity score models.
In osofr/stremr: Streamlined Estimation for Static, Dynamic and Stochastic Treatment Regimes in Longitudinal Data


Defines and fits estimators for the propensity scores, separately for censoring, treatment and monitoring events. When there is no right-censoring and/or when not intervening on monitoring, only the propensity score model for treatment will be estimated.

fitPropensity(
  OData,
  gform_CENS,
  gform_TRT,
  gform_MONITOR,
  stratify_CENS = NULL,
  stratify_TRT = NULL,
  stratify_MONITOR = NULL,
  models_CENS = NULL,
  models_TRT = NULL,
  models_MONITOR = NULL,
  fit_method = stremrOptions("fit_method"),
  fold_column = stremrOptions("fold_column"),
  reg_CENS,
  reg_TRT,
  reg_MONITOR,
  use_weights = FALSE,
  verbose = getOption("stremr.verbose"),
  ...
)

`OData`	Input data object created by `importData` function.
`gform_CENS`	Specify the regression formula for the right-censoring mechanism, in the format "CensVar1 + CensVar2 ~ Predictor1 + Predictor2". Leave as missing for data with no right-censoring.
`gform_TRT`	Specify the regression formula for the treatment mechanism, in the format "TRTVar1 + TRTVar2 ~ Predictor1 + Predictor2".
`gform_MONITOR`	Specify the regression formula for the monitoring mechanism, in the format "MonitorVar ~ Predictor1 + Predictor2". Leave as missing for data with no monitoring events or when not intervening on monitoring.
`stratify_CENS`	Define the strata for each censoring variable in `gform_CENS`. Must be a named list of logical expressions, each provided as a character string. When missing (default), all censoring models are fit by pooling all available observations across all time-points. When used, the censoring models in `gform_CENS` are trained separately on each stratum (each defined by a separate logical expression). If `gform_CENS` contains more than one censoring variable, this argument must either provide separate strata for each censoring variable or be left missing. For example, when `gform_CENS`="CensVar1 + CensVar2 ~ Predictor1 + Predictor2", this argument should be a list of length two, with items named "CensVar1" and "CensVar2". The expressions in stratify_CENS[["CensVar1"]] define the training strata for censoring variable `CensVar1`, while the expressions in stratify_CENS[["CensVar2"]] define the training strata for `CensVar2`. See additional examples below.
`stratify_TRT`	Define the strata for the treatment model(s). Must be a list of logical expressions (input the expressions as character strings). When missing (default), the treatment model(s) are fit by pooling all available (uncensored) observations across all time-points. The rules are the same as for `stratify_CENS`.
`stratify_MONITOR`	Define the strata for the monitoring model(s). Must be a list of logical expressions (input the expressions as character strings). When missing (default), the monitoring model is fit by pooling all available (uncensored) observations across all time-points. The rules are the same as for `stratify_CENS`.
`models_CENS`	Optional parameter specifying the models for fitting the censoring mechanism(s) with `gridisl` R package. Must be an object of class `ModelStack` specified with `gridisl::defModel` function.
`models_TRT`	Optional parameter specifying the models for fitting the treatment (exposure) mechanism(s) with `gridisl` R package. Must be an object of class `ModelStack` specified with `gridisl::defModel` function.
`models_MONITOR`	Optional parameter specifying the models for fitting the monitoring mechanism with `gridisl` R package. Must be an object of class `ModelStack` specified with `gridisl::defModel` function.
`fit_method`	Model selection approach. Either `"none"` (no model selection) or `"cv"` (V-fold cross-validation, which selects the model with the lowest cross-validated MSE; requires `fold_column`, the name of the column that contains the fold IDs).
`fold_column`	The column name in the input data (ordered factor) that contains the fold IDs to be used as part of the validation sample. Use the provided function `define_CVfolds` to define such folds or define the folds using your own method.
`reg_CENS`	(ADVANCED FEATURE). Manually define and input the regression specification for each stratum of the censoring model, using the function `define_single_regression`.
`reg_TRT`	(ADVANCED FEATURE). Manually define and input the regression specification for each stratum of the treatment model, using the function `define_single_regression`.
`reg_MONITOR`	(ADVANCED FEATURE). Manually define and input the regression specification for each stratum of the monitoring model, using the function `define_single_regression`.
`use_weights`	(NOT IMPLEMENTED) Set to `TRUE` to pass the previously specified weights column to all learners for all of the propensity score models. This will result in weighted regression models being fit for P(A(t)|L(t)), P(C(t)|L(t)) and P(N(t)|L(t)). (NOTE: This will only work when using sl3 learners, such as the default sl3 learner `sl3::Lrnr_glm_fast`).
`verbose`	Set to `TRUE` to print messages on status and information to the console. Turn this on by default using `options(stremr.verbose=TRUE)`.
`...`	When all or some of the `models_...` arguments are NOT specified, these additional arguments will be passed on directly to all `gridisl` modeling functions that are called from this routine, e.g., `family = "binomial"` can be used to specify the model family. Note that all such arguments must be named.
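
As an illustration of the `stratify_CENS` format described above, the sketch below builds a named list for two censoring variables and counts the rows of a toy data set that each stratum expression selects. The toy data and the helper `rows_in_stratum` are hypothetical. Evaluating the strings with `eval(parse())` is only meant to show what the expressions mean; it is an assumption, not necessarily how the package evaluates them internally.

```r
# Toy long-format data: 3 subjects observed at time points 0..3 (hypothetical).
toy <- data.frame(
  ID = rep(1:3, each = 4),
  t  = rep(0:3, times = 3)
)

# One list item per censoring variable on the left-hand side of gform_CENS.
# Each item is a character vector of logical expressions, one per stratum.
gform_CENS <- "CensVar1 + CensVar2 ~ Predictor1 + Predictor2"
stratify_CENS <- list(
  CensVar1 = c("t < 2", "t >= 2"),
  CensVar2 = c("t == 0", "t > 0")
)

# Hypothetical helper: count the rows selected by one stratum expression.
rows_in_stratum <- function(expr, data) {
  sum(eval(parse(text = expr), envir = data))
}

# Number of training rows per stratum, for each censoring variable.
stratum_sizes <- lapply(stratify_CENS, function(exprs) {
  vapply(exprs, rows_in_stratum, integer(1), data = toy)
})
print(stratum_sizes)
```

The list names must match the censoring variables in the formula, which is why `stratify_CENS` here has exactly the items "CensVar1" and "CensVar2".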

...
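
The `fold_column` argument allows folds defined by your own method. A minimal sketch of one such method follows, under the assumption that folds are assigned per subject, so that all rows of a subject share one fold. The toy data and the column name `fold_ID` are hypothetical, and only base R is used.

```r
# Toy long-format data: 6 subjects, 3 time points each (hypothetical).
toy <- data.frame(
  ID = rep(1:6, each = 3),
  t  = rep(0:2, times = 6)
)

nfolds <- 3
set.seed(1)

# Assign each subject (not each row) to one fold, so that all repeated
# measurements of a subject fall into the same validation sample.
ids <- unique(toy$ID)
fold_of_id <- sample(rep(seq_len(nfolds), length.out = length(ids)))
names(fold_of_id) <- ids

# Store the fold IDs as an ordered factor column (hypothetical name "fold_ID").
toy$fold_ID <- factor(unname(fold_of_id[as.character(toy$ID)]),
                      levels = seq_len(nfolds), ordered = TRUE)

# Cross-tabulate subjects against folds: each subject appears in one fold only.
print(table(toy$ID, toy$fold_ID))
```

The resulting column name would then be passed as `fold_column = "fold_ID"` together with `fit_method = "cv"`.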

options(stremr.verbose = TRUE)
require("data.table")

# ----------------------------------------------------------------------
# Simulated Data
# ----------------------------------------------------------------------
data(OdataNoCENS)
OdataDT <- as.data.table(OdataNoCENS, key=c("ID", "t"))

# define lagged N and lagged TI; the first value is set to 1 (always monitored at the first time point):
OdataDT[, ("N.tminus1") := shift(get("N"), n = 1L, type = "lag", fill = 1L), by = ID]
OdataDT[, ("TI.tminus1") := shift(get("TI"), n = 1L, type = "lag", fill = 1L), by = ID]

# ----------------------------------------------------------------------
# Define interventions (always treated and never treated):
# ----------------------------------------------------------------------
OdataDT[, ("TI.set1") := 1L]
OdataDT[, ("TI.set0") := 0L]

# ----------------------------------------------------------------------
# Import Data
# ----------------------------------------------------------------------
OData <- importData(OdataDT, ID = "ID", t = "t", covars = c("highA1c", "lastNat1", "N.tminus1"),
                    CENS = "C", TRT = "TI", MONITOR = "N", OUTCOME = "Y.tplus1")

# ----------------------------------------------------------------------
# Look at the input data object
# ----------------------------------------------------------------------
print(OData)

# ----------------------------------------------------------------------
# Access the input data
# ----------------------------------------------------------------------
get_data(OData)

# ----------------------------------------------------------------------
# Model the Propensity Scores
# ----------------------------------------------------------------------
gform_CENS <- "C ~ highA1c + lastNat1"
gform_TRT <- "TI ~ CVD + highA1c + N.tminus1"
gform_MONITOR <- "N ~ 1"
stratify_CENS <- list(C=c("t < 16", "t == 16"))

# ----------------------------------------------------------------------
# Fit Propensity Scores
# ----------------------------------------------------------------------
OData <- fitPropensity(OData, gform_CENS = gform_CENS,
                        gform_TRT = gform_TRT,
                        gform_MONITOR = gform_MONITOR,
                        stratify_CENS = stratify_CENS)

# ----------------------------------------------------------------------
# IPW Adjusted KM or Saturated MSM
# ----------------------------------------------------------------------
require("magrittr")
AKME.St.1 <- getIPWeights(OData, intervened_TRT = "TI.set1") %>%
             survNPMSM(OData) %$%
             estimates
AKME.St.1

# ----------------------------------------------------------------------
# Bounded IPW
# ----------------------------------------------------------------------
IPW.St.1 <- getIPWeights(OData, intervened_TRT = "TI.set1") %>%
            directIPW(OData)
IPW.St.1[]

# ----------------------------------------------------------------------
# IPW-MSM for hazard
# ----------------------------------------------------------------------
wts.DT.1 <- getIPWeights(OData = OData, intervened_TRT = "TI.set1", rule_name = "TI1")
wts.DT.0 <- getIPWeights(OData = OData, intervened_TRT = "TI.set0", rule_name = "TI0")
survMSM_res <- survMSM(list(wts.DT.1, wts.DT.0), OData, tbreaks = c(1:8, 12, 16) - 1)
survMSM_res$St

# ----------------------------------------------------------------------
# Sequential G-COMP
# ----------------------------------------------------------------------
t.surv <- c(0:10)
Qforms <- rep.int("Qkplus1 ~ CVD + highA1c + N + lastNat1 + TI + TI.tminus1", (max(t.surv)+1))
params <- gridisl::defModel(estimator = "speedglm__glm")

## Not run: 
gcomp_est <- fit_GCOMP(OData, tvals = t.surv, intervened_TRT = "TI.set1",
                          Qforms = Qforms, models = params, stratifyQ_by_rule = FALSE)
gcomp_est[]

## End(Not run)
# ----------------------------------------------------------------------
# TMLE
# ----------------------------------------------------------------------
## Not run: 
tmle_est <- fit_TMLE(OData, tvals = t.surv, intervened_TRT = "TI.set1",
                    Qforms = Qforms, models = params, stratifyQ_by_rule = TRUE)
tmle_est[]

## End(Not run)

# ----------------------------------------------------------------------
# Running IPW-Adjusted KM with optional user-specified weights:
# ----------------------------------------------------------------------
addedWts_DT <- OdataDT[, c("ID", "t"), with = FALSE]
addedWts_DT[, new.wts := sample.int(10, nrow(OdataDT), replace = TRUE)/10]
survNP_res_addedWts <- survNPMSM(wts.DT.1, OData, weights = addedWts_DT)

# ----------------------------------------------------------------------
# Multivariate Propensity Score Regressions
# ----------------------------------------------------------------------
gform_CENS <- "C + TI + N ~ highA1c + lastNat1"
OData <- fitPropensity(OData, gform_CENS = gform_CENS, gform_TRT = gform_TRT,
                        gform_MONITOR = gform_MONITOR)
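
# ----------------------------------------------------------------------
# Sketch (an illustration, not part of the original examples): list the
# outcome and predictor variables named in a multivariate formula string.
# Splitting with as.formula()/all.vars() is an assumption made for
# illustration; stremr may parse the string differently.
# ----------------------------------------------------------------------
gform_multi <- "C + TI + N ~ highA1c + lastNat1"
gform_parsed <- stats::as.formula(gform_multi)
outcome_vars <- all.vars(gform_parsed[[2]])    # left-hand side of the formula
predictor_vars <- all.vars(gform_parsed[[3]])  # right-hand side of the formula
print(outcome_vars)
print(predictor_vars)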

# ----------------------------------------------------------------------
# Fitting treatment model with Gradient Boosting machines:
# ----------------------------------------------------------------------
## Not run: 
require("h2o")
h2o::h2o.init(nthreads = -1)
gform_CENS <- "C ~ highA1c + lastNat1"
models_TRT <- sl3::Lrnr_h2o_grid$new(algorithm = "gbm")
OData <- fitPropensity(OData, gform_CENS = gform_CENS,
                        gform_TRT = gform_TRT,
                        models_TRT = models_TRT,
                        gform_MONITOR = gform_MONITOR,
                        stratify_CENS = stratify_CENS)

# Use `H2O-3` distributed implementation of GLM for treatment model estimator:
models_TRT <- sl3::Lrnr_h2o_glm$new(family = "binomial")
OData <- fitPropensity(OData, gform_CENS = gform_CENS,
                        gform_TRT = gform_TRT,
                        models_TRT = models_TRT,
                        gform_MONITOR = gform_MONITOR,
                        stratify_CENS = stratify_CENS)

# Use Deep Neural Nets:
models_TRT <- sl3::Lrnr_h2o_grid$new(algorithm = "deeplearning")
OData <- fitPropensity(OData, gform_CENS = gform_CENS,
                        gform_TRT = gform_TRT,
                        models_TRT = models_TRT,
                        gform_MONITOR = gform_MONITOR,
                        stratify_CENS = stratify_CENS)

## End(Not run)

# ----------------------------------------------------------------------
# Fitting different models with different algorithms
# Fine tuning modeling with optional tuning parameters.
# ----------------------------------------------------------------------
## Not run: 
params_TRT <- sl3::Lrnr_h2o_grid$new(algorithm = "gbm",
                              ntrees = 50,
                              learn_rate = 0.05,
                              sample_rate = 0.8,
                              col_sample_rate = 0.8,
                              balance_classes = TRUE)
params_CENS <- sl3::Lrnr_glm_fast$new()
params_MONITOR <- sl3::Lrnr_glm_fast$new()
# The learners are passed through the models_CENS / models_TRT / models_MONITOR arguments:
OData <- fitPropensity(OData,
            gform_CENS = gform_CENS, stratify_CENS = stratify_CENS, models_CENS = params_CENS,
            gform_TRT = gform_TRT, models_TRT = params_TRT,
            gform_MONITOR = gform_MONITOR, models_MONITOR = params_MONITOR)

## End(Not run)

# ----------------------------------------------------------------------
# Running TMLE based on the previous fit of the propensity scores.
# Also applying Random Forest to estimate the sequential outcome model
# ----------------------------------------------------------------------
## Not run: 
t.surv <- c(0:5)
Qforms <- rep.int("Qkplus1 ~ CVD + highA1c + N + lastNat1 + TI + TI.tminus1", (max(t.surv)+1))
models <- sl3::Lrnr_h2o_grid$new(algorithm = "randomForest",
                           ntrees = 100, learn_rate = 0.05, sample_rate = 0.8,
                           col_sample_rate = 0.8, balance_classes = TRUE)
tmle_est <- fit_TMLE(OData, tvals = t.surv, intervened_TRT = "TI.set1",
            Qforms = Qforms, models = models,
            stratifyQ_by_rule = TRUE)

## End(Not run)

## Not run: 
t.surv <- c(0:5)
Qforms <- rep.int("Qkplus1 ~ CVD + highA1c + N + lastNat1 + TI + TI.tminus1", (max(t.surv)+1))
models <- sl3::Lrnr_h2o_grid$new(algorithm = "randomForest",
                           ntrees = 100, learn_rate = 0.05, sample_rate = 0.8,
                           col_sample_rate = 0.8, balance_classes = TRUE)
tmle_est <- fit_TMLE(OData, tvals = t.surv, intervened_TRT = "TI.set1",
            Qforms = Qforms, models = models,
            stratifyQ_by_rule = FALSE)

## End(Not run)

osofr/stremr documentation built on Jan. 25, 2022, 8:07 a.m.
