knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.path = "README-" )
txshift
Efficient Estimation of the Causal Effects of Stochastic Interventions
Authors: Nima Hejazi and David Benkeser
txshift
?The txshift
R package is designed to provide facilities for the construction
of efficient estimators of the counterfactual mean of an outcome under
stochastic interventions that depend on the natural value of treatment
[@diaz2012population; @haneuse2013estimation]. txshift
implements and builds
upon a simplified algorithm for the targeted maximum likelihood (TML) estimator
of such a causal parameter, originally proposed by @diaz2018stochastic, and
makes use of analogous machinery to compute an efficient one-step estimator
[@pfanzagl1985contributions]. txshift
integrates with the sl3
package [@coyle-sl3-rpkg] to allow for ensemble
machine learning to be leveraged in the estimation procedure.
For many practical applications (e.g., vaccine efficacy trials), observed data
is often subject to a two-phase sampling mechanism (i.e., through the use of a
two-stage design). In such cases, efficient estimators (of both varieties) must
be augmented to construct unbiased estimates of the population-level causal
parameter. @rose2011targeted2sd first introduced an augmentation procedure that
relies on introducing inverse probability of censoring (IPC) weights directly to
an appropriate loss function or to the efficient influence function estimating
equation. txshift
extends this approach to compute IPC-weighted one-step and
TML estimators of the counterfactual mean outcome under a shift stochastic
treatment regime. The package is designed to implement the statistical
methodology described in @hejazi2020efficient and extensions thereof.
For standard use, we recommend installing the package from CRAN via
install.packages("txshift")
Note: If txshift
is installed from
CRAN, the sl3
, an enhancing
dependency that allows ensemble machine learning to be used for nuisance
parameter estimation, won't be included. We highly recommend additionally
installing sl3
from GitHub via
remotes
:
remotes::install_github("tlverse/sl3@master")
For the latest features, install the most recent stable version of txshift
from GitHub via remotes
:
remotes::install_github("nhejazi/txshift@master")
To contribute, install the development version of txshift
from GitHub via
remotes
:
remotes::install_github("nhejazi/txshift@devel")
To illustrate how txshift
may be used to ascertain the effect of a treatment,
consider the following example:
library(txshift) library(sl3) set.seed(429153) # simulate simple data n_obs <- 500 W <- replicate(2, rbinom(n_obs, 1, 0.5)) A <- rnorm(n_obs, mean = 2 * W, sd = 1) Y <- rbinom(n_obs, 1, plogis(A + W + rnorm(n_obs, mean = 0, sd = 1))) # now, let's introduce a a two-stage sampling process C_samp <- rbinom(n_obs, 1, plogis(W + Y)) # fit the full-data TMLE (ignoring two-phase sampling) tmle <- txshift( W = W, A = A, Y = Y, delta = 0.5, estimator = "tmle", g_exp_fit_args = list( fit_type = "sl", sl_learners_density = Lrnr_density_hse$new(Lrnr_hal9001$new()) ), Q_fit_args = list(fit_type = "glm", glm_formula = "Y ~ .") ) tmle # fit a full-data one-step estimator for comparison (again, no sampling) os <- txshift( W = W, A = A, Y = Y, delta = 0.5, estimator = "onestep", g_exp_fit_args = list( fit_type = "sl", sl_learners_density = Lrnr_density_hse$new(Lrnr_hal9001$new()) ), Q_fit_args = list(fit_type = "glm", glm_formula = "Y ~ .") ) os # fit an IPCW-TMLE to account for the two-phase sampling process tmle_ipcw <- txshift( W = W, A = A, Y = Y, delta = 0.5, C_samp = C_samp, V = c("W", "Y"), estimator = "tmle", max_iter = 5, eif_reg_type = "glm", samp_fit_args = list(fit_type = "glm"), g_exp_fit_args = list( fit_type = "sl", sl_learners_density = Lrnr_density_hse$new(Lrnr_hal9001$new()) ), Q_fit_args = list(fit_type = "glm", glm_formula = "Y ~ .") ) tmle_ipcw # compare with an IPCW-agumented one-step estimator under two-phase sampling os_ipcw <- txshift( W = W, A = A, Y = Y, delta = 0.5, C_samp = C_samp, V = c("W", "Y"), estimator = "onestep", eif_reg_type = "glm", samp_fit_args = list(fit_type = "glm"), g_exp_fit_args = list( fit_type = "sl", sl_learners_density = Lrnr_density_hse$new(Lrnr_hal9001$new()) ), Q_fit_args = list(fit_type = "glm", glm_formula = "Y ~ .") ) os_ipcw
If you encounter any bugs or have any specific feature requests, please file an issue. Further details on filing issues are provided in our contribution guidelines.
Contributions are very welcome. Interested contributors should consult our contribution guidelines prior to submitting a pull request.
After using the txshift
R package, please cite the following:
@article{hejazi2020efficient, author = {Hejazi, Nima S and {van der Laan}, Mark J and Janes, Holly E and Gilbert, Peter B and Benkeser, David C}, title = {Efficient nonparametric inference on the effects of stochastic interventions under two-phase sampling, with applications to vaccine efficacy trials}, year = {2021}, doi = {10.1111/biom.13375}, url = {https://doi.org/10.1111/biom.13375}, journal = {Biometrics}, publisher = {Wiley Online Library} } @article{hejazi2020txshift-joss, author = {Hejazi, Nima S and Benkeser, David C}, title = {{txshift}: Efficient estimation of the causal effects of stochastic interventions in {R}}, year = {2020}, doi = {10.21105/joss.02447}, url = {https://doi.org/10.21105/joss.02447}, journal = {Journal of Open Source Software}, publisher = {The Open Journal} } @software{hejazi2022txshift-rpkg, author = {Hejazi, Nima S and Benkeser, David C}, title = {{txshift}: Efficient Estimation of the Causal Effects of Stochastic Interventions}, year = {2022}, doi = {10.5281/zenodo.4070042}, url = {https://CRAN.R-project.org/package=txshift}, note = {R package version 0.3.9} }
R/tmle3shift
- An R package that
is an independent implementation of the same core methodology for TML
estimation as provided here but written based on the
tmle3
engine of the tlverse
ecosystem. Unlike txshift
, this package does
not provide tools for estimation under two-phase sampling designs.
R/medshift
- An experimental R
package for estimating causal mediation effects with stochastic interventions,
including via inverse probability weighted and asymptotically efficient
one-step estimators, as first described in @diaz2020causal.
R/haldensify
- An R package for
estimating the generalized propensity score (conditional density) nuisance
parameter using the highly adaptive
lasso [@coyle-hal9001-rpkg;
@hejazi2020hal9001-joss] via an application of pooled hazard regression
[@diaz2011super].
R/lmtp
- An R package for estimating
the causal effects of longitudinal modified treatment policies, which are a
generalization of the type of effect considered in this package. The LMTP
framework was first introduced in @diaz2021nonparametric and the lmtp
package is described in @williams2023lmtp.
The development of this software was supported in part through grants from the National Library of Medicine (award no. T32 LM012417), the National Institute of Allergy and Infectious Diseases (award no. R01 AI074345), and the National Science Foundation (award no. DMS 2102840).
© 2017-2024 Nima S. Hejazi
The contents of this repository are distributed under the MIT license. See below for details:
MIT License Copyright (c) 2017-2024 Nima S. Hejazi Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.