knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.path = "man/figures/README-", cache = FALSE, out.width = "70%" )
targeted) 
Various methods for targeted learning and semiparametric inference including
augmented inverse probability weighted (AIPW) estimators for missing data and
causal inference (Bang and Robins (2005)
You can install the released version of targeted from CRAN with:
install.packages("targeted")
And the development version from GitHub with:
remotes::install_github("kkholst/targeted", ref="dev")
Computations such as cross-validation are parallelized via the {future}
package. To enable parallel computations and progress-bars the following code
can be executed
future::plan("multisession") progressr::handlers(global=TRUE)
To illustrate some of the functionality of the targeted package we simulate
some data from the following model
$$Y = \exp{-(W_1 - 1)^2 - (W_2 - 1)^2)} - 2\exp{-(W_1+1)^2 - (W_2+1)^2}A + \epsilon$$
with independent measurement error $\epsilon\sim\mathcal{N}(0,1)$, treatment variable $A \sim Bernoulli(\text{expit}{-1+W_1})$ and
independent covariates $W_1, W_2\sim\mathcal{N}(0,1/2)$.
library("targeted") simdata <- function(n, ...) { w1 <- rnorm(n) # covariates w2 <- rnorm(n) # ... a <- rbinom(n, 1, plogis(-1 + w1)) # treatment indicator y <- exp(- (w1 - 1)**2 - (w2 - 1)**2) - # continuous response 2 * exp(- (w1 + 1)**2 - (w2 + 1)**2) * a + # additional effect in treated rnorm(n, sd=0.5**.5) data.frame(y, a, w1, w2) } set.seed(1) d <- simdata(5e3) head(d)
wnew <- seq(-3,3, length.out=200) dnew <- expand.grid(w1 = wnew, w2 = wnew, a = 1) y <- with(dnew, exp(- (w1 - 1)**2 - (w2 - 1)**2) - 2 * exp(- (w1 + 1)**2 - (w2 + 1)**2)*a ) image(wnew, wnew, matrix(y, ncol=length(wnew)), col=viridisLite::viridis(64), main=expression(paste("E(Y|",W[1],",",W[2],")")), xlab=expression(W[1]), ylab=expression(W[2]))
Methods for targeted and semiparametric inference rely on fitting nuisance
models to observed data when estimating the target parameter of interest. The
{targeted} package implements the R6 reference class
learner to harmonize common statistical and machine learning models for the
usage as nuisance models across the various implemented estimators, such as the
targeted:cate function. Commonly used models are constructed as learner
class objects through the learner_* functions.
As an example, we can specify a linear regression model with an interaction term between treatment and the two covariates $W_1$ and $W_2$
lr <- learner_glm(y ~ (w1 + w2)*a, family = gaussian) lr
To fit the model to the data we use the estimate method
lr$estimate(d) lr$fit
Predictions, $E(Y\mid W_1, W_2)$, can be performed with the predict method
head(d) |> lr$predict()
pr <- matrix(lr$predict(dnew), ncol=length(wnew)) image(wnew, wnew, pr, col=viridisLite::viridis(64), main=expression(paste("E(Y|",W[1],",",W[2],")")), xlab=expression(W[1]), ylab=expression(W[2]))
Similarly, a Random Forest can be specified with
lr_rf <- learner_grf(y ~ w1 + w2 + a, num.trees = 500)
Lists of models can also be constructed for different hyper-parameters with the
learner_expand_grid function.
To assess the model generalization error we can perform $k$-fold cross-validation with the cv method
mod <- list(glm = lr, rf = lr_rf) cv(mod, data = d, rep = 2, nfolds = 5) |> summary()
An ensemble learner (super-learner) can easily be constructed from lists of learner objects
sl <- learner_sl(mod, nfolds = 10) sl$estimate(d) sl
pr <- matrix(sl$predict(dnew), ncol=length(wnew)) image(wnew, wnew, pr, col=viridisLite::viridis(64), main=expression(paste("E(Y|",W[1],",",W[2],")")), xlab=expression(W[1]), ylab=expression(W[2]))
In the following we are interested in estimating the target parameter $\psi_a(P) = E_P[Y(a)]$, where $Y(a)$ is the potential outcome we would have observed if treatment $a$ had been administered, possibly contrary to the actual treatment that was observed, i.e., $Y = Y(A)$. To assess the treatment effect we can then the consider the average treatment effect (ATE) $$E_P[Y(1)]-E_P[Y(0)],$$ or some other contrast of interest $g(\psi_1(P), \psi_0(P))$. Under the following assumptions
1) Stable Unit Treatment Values Assumption (the treatment of a specific subject is not affecting the potential outcome of other subjects) 2) Positivity, $P(A\mid W)>\epsilon$ for some $\epsilon>0$ and baseline covariates $W$ 3) No unmeasured confounders, $Y(a)\perp !!! \perp A|W$
then the target parameter can be identified from the observed data distribution as $$E(E[Y|W,A=a]) = E(E[Y(a)|W]) = E[Y(a)]$$ or $$E[Y I(A=a)/P(A=a|W)] = E[Y(a)].$$
This suggests estimators based on outcome regression ($g$-computation) or inverse probability weighting. More generally, under the above assumption we can constructor a one-step estimator from the Efficient Influence Function combining these two $$ E\left[\frac{I(A=a)}{\Pi_a(W)}(Y-Q(W,A)) + Q(W,a)\right].$$\ In practice, this requires plugin estimates of both the outcome model, $Q(W,A) := E(Y\mid A, W)$, and of the treatment propensity model $\Pi_a(W) := P(A=a\mid W)$. The corresponding estimator is consistent even if just one of the two nuisance models is correctly specified.
First we specify the propensity model
prmod <- learner_glm(a ~ w1 + w2, family=binomial)
We will reuse one of the outcome models from the previous section, and use the
cate function to estimate the treatment effect
a <- cate(response.model = lr_rf, propensity.model = prmod, data = d, nfolds = 5) a
In the output we get estimates of both the mean potential outcomes and the
difference of those, the average treatment effect, given as the term (Intercept).
summary(a)
Here we use the nfolds=5 argument to use 5-fold cross-fitting to guarantee
that the estimates converges weakly to a Gaussian distribution even though
that the estimated influence function based on plugin estimates from the Random
Forest does not necessarily lie in a $P$-Donsker class.
We use the dev branch for development and the main branch for stable
releases. All releases follow semantic versioning, are
tagged and notable
changes are reported in
NEWS.md.
If you want to ask questions, require help or clarification, or report a bug, we recommend to either contact a maintainer directly or the following:
We will then take care of the issue as soon as possible.
targetedAll types of contributions are encouraged and valued. See the CONTRIBUTING.md for details about how to contribute code to this project.
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.