sg.cvtmle | R Documentation |
This function uses a cross-validated targeted minimum loss-based estimator (CV-TMLE) to evaluate the impact of treating the optimal subgroup versus following a user-specified static treatment strategy.
sg.cvtmle(W, A, Y, SL.library, Delta = rep(1,length(A)), OR.SL.library = SL.library,
prop.SL.library = SL.library, missingness.SL.library = SL.library, txs = c(0, 1),
baseline.probs = c(0.5, 0.5), kappa = 1, g0 = NULL, Q0 = NULL,
family = binomial(), sig.trunc = 1e-10, alpha = 0.05,
num.folds = 10, num.SL.rep = 5, SL.method = "method.NNLS2",
num.est.rep = 5, id = NULL, folds = NULL, obsWeights = NULL,
stratifyCV = FALSE, RR = FALSE, lib.ests = FALSE,
init.ests.out = FALSE, init.ests.in = NULL, verbose = TRUE, ...)
W |
data frame with observations in the rows and baseline covariates used to form the subgroup in columns. |
A |
numeric treatment vector. Treatments of interest specified using the |
Y |
real-valued outcome for which large values are preferred (if using the relative risk contrast, then Y should be an indicator of the absence of an adverse event, and the relative risk returned is the relative risk of the adverse event). |
SL.library |
SuperLearner library (see documentation for |
Delta |
Vector of the same length as |
OR.SL.library |
SuperLearner library (see documentation for |
prop.SL.library |
SuperLearner library (see documentation for |
missingness.SL.library |
SuperLearner library (see documentation for |
txs |
A vector indicating the two or more treatments of interest in A that will be used for the treatment assignment problem. The treatments in |
baseline.probs |
A vector of the same lengths as txs indicating the (stochastic) treatment rule to use as a baseline when evaluating performance of the estimated optimal treatment rule. In this treatment rule, the |
kappa |
maximum allowable probability of treating a randomly drawn individual in the population with the first treatment in |
g0 |
if known (as in a randomized controlled trial), a matrix of probabilities of receiving the treatment corresponding to entry |
Q0 |
a user-supplied list of matrices of estimates of the mean outcome of |
family |
|
sig.trunc |
value at which the standard deviation estimate is truncated. |
alpha |
confidence level for returned confidence interval set to (1-alpha)*100%. |
num.folds |
number of folds to use in cross-validation step of the CV-TMLE. |
num.SL.rep |
number of super-learner repetitions (increasing this number should make the algorithm more stable across seeds). |
SL.method |
method that the SuperLearner function uses to select a convex combination of learners |
num.est.rep |
number of repetitions of estimator, minimizing variation over cross-validation fold assignment (increasing this number should make the algorithm more stable across seeds) |
id |
optional cluster identification variable. Ensures that rows with the same id remain in the same validation fold each time cross-validation is used |
folds |
folds to be used when performing cross-validation step of the CV-TMLE. Should be in the same format as the output of |
obsWeights |
observation weights |
stratifyCV |
stratify validation folds by event counts (does this for estimation of outcome regression, treatment mechanism, and conditional average treatment effect function). Useful for rare outcomes |
RR |
estimates the relative risk (TRUE) or the additive contrast (FALSE) between the mean outcome under the optimal rule and the mean outcome when randomizing treatment via a fair coin toss. For relative risk, estimates the relative risk of Y not occurring (since throughout we assume Y is beneficial) |
lib.ests |
Also return estimates based on candidate optimal rule estimates in the super-learner library |
init.ests.out |
Set this option to TRUE to return the initial SuperLearner estimates. Can be fed to a new call of this function using init.ests.in to speed up that call. E.g., useful if want to call this function at many values of |
init.ests.in |
Can be used to feed the function the initial SuperLearner estimates from a previous call of this function (see |
verbose |
give status updates |
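The role of kappa can be illustrated with a minimal base R sketch. It is not the package's implementation: it assumes a two-treatment setting, a hypothetical vector blip of estimated individual benefits from the constrained treatment, and a simple thresholding rule that treats only individuals whose estimated benefit is positive and lies in the top kappa fraction.

```r
# Illustrative sketch only; `blip` is simulated and hypothetical,
# not an output of sg.cvtmle.
set.seed(1)
n <- 1000
blip <- rnorm(n)   # stand-in for estimated individual treatment benefits
kappa <- 0.25      # at most 25 percent may receive the constrained treatment

# Treat only if the benefit is positive and above the (1 - kappa) quantile
cutoff <- max(0, quantile(blip, probs = 1 - kappa))
treat <- as.numeric(blip > cutoff)

# Proportion treated never exceeds kappa under this rule
mean(treat)
```

With kappa = 1 the cutoff reduces to zero whenever some estimated benefits are negative, so the constraint is inactive and everyone with a positive estimated benefit is treated.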
CV-TMLE to evaluate the impact of treating the optimal subgroup versus following a user-specified static treatment strategy.
Coverage of the upper confidence bound relies on being able to estimate the optimal subgroup well in terms of mean outcome (see the cited papers).
We do not have any theoretical justification for the CV-TMLE confidence interval when the treatment effect falls on the decision boundary (zero) with positive probability, though we have seen that it performs well in simulations.
a list containing
est |
Vector containing estimates of the impact of treating the optimal subgroup. Items in the vector correspond to different choices of algorithms for estimating the optimal treatment rule (if |
ci |
Matrix containing confidence intervals for the impact of treating the optimal subgroup. Left column contains lower bounds, right column contains upper bounds. Rows correspond to different choices of algorithms for estimating the optimal treatment rule (if |
est.mat |
Estimates across repetitions. |
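As a rough illustration of how alpha and sig.trunc could enter a confidence interval like the one in ci, here is a base R sketch. It assumes a Wald-type interval built from a standard deviation estimate truncated below at sig.trunc; the helper wald_ci and its inputs are hypothetical and are not part of the package.

```r
# Hypothetical helper, not part of the sg package: a Wald-type interval
# with the standard deviation estimate truncated below at sig.trunc.
wald_ci <- function(est, sd.est, n, alpha = 0.05, sig.trunc = 1e-10) {
  sd.trunc <- max(sd.est, sig.trunc)             # avoid a zero-width interval
  half <- qnorm(1 - alpha / 2) * sd.trunc / sqrt(n)
  c(lower = est - half, upper = est + half)      # (1 - alpha) * 100% interval
}

# Placeholder inputs chosen only to exercise the function
ci <- wald_ci(est = 0.1, sd.est = 0.5, n = 500)
ci
```

The truncation matters only when the standard deviation estimate is extremely small, in which case the interval would otherwise collapse to a point.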
“Evaluating the Impact of Treating the Optimal Subgroup,” technical report to be released soon.
M. J. van der Laan and A. R. Luedtke, “Targeted learning of the mean outcome under an optimal dynamic treatment rule,” Journal of Causal Inference, vol. 3, no. 1, pp. 61-95, 2015.
SL.library = c('SL.mean','SL.glm')
Qbar = function(a,w){plogis((a==1)*w$W1 - (a==2)*w$W2 + (a==0))}
n = 500
W = data.frame(W1=rnorm(n),W2=rnorm(n),W3=rnorm(n),W4=rnorm(n))
A = rbinom(n,1,1/2) + rbinom(n,1,1/2)
Y = rbinom(n,1,Qbar(A,W))
# comparing the mean outcome under the optimal rule to the mean outcome
# when treating half of the population at random
sg.cvtmle(W,A,Y,baseline.probs=c(0.5,0.5),SL.library=SL.library,num.SL.rep=2,num.folds=5,family=binomial())
# comparing against treating half of the population at random with treatment 0 or 1,
# now allowing three treatments and adding ids (used in CV splits) and observation weights
sg.cvtmle(W,A,Y,SL.library=SL.library,txs=c(0,1,2),baseline.probs=c(0.5,0.5,0),num.SL.rep=2,num.folds=5,family=binomial(),id=rep(1:(n/2),2),obsWeights=1+3*runif(n))
# comparing the mean outcome under the optimal rule against the mean outcome under treating no one
# when only treatments 0 or 1 can be assigned
sg.cvtmle(W,A,Y,baseline.probs=c(1,0),txs=c(0,1),SL.library=SL.library,num.SL.rep=2,num.folds=5,family=binomial(),sig.trunc=0.001)
# comparing the mean outcome under an optimal rule that treats at most 25 percent of people
# with treatment 0 to the mean outcome under treating 25 percent of people at random
sg.cvtmle(W,A,Y,baseline.probs=c(0.25,0.375,0.375),SL.library=SL.library,txs=c(0,1,2),num.SL.rep=2,num.folds=5,kappa=0.25,family=binomial())
# estimating the mean outcomes under optimal rules that treat at most a proportion prop of people
# with treatment 0
out_10 = sg.cvtmle(W,A,Y,txs=c(0,1,2),baseline.probs=c(0,0,0),SL.library=SL.library,num.SL.rep=2,num.folds=5,kappa=0.10,family=binomial(),init.ests.out=TRUE)
init.ests = out_10$init.ests
for(prop in seq(0.10,0.7,by=0.2)){
print(paste0("Can treat a ",prop," proportion of population with treatment 0."))
out = sg.cvtmle(W,A,Y,txs=c(0,1,2),baseline.probs=c(0,0,0),SL.library=SL.library,num.SL.rep=2,num.folds=5,kappa=prop,family=binomial(),init.ests.out=FALSE,init.ests.in=init.ests,verbose=FALSE)
print(out$est)
}
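The id argument in the examples above keeps rows that share an id in the same validation fold. A minimal base R sketch of that idea follows. It is an assumption about the general technique, not the package's internal code, and it does not produce an object in the format expected by the folds argument.

```r
# Illustrative sketch only: assign folds at the cluster level so that
# all rows sharing an id land in the same validation fold.
set.seed(1)
n <- 500
id <- rep(1:(n / 2), 2)   # each id appears twice, as in the example above
num.folds <- 5

ids <- unique(id)
# Randomly assign each cluster (not each row) to a fold
fold.of.id <- sample(rep(seq_len(num.folds), length.out = length(ids)))
fold <- fold.of.id[match(id, ids)]   # map the cluster folds back to rows

# Check: every id maps to exactly one fold
same.fold <- tapply(fold, id, function(f) length(unique(f)) == 1)
all(same.fold)
```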