Targeted maximum likelihood estimation of parameters of a marginal structural model, and of marginal treatment effects of a binary point treatment on an outcome. In addition to the additive treatment effect, risk ratio and odds ratio estimates are reported for binary outcomes. The tmle function is generally called with arguments (Y,A,W), where Y is a continuous or binary outcome variable, A is a binary treatment variable (A=1 for treatment, A=0 for control), and W is a matrix or dataframe of baseline covariates. The population mean outcome is calculated when there is no variation in A. If values of binary mediating variable Z are supplied, estimates are returned at each level of Z. Missingness in the outcome is accounted for in the estimation procedure if missingness indicator Delta is 0 for some observations. Repeated measures can be identified using the id argument.
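To make the three reported effect measures concrete, the following minimal sketch shows how the additive effect, risk ratio, and odds ratio relate to two marginal mean outcomes for a binary Y. The names `mu1` and `mu0` and their values are made up for illustration; this only spells out the definitions of the parameters and is not the targeted estimation procedure itself.

```r
# Hypothetical marginal mean outcomes for a binary Y (illustrative values only)
mu1 <- 0.30   # assumed mean outcome if everyone received treatment (A = 1)
mu0 <- 0.20   # assumed mean outcome if everyone received control (A = 0)

ATE <- mu1 - mu0                               # additive treatment effect
RR  <- mu1 / mu0                               # risk ratio
OR  <- (mu1 / (1 - mu1)) / (mu0 / (1 - mu0))   # odds ratio

print(round(c(ATE = ATE, RR = RR, OR = OR), 3))
```

With these assumed inputs the additive effect is 0.1 and the risk ratio is 1.5, while the odds ratio is somewhat larger than the risk ratio, as is typical when the outcome is not rare.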
tmle(Y, A, W, Z=NULL, Delta = rep(1,length(Y)), Q = NULL, Q.Z1 = NULL, Qform = NULL,
Qbounds = NULL, Q.SL.library = c("SL.glm", "tmle.SL.dbarts2", "SL.glmnet"),
cvQinit = TRUE, g1W = NULL, gform = NULL,
gbound = 5/sqrt(length(Y))/log(length(Y)), pZ1=NULL,
g.Zform = NULL, pDelta1 = NULL, g.Deltaform = NULL,
g.SL.library = c("SL.glm", "tmle.SL.dbarts.k.5", "SL.gam"),
g.Delta.SL.library = c("SL.glm", "tmle.SL.dbarts.k.5", "SL.gam"),
family = "gaussian", fluctuation = "logistic", alpha = 0.9995, id=1:length(Y), V = 5,
verbose = FALSE, Q.discreteSL=FALSE, g.discreteSL=FALSE, g.Delta.discreteSL=FALSE,
prescreenW.g=TRUE, min.retain = 2, RESID=FALSE, target.gwt = TRUE, automate=FALSE)

Y 
continuous or binary outcome variable 
A 
binary treatment indicator (1 for treatment, 0 for control)
W 
vector, matrix, or dataframe containing baseline covariates 
Z 
optional binary indicator for intermediate covariate for controlled direct effect estimation 
Delta 
indicator of missing outcome or treatment assignment. 
Q 
optional nx2 matrix of initial values for Q portion of the likelihood, (E(Y|A=0,W), E(Y|A=1,W)) 
Q.Z1 
optional nx2 matrix of initial values for Q portion of the likelihood, (E(Y|Z=1,A=0,W), E(Y|Z=1,A=1,W)). (When specified, values for E(Y|Z=0,A=0,W), E(Y|Z=0,A=1,W) are passed in using the 
Qform 
optional regression formula for estimation of E(Y|A,W), suitable for call to 
Qbounds 
vector of upper and lower bounds on 
Q.SL.library 
optional vector of prediction algorithms to use for 
cvQinit 
logical, if 
g1W 
optional vector of conditional treatment assignment probabilities, P(A=1|W) 
gform 
optional regression formula of the form 
gbound 
value between (0,1) for truncation of predicted probabilities. See 
pZ1 
optional nx2 matrix of conditional probabilities P(Z=1|A=0,W), P(Z=1|A=1,W) 
g.Zform 
optional regression formula of the form 
pDelta1 
optional matrix of conditional probabilities for missingness mechanism, nx2 when 
g.Deltaform 
optional regression formula of the form 
g.SL.library 
optional vector of prediction algorithms to use for 
g.Delta.SL.library 
optional vector of prediction algorithms to use for 
family 
family specification for working regression models, generally ‘gaussian’ for continuous outcomes (default), ‘binomial’ for binary outcomes 
fluctuation 
‘logistic’ (default), or ‘linear’ 
alpha 
used to keep predicted initial values bounded away from (0,1) for logistic fluctuation 
id 
optional subject identifier 
V 
Number of cross-validation folds for estimating Q, and for super learner estimation of g 
verbose 
status messages printed if set to 
Q.discreteSL 
if TRUE, discreteSL is used instead of ensemble SL. Ignored when SL not used to estimate Q 
g.discreteSL 
if TRUE, discreteSL is used instead of ensemble SL. Ignored when SL not used to estimate g1W 
g.Delta.discreteSL 
if TRUE, discreteSL is used instead of ensemble SL. Ignored when SL not used to estimate P(Delta = 1 | A,W) 
prescreenW.g 
Screen covariates before estimating g in order to retain only those associated with Stage 1 residuals 
min.retain 
Minimum number of covariates to retain when prescreening covariates for g. Ignored when prescreenW.g=FALSE 
RESID 
Flag indicating whether to retain covariates associated with the outcome (RESID=FALSE), or associated only with the residuals from the outcome regression (RESID=TRUE). Ignored when prescreenW.g=FALSE 
target.gwt 
When TRUE, move g from denominator of clever covariate to the weight when fitting epsilon 
automate 
When TRUE, all tuning parameters are set to their default values. Number of cross-validation folds and truncation level for g are set data-adaptively based on sample size (see details). 
gbounds
Lower bound defaults to lb = 5/sqrt(n)/log(n). For treatment effect estimates and population mean outcome the upper bound defaults to 1. For ATT and ATC, the upper bound defaults to 1 - lb.
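As a rough illustration of these defaults, the sketch below evaluates lb = 5/sqrt(n)/log(n) for one sample size and truncates a vector of predicted probabilities, first to (lb, 1) and then to (lb, 1 - lb) as described for ATT and ATC. The helper name `default_lb` and the example probabilities are hypothetical, and truncating with `pmin`/`pmax` is an assumption about one reasonable way to apply the bounds, not a statement of how the package implements it.

```r
# Hypothetical helper: default lower bound on g as a function of sample size
default_lb <- function(n) 5 / sqrt(n) / log(n)

n  <- 250
lb <- default_lb(n)          # roughly 0.057 at n = 250

# Made-up predicted treatment probabilities, some of them extreme
g1W <- c(0.001, 0.03, 0.5, 0.999)

# Bounds (lb, 1): used for treatment effect and population mean estimates
g1W_trunc <- pmin(pmax(g1W, lb), 1)

# Bounds (lb, 1 - lb): used for ATT and ATC
g1W_trunc_att <- pmin(pmax(g1W, lb), 1 - lb)

print(rbind(g1W, g1W_trunc, g1W_trunc_att))
```

Because lb shrinks as n grows, larger samples allow more extreme predicted probabilities to pass through untruncated.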
W may contain factors. These are converted to indicators via a call to model.matrix.
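For readers unfamiliar with this conversion, here is a small base R sketch of what model.matrix does to a factor column. The data frame and its column names are invented for the example, and the formula `~ .` is an assumption; the exact call made inside tmle is not shown on this page.

```r
# Made-up baseline covariates: one numeric column and one three-level factor
W <- data.frame(
  age  = c(34, 51, 47, 29),
  site = factor(c("a", "b", "c", "a"))
)

# Expand the factor into 0/1 indicator columns (default treatment contrasts)
X <- model.matrix(~ ., data = W)

print(colnames(X))
print(X)
```

Under R's default treatment contrasts the first factor level ("a" here) becomes the reference level, so a three-level factor contributes two indicator columns alongside the intercept.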
Controlled direct effects are estimated when binary covariate Z is non-null. The tmle function returns an object of class tmle.list, a list of two items of class tmle. The first corresponds to estimates obtained when Z is fixed at 0, the second corresponds to estimates obtained when Z is fixed at 1.
When automate = TRUE the sample size determines the number of cross-validation folds, V. The effective sample size is n.effective = n for continuous Y, and 5 * size of minority class for binary Y. When n.effective <= 30, V = n.effective; when 30 < n.effective <= 500, V = 20; when 500 < n.effective <= 1000, V = 10; when 1000 < n.effective <= 10000, V = 5; otherwise V = 2. Bounds on g are set to (5/sqrt(n)/log(n), 1), except for ATT and ATC, where the upper bound is 1 - lower bound.
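The fold-selection rule can be written out as a short function. The sketch below is a reading of the thresholds stated above; the helper names `effective_n` and `choose_V` are hypothetical and are not functions exported by the package.

```r
# Hypothetical helper: effective sample size as described in the text
effective_n <- function(Y, binary = FALSE) {
  if (binary) {
    5 * min(sum(Y == 1), sum(Y == 0))   # 5 * size of the minority class
  } else {
    length(Y)                            # n for continuous Y
  }
}

# Hypothetical helper: number of cross-validation folds from n.effective
choose_V <- function(n.effective) {
  if (n.effective <= 30) {
    n.effective
  } else if (n.effective <= 500) {
    20
  } else if (n.effective <= 1000) {
    10
  } else if (n.effective <= 10000) {
    5
  } else {
    2
  }
}

# Binary outcome with 10 events among 100 observations
Y_bin <- c(rep(1, 10), rep(0, 90))
n_eff <- effective_n(Y_bin, binary = TRUE)
print(c(n.effective = n_eff, V = choose_V(n_eff)))
```

The rule uses more folds when there is less information, so a rare binary outcome can lead to a larger V than the raw sample size alone would suggest.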
estimates 
list with elements EY1 (population mean), ATE (additive treatment effect), ATT (additive treatment effect among the treated), ATC (additive treatment effect among the controls), RR (relative risk), OR (odds ratio). Each element in the estimates of these is itself a list containing

Qinit 
initial estimate of 
Qstar 
targeted estimate of 
g 
treatment mechanism estimate. A list with four items: 
g.Z 
intermediate covariate assignment estimate (when applicable). A list with four items: 
g.Delta 
missingness mechanism estimate. A list with four items: 
gbound 
bounds used to truncate g 
gbound.ATT 
bounds used to truncate g for ATT and ATC estimation 
W.retained 
names of covariates used to model the components of g 
Susan Gruber sgruber@cal.berkeley.edu, in collaboration with Mark van der Laan.
1. Gruber, S. and van der Laan, M.J. (2012), tmle: An R Package for Targeted Maximum Likelihood Estimation. Journal of Statistical Software, 51(13), 1-35. https://www.jstatsoft.org/v51/i13/
2. Gruber, S. and van der Laan, M.J. (2009), Targeted Maximum Likelihood Estimation: A Gentle Introduction. U.C. Berkeley Division of Biostatistics Working Paper Series. Working Paper 252. https://biostats.bepress.com/ucbbiostat/paper252/
3. Gruber, S. and van der Laan, M.J. (2010), A Targeted Maximum Likelihood Estimator of a Causal Effect on a Bounded Continuous Outcome. The International Journal of Biostatistics, 6(1), 2010.
4. Rosenblum, M. and van der Laan, M.J. (2010), Targeted Maximum Likelihood Estimation of the Parameter of a Marginal Structural Model. The International Journal of Biostatistics, 6(2), 2010.
5. van der Laan, M.J. and Rubin, D. (2006), Targeted Maximum Likelihood Learning. The International Journal of Biostatistics, 2(1). https://biostats.bepress.com/ucbbiostat/paper252/
6. van der Laan, M.J., Rose, S., and Gruber, S., editors, (2009), Readings in Targeted Maximum Likelihood Estimation. U.C. Berkeley Division of Biostatistics Working Paper Series. Working Paper 254. https://biostats.bepress.com/ucbbiostat/paper254/
7. van der Laan, M.J. and Gruber, S. (2016), One-Step Targeted Minimum Loss-based Estimation Based on Universal Least Favorable One-Dimensional Submodels. The International Journal of Biostatistics, 12(1), 351-378.
summary.tmle, estimateQ, estimateG, calcParameters, oneStepATT, tmleMSM, calcSigma
library(tmle)
set.seed(1)
n <- 250
W <- matrix(rnorm(n*3), ncol=3)
A <- rbinom(n,1, 1/(1+exp(-(.2*W[,1] - .1*W[,2] + .4*W[,3]))))
Y <- A + 2*W[,1] + W[,3] + W[,2]^2 + rnorm(n)
# Example 1. Simplest function invocation
# SuperLearner called to estimate Q, g
# Delta defaults to 1 for all observations
## Not run:
result1 <- tmle(Y,A,W)
summary(result1)
## End(Not run)
# Example 2:
# User-supplied regression formulas to estimate Q and g
# binary outcome
n <- 250
W <- matrix(rnorm(n*3), ncol=3)
colnames(W) <- paste("W",1:3, sep="")
A <- rbinom(n,1, plogis(0.6*W[,1] +0.4*W[,2] + 0.5*W[,3]))
Y <- rbinom(n,1, plogis(A + 0.2*W[,1] + 0.1*W[,2] + 0.2*W[,3]^2 ))
result2 <- tmle(Y,A,W, family="binomial", Qform=Y~A+W1+W2+W3, gform=A~W1+W2+W3)
summary(result2)
## Not run:
# Example 3: Population mean outcome
# User-supplied (misspecified) model for Q,
# Super learner called to estimate g, g.Delta
# V set to 2 for demo, not recommended at this sample size
# approx. 20
Y <- W[,1] + W[,2]^2 + rnorm(n)
Delta <- rbinom(n, 1, 1/(1+exp(-(1.7-1*W[,1]))))
result3 <- tmle(Y,A=NULL,W, Delta=Delta, Qform="Y~A+W1+W2+W3", V=2)
print(result3)
# Example 4: Controlled direct effect
# User-supplied models for g, g.Z
# V set to 2 for demo, not recommended at this sample size
A <- rbinom(n,1,.5)
Z <- rbinom(n, 1, plogis(.5*A + .1*W[,1]))
Y <- 1 + A + 10*Z + W[,1]+ rnorm(n)
CDE <- tmle(Y,A,W, Z, gform="A~1", g.Zform = "Z ~ A + W1", V=2)
print(CDE)
total.effect <- tmle(Y,A, W, gform="A~1")
print(total.effect)
## End(Not run)
