Description Usage Arguments Details Value IPTW estimator TMLE estimator GCOMP estimator Modeling P(A | W, E) for covariates (A, W, E) Three methods for choosing bin (interval) locations for a univariate and continuous variable A See Also Examples
View source: R/tmleCommunity.R
Estimate the marginal treatment effect among independent communities (or i.i.d units if no hierarchical structure) using TMLE (targeted maximum likelihood estimation). It supports two different TMLEs that are based on community-level and individual-level analysis, respectively. The individual-level TMLE cooperates with additional working assumptions and has potential efficiency gain. It also provide corresponding IPTW (the inverse-probability-of-treatment or Horvitz-Thompson) and GCOMP (parametric G-computation).
| 1 2 3 4 5 6 7 8 9 10 11 12 | tmleCommunity(data, Ynode, Anodes, WEnodes, YnodeDet = NULL,
  obs.wts = c("equal.within.pop", "equal.within.community"),
  community.step = c("NoCommunity", "community_level",
  "individual_level", "perCommunity"), communityID = NULL,
  community.wts = c("size.community", "equal.community"),
  pooled.Q = FALSE, f_g0 = NULL, f_gstar1, f_gstar2 = NULL,
  Qform = NULL, Qbounds = NULL, alpha = 0.995,
  fluctuation = "logistic", hform.g0 = NULL, hform.gstar = NULL,
  lbound = 0.005, h.g0_GenericModel = NULL,
  h.gstar_GenericModel = NULL, TMLE.targetStep = c("tmle.intercept",
  "tmle.covariate"), n_MCsims = 1, CI_alpha = 0.05, rndseed = NULL,
  verbose = getOption("tmleCommunity.verbose"))
 | 
| data | Observed data,  | 
| Ynode | Column name or index in  | 
| Anodes | Column names or indices in  | 
| WEnodes | Column names or indices in  | 
| YnodeDet | Optional column name or index in  | 
| obs.wts | Optional choice to provide/ construct a vector of individual-level observation (sampling) weights (of length  | 
| community.step | Methods to deal with hierarchical data, one of  | 
| communityID | Optional column name or index in  | 
| community.wts | Optional choice to provide/ construct a matrix of community-level observation weights (where dimension = J\times2, where 
J = the number of communities). The first column contains the identifiers / names of communities (ie.,  | 
| pooled.Q | Logical for incorporating hierarchical data to estimate the outcome mechanism. If  | 
| f_g0 | Optional function used to specify model knowledge about value of Anodes. It estimates P(A | W, E) under  | 
| f_gstar1 | Either a function or a vector or a matrix/ data frame of counterfactual exposures, dependin on the number of exposure variables.
If a matrix/ data frame, its number of rows must be either  | 
| f_gstar2 | Either a function or a vector or a matrix/ data frame of counterfactual exposures, dependin on the number of exposure variables. It has the same components and requirements as f_gstar1 has. | 
| Qform | Character vector of regression formula for Ynode. If not specified (i.e.,  | 
| Qbounds | Vector of upper and lower bounds on Y and predicted value for initial Q. Default to the range of Y, widened by 10% of the min and max values. See "Details". | 
| alpha | Used to keep predicted values for initial Q bounded away from (0,1) for logistic fluctuation 
(set  | 
| fluctuation | Default to "logistic", it could also be "linear" (for targeting step). | 
| hform.g0 | Character vector of regression formula for estimating the conditional density of P(A | W, E) under observed treatment mechanism
g0. If not specified, its form will be  | 
| hform.gstar | Character vector of regression formula for estimating the conditional density P(A | W, E) under user-supplied interventions f_gstar1 or f_gstar2. If not specified, it use the same regression formula as used in hform.g0. | 
| lbound | Value between (0,1) for truncation of predicted P(A | W, E). Default to 0.005 | 
| h.g0_GenericModel | An object of  | 
| h.gstar_GenericModel | An object of  | 
| TMLE.targetStep | TMLE targeting step method, either "tmle.intercept" (Default) or "tmle.covariate". See "Details". | 
| n_MCsims | Number of simulations for Monte-Carlo analysis. Each simulation generates new exposures under f_gstar1 or f_gstar2 (if specified) or f_g0 (if specified), with a sample size of nrow(data). Then these generated expsosures are used when fitting the conditional densities P(A | W, E). and estimating for IPTW and GCOMP under intervention f_gstar1 or f_gstar2. Note that deterministic intervention only needs one simulation and stochastic intervention could use more simulation times such as 10 (Default to 1). | 
| CI_alpha | Significance level (alpha) used in constructing a confidence interval. Default to 0.05. | 
| rndseed | Random seed for controlling sampling A under f_gstar1 or f_gstar2 (for reproducibility of Monte-Carlo simulations) | 
| verbose | Flag. If  | 
The estimates returned by tmleCommunity are of a treatment-specific mean, E[Y_{g^*}], the expected community-level outcome if all 
communities in the target population received the exposures generated under the user-supplied (deterministic or stochastic) intervention g^*. 
data must be a data frame (no matrix accepted) and does not support factor values (considering removing or recording such columns as strings), 
data includes the following (optional) columns:
 community-level and individual-level baseline covariate columns (WEnodes): can be any numeric data. Notice that W represents 
individual-level covariates and E represent community-level covariates. 
 exposure columns (Anodes): can have more than one exposure and each exposure could be can be either binary, categorical or continuous. 
 outcome column (Ynode): can be any numeric data. If Ynode values are continuous, they may be automatically scaled. 
See details for Qbounds below.  
 deterministic Ynode indicator column (YnodeDet): (optional) column that must be logical or binary. Only non-determinstic  
Ynode values will be used in the final estimation step (e.g., IPTW[!determ.Q] <- Y[!determ.Q] * h_wts[!determ.Q]). 
 community identifier variable column (communityID): (optional) column that stores community identifier for hierarchical data 
(or subject identifier if multiple observations for the same individual). Integer or character recommended (No factor allowed). 
NULL means no hierarchical structure and all distinct individuals.
community.step specifies how to read the structure of the input data (as hierarchical or non-hierarchical) and how to analyze the data. 
community_level TMLE ("community_level") is exactly analogous to the TMLE of the treatment specific individual-level mean outcome 
("NoCommunity") with the trivial modification that the community rather than the individual serves as the unit of analysis.  
communityID is needed when using "community_level", "individual_level" and "perCommunity". Lack of  
communityID forces the algorithm to automatically pool data over all communities and treat it as non-hierarchical dataset (so force   
community.step = "NoCommunity"). If community.step = "individual_level", it incorporates with working models that 
assume that each individual's outcome is known not to be affected by the covariates of other individuals in the same community (i.e., "no   
covariate interference"). This strong assumption can be relaxed by integrating knowledge of the dependence structure among individuals  
within communities (i.e., "weak covariate interference"). But currrently only the "no covariate interference" assumption is implemented. 
If community.step = "perCommunity"), run a single TMLE on each community and calculate a (weighted) mean outcome for the J  
communities.
pooled.Q is in regard to incorporate the working model of "no covariate inference" in community-level TMLE (and the 
corresponding IPTW and GCOMP) although the working model is not assumed to hold. In other words, when community.step 
= "community_level", if pooled.Q = TRUE, add pooled individual-level regressions as candidates in the Super Learner  
library for initial estimation of the outcome mechanism. If pooled.Q = FALSE, both outcome and treatment mechanisms are
estimated on the community-level (no use of individual-level information). 
Qform should be NULL, in which cases all parent nodes of Y node will be used as regressors, or a character vector that can  
be coerced to class "formula". If Qestimator (an argument in tmleCom_Options) is "speedglm__glm" (or  
"glm__glm"), then speedglm (or glm) will be called using the components of Qform. If Qestimator 
is "SuperLearner", then SuperLearner will be called after a data frame is created using Qform, based on the specified
algorithms in SL.library (an argument in tmleCom_Options); 
If Qestimator is "h2o__ensemble", then h2o and h2oEnsemble will be called after a H2OFrame dataset is 
creating using Qform, based on specified algorithms in h2olearner
and h2ometalearner. See "Arguments" in tmleCom_Options.
hform.g0 and hform.gstar should also be NULL, in which cases all parent nodes of A node(s) will be used as regressors, or 
a character vector that can be coerced to class "formula". It follows the same rules applied to Qform except it's now based on
gestimator (an argument in tmleCom_Options). Besides, gestimator can be "sl3_pipelines". If gestimator is 
"sl3_pipelines", then learner$train will be called after a sl3 learner has been created by using make_learner. 
See "Arguments" in tmleCom_Options.
Qbounds can be used to bound continuous outcomes Y. If Qbounds not specified (NULL), it will be set to the range of Y, 
widened by 10% of the minimum and maximum. That is, [0.9*min(Y), 1.1*max(Y)]. If specified, then Y will be truncated at the min 
max values of Qbounds, and then scaled to be in [0, 1] by (Y - min(Qbound))/(diff(Qbound)). Statistical inferences  
and for the transformed outcome can be directly translated back into statistical inference for the unscaled outcome. Once Qbounds finish
bounding the observed outcomes, it will be set to (1 - alpha, alpha) and used to bound the predicted values for the initial outcome 
mechanism. Thus, alpha needs to be between (0, 1), otherwise reset to 0.995. Besides, lbound can be used to truncate the weights 
h_gstar/h_gN, that is, [0, 1/lbound].
TMLE.targetStep specifies how to use weights h_gstar/h_gN in the TMLE targeting step. If tmle.intercept (default),   
it performs the weighted intercept-based TMLE that runs a intercept-only weighted logistic regression using offsets logit(Qstar) and  
weights h_gstar/h_gN and so no covariate. If tmle.covariate, it performs the unweighted covariate-based TMLE that run a 
unweighted logistic regression using offsets logit(Qstar) and a clever covariate h_gstar/h_gN.
tmleCommunity returns an object of class "tmleCommunity", which is a named list containing the estimation results
stored by the following 3 elements:
EY_gstar1 - a list with estimates of the mean counterfactual outcome under (deterministic or stochastic) intervention  
function f_gstar1 (E[Y_{g^*_1}]).
EY_gstar2 - a list with estimates of the mean counterfactual outcome under (deterministic or stochastic) intervention   
function f_gstar2 (E[Y_{g^*_2}]); or NULL if f_gstar2 not specified.
ATE - a list with estimates of additive treatment effect (E[Y_{g^*_1}] - E[Y_{g^*_2}]) under two interventions
functions f_gstar1 VS f_gstar2; or NULL if f_gstar2 not specified.
Each element in the returned tmleCommunity object is itself a list containing the following 8 items:
estimates - matrix, 3\times1, storing 3 algorithm estimates of the target parameter (population community-level counterfactual 
mean under (deterministic or stochastic) intervention), including TMLE, IPTW and GCOMP.
vars - matrix, 3\times1, storing 3 influence-curve based asymptotic variance estimates for TMLE, IPTW and 
GCOMP. Notice, all IC-based statistical inference for GCOMP is not accurate (Just for reference). See explanation in IC.
CIs - matrix, 3\times2, storing 3 confidence interval estimates at CI_alpha level, for TMLE, IPTW and 
GCOMP. The first column contains the lower bounds and the second column contains the upper bounds.
tstat - matrix, 3\times1, storing 3 test statistics.
pval - matrix, 3\times1, storing 3 p-values. It's based on the Student's T distribution if the number of communities  
(or the number of individuals if no hierarchical structure) is less than 41, otherwise based on the Z normal distribution.
IC - data frame, nobs\times3, the first column contains the influence curves (ICs) for TMLE estimate, the second column
contains the ICs for IPTW estimate, and the third column contains the ICs for GCOMP estimate (not accurate since it is based on ICs
for TMLE estimate without updating step).  
h.g0_GenericModel - An object of GenericModel R6 class, storing the model fits for P(A | W, E) under  
observed exposure mechanism g0. This can be used in tmleCommunity (See Arguments).
h.gstar_GenericModel - An object of GenericModel R6 class, storing the model fits for P(A | W, E)  
under intervention f_gstar1 or f_gstar2. This can be used in tmleCommunity (See Arguments).
Estimations are based on either community-level or individual-level analysis. Each analysis currently implements 3 estimators:
tmle - Either weighted intercept-based TMLE based on weights h_gstar/h_gN (default choice)
or unweighted covariate-based TMLE based on a covariate h_gstar/h_gN.  
iptw - IPTW (Horvitz-Thompson) estimator based on the TMLE weights h_gstar/h_gN.
gcomp - Maximum likelihood based G-computation substitution estimator.
The IPTW estimator is based on the TMLE weights h_{g^*}(A^*,W,E)/h_{g}(A,W,E), which is equivalent to P_{g^*}(A^*|W,E)/P_{g}(A|W,E) 
and is defined as the the weighted average of the observed outcomes Y. The following algorithm shows a general template of the community-level 
IPTW: 
 As described in the following section ("Modeling P(A | W, E) for covariates (A, W, E)"), the first step is to construct an 
estimator P_{\hat{g}^{c}}(A | W, E) of the density for the common (in j) conditional distribution of A given W, E, that is 
P_{g_0^{c}}(A | W, E) for common (in j) community-level covariates (A, W, E). 
 Implementing the same modeling & fitting algorithm to construct an estimator P_{\hat{g}^{c*}}(A^* | W, E) of the density for  
the common (in j) conditional distribution of A^* given (W, E), that is P_{g_0^{c*}}(A^* | W, E) for common (in j)
community-level covariates (A^*, W, E) where A^* is determined by the user-supplied stochastic intervention f_gstar1 or 
f_gstar2, given the observed baseline covariates (W, E). 
Given observed J independent communities \textbf{O}_j = (E_j, \textbf{W}_j, A_j, Y^c_j: j = 1, ..., J), the IPTW estimator is given by:
ψ^{I}_{IPTW, n}=\frac{1}{J}∑_{j=1}^{J}Y^c_j\frac{P_{\hat{g}^{c*}}(A^*_j|\textbf{W}_j,E_j)}{P_{\hat{g}^{c*}}(A_j|\textbf{W}_j,E_j)}
For individual-level IPTW, it reads the input data as O_{i,j} = (E_j, W_{i,j}, A_j, Y_{i,j}: j = 1, ..., J; i = 1, ..., n_j) and incorporates working model that assumes no covariate interference, weighing each individual within one community by α_{i,j}, where the IPTW estimator is given by:
ψ^{II}_{IPTW, n}=\frac{1}{J}∑_{j=1}^{J}∑_{i=1}^{n_j}α_{i,j}Y_{i,j}\frac{P_{g^{*}}(A^*_j|W_{i,j}, E_j)} {P_{g^{*}}(A_j | W_{i,j}, E_j)}
The TMLE estimator is based on the updated model prediction \bar{Q}^*(A, W, E) and is defined by the G-formula. The following algorithm  
shows a general template of the community-level TMLE:
 The first step is exactly the same as IPTW: construct two density estimators and use the ratio of them as the weights 
P_{g^*}(A^*|W,E)/P_{g}(A|W,E) in the targeting step. 
Construct an initial estimator \hat{\bar{Q}}^c(A | W, E) of the common (in j) conditional distribution of Y^c given (A, W, E) and update \hat{\bar{Q}}^{c*}(A | W, E) for \hat{\bar{Q}}^c(A | W, E) by weights calculated in the first step.
The TMLE estimator is defined as the following substitution estimator:
ψ^{I}_{TMLE, n}=\frac{1}{J}∑_{j=1}^{J}\int_{a}\hat{\bar{Q}}^{c*}(a, \textbf{W}_j, E_j)g^{c*}(a|\textbf{W}_j, E_j)dμ(a)
For individual-level TMLE, its estimator is obtained as:
ψ^{II}_{TMLE,n}=\frac{1}{J}∑_{j=1}^{J}∑_{i=1}^{n_j}α_{i,j}\int_{a}\hat{\bar{Q}}^*(a, W_{i,j}, E_j)g^*(a|W_{i,j}, E_j)dμ(a)
The GCOMP estimator is similar to the the TMLE estimator except it uses the untargeted (initial) model \hat{\bar{Q}}^c(A|W,E) instead of its targeted version \hat{\bar{Q}}^{c*}(A | W, E), for example the community-level GCOMP estimator is given by:
ψ^{I}_{GCOMP,n}=\frac{1}{J}∑_{j=1}^{J}\int_{a}\hat{\bar{Q}}^{c}(a, \textbf{W}_j, E_j)g^{c*}(a|\textbf{W}_j, E_j)dμ(a)
For simplicity (and without loss of generality), we now suppose that there is no hierarchical structure in data and are interested in finding 
an non-parametric estimator of the common (in i) individual-level exposure mechanism g_0(A|W), or the commom multivariate 
joint conditional probability model P_{g_0}(A|W), where the exposures and baseline covariates ((A,W)=(A_i, W_i: i=1,...,n)) denote 
the random variables drawn jointly from distribution H_0(A, W) with denisty h_0(a, w) \equiv g_0(a|w)q_{W,0}(w) and q_{W,0}(W)
denotes the marginal density of the baseline covariates W specified by the regression formula hform.g0
(Notice that an non-parametric estimator of the model P_{g^*_0}(A|W)) is similar, except that now the exposures and baseline covariates 
((A^*,W)=(A^*_i, W_i: i=1,...,n)) are randomly drawn from H^*_0(A, W) with density h^*_0(a, w) \equiv g^*_0(a|w)q_{W,0}(w), 
where A^* is determined by the user-supplied (stochastic) intervention f_gstar1 or f_gstar2 and q_{W,0}(W) denotes
the marginal density of the baseline covariates W specified by the regression formula hform.gstar. Thus, the fitting algorithm
for P_{g^*_0}(A|W)) is equivalent for P_{g_0}(A|W)).
Note that A can be multivariate (i.e., (A(1), ..., A(M))) and each of its components A(m) can be either binary, categorical or continuous. The joint probability model for P(A|W)=P(A(1),...,A(M)|W) can be factorized as a sequence P(A(1)|W) \times P(A(2)|W,A(1)) \times ... \times P(A(M)|W, A(1),...,A(M-1)), where each of these conditional probability models P(A(m)|W,A(1),...,A(m-1)) is fit separately, depending on the type of the m-specific outcome variable A(m). For binary A(m), the conditional probability P(A(m)|W,A(1),...,A(m-1)) will be esimtated by a user-specific library of candidate algorithms, including parametric estimators such as logistic model with only main terms, and data-adaptive estimator such as super-learner algorithms. For continuous (or categorical) A(m), consider a sequence of values δ_1, δ_2,...,δ_{K+1} that span the range of A and define K bins and the corresponding K bin indicators (B_1,...,B_K), in which case each bin indicator B_k \equiv [δ_k, δ_{k+1}) is used as an binary outcome in a seperate user-specific library of candidate algorithms, with predictors given by (W, A(1),..., A(m-1)). That is how the joint probability P(A|W) is factorized into such an entire tree of binary regression models.
For simplicity (and without loss of generality), we now suppose A is univariate (i.e., M=1) and continuous and a general template of an fitting algorithm for P_{g_0}(A|W) is summrized below:
Consider the usual setting in which we observe n independently and identically distributed copies o_i=(w_i, a_i, y_i: i=1,..,n) of the random variable O=(W, A, Y), where the observed (a_i: i=1...,n) are continuous.
 As described above, consider a sequence of K+1 values that span the support of A values into K bin intervals Δ 
= (δ_1, δ_2,...,δ_{K+1}) so that any observed data point a_i belongs to one interval within R, in other words, 
for each possible value a \in A (even if it is not in the observed (a_i:i)), there always exists a k \in {1, ...,K} such  
that δ_{k}≤q a<δ_{k+1}, and the length (bandwidth) of the interval can be defined as bw_{k}=δ_{k+1}-δ_{k}. 
Then let the mapping S(a)\in \{1,2,..,K\} denote a unique index of the indicator in Λ that a falls in, where S(a)=k
if a\in [δ_{k},δ_{k+1}), namely, δ_{S(a)} ≤q a < δ_{S(a)+1}. Moreover, we use b_k to denote a binary 
indicator of whether the observed a belongs to bin k (i.e., b_k\equiv I(S(a)=k) for all k≤q S(a); b_k\equiv
NA for all k>S(a)). This is similar to methods for censored longitudinal data, which code observations as NA (censored or 
missing) once the indicator b_k jumps from 0 to 1. Since a is a realization of the random variable A for one individual, 
the corresponding random binary indicators of whether A belongs to bin k can be denoted by B_k:k=1,..,=K where 
B_k \equiv I(S(A)=k) for all k≤q S(A); B_k\equivNA for all k>S(A).
Then for each k = 1,...,K, a binary nonparametric regression is used to estimate the conditional probability P(B_k=1|B_{k-1}=0,W), which corresponds to the probability of B_k jumping from 0 to 1, given B_{k-1}=0 and tbe baseline covariates W. Note tha for each k, the corresponding nonparametric regression model is fit only among observations that are uncensored (i.e., still at risk of getting B_{k}=1 with B_{k-1}=0). Note the above conditional probability P(B_k=1|B_{k-1}=0,W) is equivalent to P(A\in [δ_{k}, δ_{k+1}) | A≥q δ_{k}, W), which is the probability of A belongs to the interval [δ_{k},δ_{k+1}), conditional on A does not belong to any intervals before [δ_{k}, δ_{k+1}) and W. Then the discrete conditional hazard function for each k is defined as a normalization of the conditional probability using the corresponding interval bandwidth bw_{k}: λ_k(A,W)=\frac{P(B_k=1|B_{k-1}=0,W)}{bw_k}=\frac{P(A\in[δ_{k},δ_{k+1})|A≥q δ_{k+1},W)}{bw_k}
 Finally, for any given observation (a,w), we first find out the interval index k to which a belongs (i.e., 
k=S(a)\in{1,...,K}). Then the discretized conditional density of P(A=a|W=w) can be evaluated by 
λ_k(A, W){\times}[∏_{j=1}^{k-1}(1-λ_j(A, W))], which corresponds to the conditional probability of a 
belongs to the interval [δ_{k},δ_{k+1}) and does not belong to any intervals before, given W.
Note that the choice of the values δ_k (k=1,..,K) implies defining the number and positions of the bins. First, a cross- 
validation selector can be applied to data-adaptively select the candidate number of bins, which minimizes variance and maximizes precision 
(Do not recommend too many bins due to easily violating the positivity assumption). Then, we need to choose the most convenient locations 
(cuttoffs) for the bins (for fixed K). There are 3 alternative methods that use the histogram as a graphical descriptive tool to define 
the bin cutoffs Δ=(δ_1,...,δ_K,δ_{K+1}) for a continuous variable A. In tmleCommunity package, the 
choice of methods bin.method together with the other discretization arguments in function tmleCom_Options() such as nbins  
(total number of bins) and maxNperBin (the maximum number of observations in each bin), can be used to define the values of bin  
cutoffs. See help(tmleCom_Options) for more argument details. 
equal.mass: The default discretization method, equal mass (aka equal area) interval method, set by passing an argument
bin.method="equal.mass" to tmleCom_Options() prior to calling the main function tmleCommunity(). The interval are 
defined by spanning the support of A into non-equal length of bins, each containing (approximately) the same number of observations. 
It is data-adaptive since it tends to be wide where the population density is small, and narrow where the density is large. If nbins 
is NA (or is smaller than n/maxNperBin), nbins will be (re)set to the interger value of n/maxNperBin where 
n is the total number of observations in A, and the default setting of maxNperBin is 500 observations per interval.
This method could identify spikes in the density, but oversmooths in the tails and so could not discover outliers. 
equal.len: equal length interval method, set by passing an argument bin.method="equal.len" to 
tmleCom_Options() prior to calling tmleCommunity(). The intervals are defined by spanning the support of A into 
nbins number of equal length of bins. This method describes the tails of the density and identifies outliers well, but  
oversmooths in regions of high density and so is poor at identifying sharp peaks.
dhist: (named for diagonally cut histogram) Combination of equal-area equal length and equal mass method, set by passing an 
argument bin.method="dhist" to tmleCom_Options() prior to calling tmleCommunity(). For consistency, We choose the 
slope a = 5\times \textrm{IQR}(A) as suggested by Denby and Mallows ("Variations on the Histogram" (2009)). For more details,   
please also see this paper.
tmleCommunity-package for the general description of the package,
tmleCom_Options for additional parameters that control the estimation in tmleCommunity,
DatKeepClass for details about storing, managing, subsetting and manipulating the input data,
indSample.iid.cA.cY_list for an example of a continuous exposure and its estimation,
BinaryOutModel, RegressionClass, GenericModel, MonteCarloSimClass
| 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 | ## Not run: 
#***************************************************************************************
# Example 1: Hierarchical example, with one binary A and bianry Y 
# True ATE of the community-based treatment is approximately 0.103716
data(comSample.wmT.bA.bY_list)  # load the sample data 
comSample.wmT.bA.bY <- comSample.wmT.bA.bY_list$comSample.wmT.bA.bY
N <- NROW(comSample.wmT.bA.bY)
Qform.corr <- "Y ~ E1 + E2 + W2 + W3 + A" # correct Q form
gform.corr <- "A ~ E1 + E2 + W1"  # correct g
#***************************************************************************************
#***************************************************************************************
# 1.1 Estimating the additive treatment effect (ATE) for two deterministic interventions
# (f_gstar1 = 1 vs f_gstar2 = 0) via community-level / individual-level analysis.
# speed.glm using correctly specified Qform, hform.g0 and hform.gstar;
#***************************************************************************************
# Setting global options that may be used in tmleCommunity(), e.g., using speed.glm
tmleCom_Options(Qestimator = "speedglm__glm", gestimator = "speedglm__glm", maxNperBin = N)
# Community-level analysis without a pooled individual-level regression on outcome
tmleCom_wmT.bA.bY.1a_sglm <- 
  tmleCommunity(data = comSample.wmT.bA.bY, Ynode = "Y", Anodes = "A", 
                WEnodes = c("E1", "E2", "W1", "W2", "W3"), f_gstar1 = 1L, f_gstar2 = 0L,
                community.step = "community_level", communityID = "id", pooled.Q = FALSE, 
                Qform = Qform.corr, hform.g0 = gform.corr, hform.gstar = gform.corr)
# Examples of estimates under f_gstar1 = 1:
tmleCom_wmT.bA.bY.1a_sglm$EY_gstar1$estimates
tmleCom_wmT.bA.bY.1a_sglm$EY_gstar1$vars
tmleCom_wmT.bA.bY.1a_sglm$EY_gstar1$CIs
# Examples of estimates under f_gstar0 = 0:
tmleCom_wmT.bA.bY.1a_sglm$EY_gstar2$estimates
tmleCom_wmT.bA.bY.1a_sglm$EY_gstar2$vars
tmleCom_wmT.bA.bY.1a_sglm$EY_gstar2$CIs
# Examples of estimates for ATE under f_gstar1 - f_gstar0:
tmleCom_wmT.bA.bY.1a_sglm$ATE$estimates
tmleCom_wmT.bA.bY.1a_sglm$ATE$vars
tmleCom_wmT.bA.bY.1a_sglm$ATE$CIs
head(tmleCom_wmT.bA.bY.1a_sglm$ATE$IC)
# Community-level analysis with a pooled individual-level regression on outcome
tmleCom_wmT.bA.bY.1b_sglm <- 
  tmleCommunity(data = comSample.wmT.bA.bY, Ynode = "Y", Anodes = "A", 
                WEnodes = c("E1", "E2", "W1", "W2", "W3"), f_gstar1 = 1L, f_gstar2 = 0L,
                community.step = "community_level", communityID = "id", pooled.Q = TRUE, 
                Qform = Qform.corr, hform.g0 = gform.corr, hform.gstar = gform.corr)
tmleCom_wmT.bA.bY.1b_sglm$ATE$estimates
# Individual-level analysis with both individual-level outcome and treatment mechanisms
tmleCom_wmT.bA.bY.2_sglm <- 
  tmleCommunity(data = comSample.wmT.bA.bY, Ynode = "Y", Anodes = "A", 
                WEnodes = c("E1", "E2", "W1", "W2", "W3"), f_gstar1 = 1L, f_gstar2 = 0L,
                community.step = "individual_level", communityID = "id", 
                Qform = Qform.corr, hform.g0 = gform.corr, hform.gstar = gform.corr)
tmleCom_wmT.bA.bY.2_sglm$ATE$estimates
# Failing to provide communityID will automatically set community.step to "NoCommunity"
tmleCom_wmT.bA.bY.NoC_sglm <- 
  tmleCommunity(data = comSample.wmT.bA.bY, Ynode = "Y", Anodes = "A", 
                WEnodes = c("E1", "E2", "W1", "W2", "W3"), f_gstar1 = 1L, f_gstar2 = 0L,
                community.step = "individual_level", communityID = NULL, 
                Qform = Qform.corr, hform.g0 = gform.corr, hform.gstar = gform.corr)
tmleCom_wmT.bA.bY.NoC_sglm$ATE$estimates
# Stratification analysis that run separate outcome (exposure) mechanism for each community
# use glm since only around 50 observations per community, speed.glm easily fails
# takes longer time than the tests above since doing 1000 TMLEs (one TMLE per community)
# so set verbose to TRUE to track running progress
tmleCom_Options(Qestimator = "glm__glm", gestimator = "glm__glm", maxNperBin = N)
tmleCom_wmT.bA.bY.str_sglm <- 
  tmleCommunity(data = comSample.wmT.bA.bY, Ynode = "Y", Anodes = "A", 
                WEnodes = c("E1", "E2", "W1", "W2", "W3"), f_gstar1 = 1L, f_gstar2 = 0L,
                community.step = "perCommunity", communityID = "id", verbose = TRUE,
                Qform = Qform.corr, hform.g0 = gform.corr, hform.gstar = gform.corr)
tmleCom_wmT.bA.bY.str_sglm$ATE$estimates
#***************************************************************************************
# 1.2 Same as above but for different Qestimator and gestimator through tmleCom_Options()
# via community-level analysis with a pooled individual-level regression on outcome.
# (See more details in examples in tmleCom_Options())
#***************************************************************************************
# SuperLearner for both outcome and treatment (clever covariate) regressions
# using all parent nodes (of Y and A) as regressors (respectively)
require("SuperLearner")
tmleCom_Options(Qestimator = "SuperLearner", gestimator = "SuperLearner", 
                maxNperBin = N, SL.library = c("SL.glm", "SL.step", "SL.bayesglm"))
tmleCom_wmT.bA.bY.2_SL <- 
  tmleCommunity(data = comSample.wmT.bA.bY, Ynode = "Y", Anodes = "A", 
                WEnodes = c("E1", "E2", "W1", "W2", "W3"), f_gstar1 = 1L, f_gstar2 = 0L,
                community.step = "community_level", communityID = "id", pooled.Q = TRUE, 
                Qform = NULL, hform.g0 = NULL, hform.gstar = NULL)
tmleCom_wmT.bA.bY.2_SL$ATE$estimates
# SuperLearner for outcome regressions and glm for treatment regressions
# using all regressors in the correctly specified Qform and 
# all regressors in the misspecified hform.g0 and hform.gstar
tmleCom_Options(Qestimator = "SuperLearner", gestimator = "glm__glm", 
                maxNperBin = N, SL.library = c("SL.mean", "SL.stepAIC", "SL.bayesglm"))
tmleCom_wmT.bA.bY.2_SL.glm <- 
  tmleCommunity(data = comSample.wmT.bA.bY, Ynode = "Y", Anodes = "A", 
                WEnodes = c("E1", "E2", "W1", "W2", "W3"), f_gstar1 = 1L, f_gstar2 = 0L,
                community.step = "community_level", communityID = "id", pooled.Q = TRUE,
                Qform = NULL, hform.g0 = "A ~ W1", hform.gstar = "A ~ E1 + W2")
tmleCom_wmT.bA.bY.2_SL.glm$ATE$estimates
  
# Speedglm for outcome regressions and sl3 for treatment regressions
# using all regressors in the correctly specified Qform and 
# all regressors in the misspecified hform.g0 and hform.gstar  
tmleCom_Options(Qestimator = "speedglm__glm", gestimator = "sl3_pipelines", maxNperBin = N, 
                sl3_learners = list(glm_fast = sl3::make_learner(sl3::Lrnr_glm_fast), 
                                    xgb = sl3::make_learner(sl3::Lrnr_xgboost),), 
                sl3_metalearner = sl3::make_learner(
                  sl3::Lrnr_optim, loss_function = sl3::loss_loglik_binomial,
                  learner_function = sl3::metalearner_logistic_binomial))
tmleCom_wmT.bA.bY.2_speedglm.sl3 <- 
  tmleCommunity(data = comSample.wmT.bA.bY, Ynode = "Y", Anodes = "A", 
                WEnodes = c("E1", "E2", "W1", "W2", "W3"), f_gstar1 = 1L, f_gstar2 = 0L,
                community.step = "community_level", communityID = "id", pooled.Q = TRUE,
                Qform = NULL, hform.g0 = "A ~ W1", hform.gstar = "A ~ E1 + W2")
tmleCom_wmT.bA.bY.2_speedglm.sl3$ATE$estimates  
#***************************************************************************************
# 1.3 Evaluating mean population outcome under static intervention A = 0
# with different community-level and individual-level weight choices 
#***************************************************************************************
tmleCom_Options(Qestimator = "speedglm__glm", gestimator = "speedglm__glm", maxNperBin = N)
# weigh individuals in data equally & weigh community by its number of individuals
tmleCom_wmT.bA.bY.1a_w1 <- 
  tmleCommunity(data = comSample.wmT.bA.bY, Ynode = "Y", Anodes = "A", 
                WEnodes = c("E1", "E2", "W1", "W2", "W3"), f_gstar1 = 1L, 
                obs.wts = "equal.within.pop", community.wts = "size.community", 
                community.step = "community_level", communityID = "id")
tmleCom_wmT.bA.bY.1a_w1$EY_gstar1$estimates
# same as above but weigh individuals within the same community equally
tmleCom_wmT.bA.bY.1a_w2 <- 
  tmleCommunity(data = comSample.wmT.bA.bY, Ynode = "Y", Anodes = "A", 
                WEnodes = c("E1", "E2", "W1", "W2", "W3"), f_gstar1 = 1L, 
                obs.wts = "equal.within.community", community.wts = "size.community", 
                community.step = "community_level", communityID = "id")
tmleCom_wmT.bA.bY.1a_w2$EY_gstar1$estimates
# weigh individuals within the same community equally & weigh community equally
tmleCom_wmT.bA.bY.1a_w3 <- 
  tmleCommunity(data = comSample.wmT.bA.bY, Ynode = "Y", Anodes = "A", 
                WEnodes = c("E1", "E2", "W1", "W2", "W3"), f_gstar1 = 1L, 
                obs.wts = "equal.within.community", community.wts = "equal.community", 
                community.step = "community_level", communityID = "id")
tmleCom_wmT.bA.bY.1a_w3$EY_gstar1$estimates
#***************************************************************************************
# 1.4 Specifying user-supplied stochastic or deterministic intervention function
#***************************************************************************************
# Intervention function that will sample A with probability P(A=1) = prob.val
define_f.gstar <- function(prob.val, rndseed = NULL) {
  eval(prob.val) 
  f.gstar <- function(data, ...) {
    print(paste0("probability of selection: ", prob.val))
    rbinom(n = NROW(data), size = 1, prob = prob.val)
  }
  return(f.gstar)
}
# Stochastically set 50% of the population to A=1 
f.gstar_stoch.0.5 <- define_f.gstar(prob.val = 0.5)
# Deterministically set 100% of the population to A=1 
f.gstar_determ.1 <- define_f.gstar(prob.val = 1)
# Deterministically set 100% of the population to A=0
f.gstar_determ.0 <- define_f.gstar(prob.val = 0)
#***************************************************************************************
# 1.5 Equivalent ways of specifying user-supplied (static) intervention (f_gstar1 = 1)
#***************************************************************************************
# Alternative 1: via intervention functoin that sets every invidual's A to constant 1
tmleCom_wmT.bA.bY.1a_fgtar1 <- 
  tmleCommunity(data = comSample.wmT.bA.bY, Ynode = "Y", Anodes = "A", 
                WEnodes = c("E1", "E2", "W1", "W2", "W3"), f_gstar1 = f.gstar_determ.1, 
                community.step = "community_level", communityID = "id")
tmleCom_wmT.bA.bY.1a_fgtar1$EY_gstar1$estimates
# Alternative 2: by simply setting f_gstar1 to 1
tmleCom_wmT.bA.bY.1a_fgtar2 <- 
  tmleCommunity(data = comSample.wmT.bA.bY, Ynode = "Y", Anodes = "A", 
                WEnodes = c("E1", "E2", "W1", "W2", "W3"), f_gstar1 = 1L,
                community.step = "community_level", communityID = "id")
tmleCom_wmT.bA.bY.1a_fgtar2$EY_gstar1$estimates
# Alternative 3: by setting f_gstar1 to a vector of 1's of length NROW(data)
tmleCom_wmT.bA.bY.1a_fgtar3 <- 
  tmleCommunity(data = comSample.wmT.bA.bY, Ynode = "Y", Anodes = "A", 
                WEnodes = c("E1", "E2", "W1", "W2", "W3"), 
                f_gstar1 = rep(1L, NROW(comSample.wmT.bA.bY)), 
                community.step = "community_level", communityID = "id")
tmleCom_wmT.bA.bY.1a_fgtar1$EY_gstar1$estimates
#***************************************************************************************
# 1.6 Running exactly the same estimator as 1.1 but using h_gstar/h_gN as a coviariate 
# in the targeting step (default to use weighted intercept-based TMLE)
#***************************************************************************************
# unweighted covariate-based TMLE
tmleCom_wmT.bA.bY.1a_covTMlE <- 
  tmleCommunity(data = comSample.wmT.bA.bY, Ynode = "Y", Anodes = "A", 
                WEnodes = c("E1", "E2", "W1", "W2", "W3"), f_gstar1 = 1L, f_gstar2 = 0L,
                community.step = "community_level", communityID = "id", pooled.Q = FALSE, 
                TMLE.targetStep = "tmle.covariate",  # default as "tmle.intercept"
                Qform = Qform.corr, hform.g0 = gform.corr, hform.gstar = gform.corr)
tmleCom_wmT.bA.bY.1a_covTMlE$ATE$estimates
#***************************************************************************************
# 1.7 Equivalent ways of specifying the regression formulae 
# (if Ynode is specified as "Y" and WEnodes = c("E1", "E2", "W1", "W2", "W3"))
#***************************************************************************************
# For outcome regression, the left side of Qform will be ignored if Ynode is specified,
# with dependent variable being set to Ynode.
Qform1 <- "Y ~ E1 + E2 + W2 + W3 + A" 
Qform2 <- "AnythingIsFine ~ E1 + E2 + W2 + W3 + A" 
Qform3 <- NULL  # since all parent nodes of Y will be used as regressors
# For treatment regressions, if hform.gstar unspecified, it uses the same regression
# formula as hform.g0 does. 
# Alternative 1: specify hform.g0 and hform.gstar respectively
hform.g0 <- "A ~ E1 + E2 + W1"
hform.gstar <- "A ~ E1 + E2 + W1"
# Alternative 2: specify hform.g0 only
hform.g0 <- "A ~ E1 + E2 + W1"
hform.gstar <- NULL
#***************************************************************************************
# 1.8 Equivalent ways of allowing printing status message
#***************************************************************************************
# Controlling the global setting 
options(tmleCommunity.verbose = TRUE)
tmleCom_wmT.bA.bY.print1 <- 
  tmleCommunity(data = comSample.wmT.bA.bY, Ynode = "Y", Anodes = "A", 
                WEnodes = c("E1", "E2", "W1", "W2", "W3"), f_gstar1 = 1L, f_gstar2 = 0L,
                community.step = "community_level", communityID = "id")
# Alternative: using the verbose argument in tmleCommunity()
tmleCom_wmT.bA.bY.print2 <- 
  tmleCommunity(data = comSample.wmT.bA.bY, Ynode = "Y", Anodes = "A", 
                WEnodes = c("E1", "E2", "W1", "W2", "W3"), f_gstar1 = 1L, f_gstar2 = 0L,
                community.step = "community_level", communityID = "id", verbose = TRUE)
#***************************************************************************************
# Example 2: Non-hierarchical example, with one continuous A and continuous Y 
# True mean population outcome under stochastic intervention (specified below)
# is approximately 3.50856
data(indSample.iid.cA.cY_list)  # load the sample data 
indSample.iid.cA.cY <- indSample.iid.cA.cY_list$indSample.iid.cA.cY
true.shift <- indSample.iid.cA.cY_list$shift.val  # 2
true.truncBD <- indSample.iid.cA.cY_list$truncBD  # 10
N <- NROW(indSample.iid.cA.cY)
Qform.corr <- "Y ~ W1 + W2 + W3 + W4 + A" # correct Q form
gform.corr <- "A ~ W1 + W2 + W3 + W4"  # correct g
#***************************************************************************************
#***************************************************************************************
# 2.1 Specifying stochastic intervention function that could represent the true 
# shifted version of the current treatment mechanism
#***************************************************************************************
define_f.gstar <- function(shift.val, truncBD, rndseed = NULL) {
  shift.const <- shift.val
  trunc.const <- truncBD
  f.gstar <- function(data, ...) {
    print(paste0("shift.const: ", shift.const))
    set.seed(rndseed)
    A.mu <- 0.86 * data[,"W1"] + 0.41 * data[,"W2"] - 0.34 * data[,"W3"] + 0.93 * data[,"W4"]
    untrunc.A <- rnorm(n = nrow(data), mean = A.mu + shift.const, sd = 1)
    r.new.A <- exp(0.8 * shift.const * (untrunc.A - A.mu - shift.const / 3))
    trunc.A <- ifelse(r.new.A > trunc.const, untrunc.A - shift.const, untrunc.A)
    return(trunc.A)
  }
  return(f.gstar)
}
# correctly specified stochastic intervention with true shift value and truncated bound
f.gstar.corr <- define_f.gstar(shift = true.shift, truncBD = true.truncBD)
# Misspecified specified stochastic intervention 
f.gstar.mis <- define_f.gstar(shift = 5, truncBD = 8)
#***************************************************************************************
# 2.2 Estimating mean population outcome under different stochastic interventions
# speed.glm using correctly specified Qform, hform.g0 and hform.gstar
#***************************************************************************************
tmleCom_Options(Qestimator = "speedglm__glm", gestimator = "speedglm__glm", maxNperBin = N)
# correctly specified stochastic intervention 
tmleind_iid.cA.cY_true.fgstar <- 
  tmleCommunity(data = indSample.iid.cA.cY, Ynode = "Y", Anodes = "A", 
                WEnodes = c("W1", "W2", "W3", "W4"), f_gstar1 = f.gstar.corr,
                Qform = Qform.corr, hform.g0 = gform.corr, hform.gstar = gform.corr)
tmleind_iid.cA.cY_true.fgstar$EY_gstar1$estimates
# misspecified stochastic intervention
tmleind_iid.cA.cY_mis.fgstar <- 
  tmleCommunity(data = indSample.iid.cA.cY, Ynode = "Y", Anodes = "A", 
                WEnodes = c("W1", "W2", "W3", "W4"), f_gstar1 = f.gstar.mis,
                Qform = Qform.corr, hform.g0 = gform.corr, hform.gstar = gform.corr)
tmleind_iid.cA.cY_mis.fgstar$EY_gstar1$estimates
#***************************************************************************************
# 2.3 Same as above but using larger number of Monte-Carlo simulations
# using all parent nodes (of Y and A) as regressors (respectively)
#***************************************************************************************
# A will be sampled 10 times (for a total sample size of NROW(data)*10 under f_gstar1)
tmleind_iid.cA.cY_10MC <- 
  tmleCommunity(data = indSample.iid.cA.cY, Ynode = "Y", Anodes = "A", 
                WEnodes = c("W1", "W2", "W3", "W4"), f_gstar1 = f.gstar.corr, n_MCsims = 10)
tmleind_iid.cA.cY_10MC$EY_gstar1$estimates
#***************************************************************************************
# 2.4 Running exactly the same estimator as 2.1 but defining different values of bin cutoffs 
#***************************************************************************************
# using equal-length method with 10 bins 
tmleCom_Options(bin.method = "equal.len", nbins = 10, maxNperBin = N)
tmleind_iid.cA.cY_len <- 
  tmleCommunity(data = indSample.iid.cA.cY, Ynode = "Y", Anodes = "A", 
                WEnodes = c("W1", "W2", "W3", "W4"), f_gstar1 = f.gstar.corr,
                Qform = Qform.corr, hform.g0 = gform.corr, hform.gstar = gform.corr)
tmleind_iid.cA.cY_len$EY_gstar1$estimates
# using combination of equal-length and equal-mass method with 20 bins 
tmleCom_Options(bin.method = "dhist", nbins = 20, maxNperBin = N)
tmleind_iid.cA.cY_dhist <- 
  tmleCommunity(data = indSample.iid.cA.cY, Ynode = "Y", Anodes = "A", 
                WEnodes = c("W1", "W2", "W3", "W4"), f_gstar1 = f.gstar.corr,
                Qform = Qform.corr, hform.g0 = gform.corr, hform.gstar = gform.corr)
tmleind_iid.cA.cY_dhist$EY_gstar1$estimates
#***************************************************************************************
# 2.5 Estimating the additive treatment effect (ATE) for two stochastic interventions
#***************************************************************************************
# Intervention function that will shift A by constant rate (shift.rate)
# (A special case of stochastic intervention with constant shift)
define_f.gstar <- function(shift.rate) {
  eval(shift.rate) 
  f.gstar <- function(data, ...) {
    print(paste0("rate of shift: ", shift.rate))
    data[, "A"] * shift.rate
  }
  return(f.gstar)
}
f.gstar_shift0.8 <- define_f.gstar(shift.rate = 0.8)
f.gstar_shift0.5 <- define_f.gstar(shift.rate = 0.6)
tmleCom_Options(bin.method = "equal.mass", nbins = 5, maxNperBin = N)
tmleind_iid.cA.cY_ATE <- 
  tmleCommunity(data = indSample.iid.cA.cY, Ynode = "Y", Anodes = "A", 
                WEnodes = c("W1", "W2", "W3", "W4"),
                f_gstar1 = f.gstar_shift0.8, f_gstar2 = f.gstar_shift0.5,
                Qform = Qform.corr, hform.g0 = gform.corr, hform.gstar = gform.corr)
# ATE estimates for f_gstar1-f_gstar2:
tmleind_iid.cA.cY_ATE$ATE$estimates
tmleind_iid.cA.cY_ATE$ATE$vars
tmleind_iid.cA.cY_ATE$ATE$CIs   
#***************************************************************************************
# Example 3: Non-Hierarchical example, with one binary A and one rare bianry Y 
# (Independent case-control)  True ATE is approximately 0.012662
data(indSample.iid.bA.bY.rareJ1_list)
indSample.iid.bA.bY.rareJ1 <- indSample.iid.bA.bY.rareJ1_list$indSample.iid.bA.bY.rareJ1
obs.wt.J1 <- indSample.iid.bA.bY.rareJ1_list$obs.wt.J1
Qform.corr <- "Y ~ W1 + W2*A + W3 + W4" # correct Q form
gform.corr <- "A ~ W1 + W2 + W3 + W4"  # correct g
tmleCom_Options(maxNperBin = NROW(indSample.iid.bA.bY.rareJ1))
#***************************************************************************************
#***************************************************************************************
# 3.1 Estimating ATE for f_gstar1 = 1 vs f_gstar2 = 0
# using correct observation weights and correctly specified Qform & gform 
#***************************************************************************************
tmleind_iid.bA.bY_corrWT <- 
  tmleCommunity(data = indSample.iid.bA.bY.rareJ1, Ynode = "Y", Anodes = "A",
                WEnodes = c("W1", "W2", "W3", "W4"), f_gstar1 = 1, f_gstar2 = 0,
                Qform = Qform.corr, hform.g0 = gform.corr, hform.gstar = gform.corr,
                obs.wts = obs.wt.J1, verbose = TRUE)
tmleind_iid.bA.bY_corrWT$ATE$estimates["tmle", ]  # 0.01220298, good estimate
#***************************************************************************************
# 3.2 Same as above but not specifying the observation weights
# obs.wts = NULL is equivalent to obs.wts = "equal.within.pop"
#***************************************************************************************
tmleind_iid.bA.bY_misWT <- 
  tmleCommunity(data = indSample.iid.bA.bY.rareJ1, Ynode = "Y", Anodes = "A",
                WEnodes = c("W1", "W2", "W3", "W4"), f_gstar1 = 1, f_gstar2 = 0,
                Qform = Qform.corr, hform.g0 = gform.corr, hform.gstar = gform.corr,
                obs.wts = NULL, verbose = TRUE)
tmleind_iid.bA.bY_misWT$ATE$estimates["tmle", ]  # 0.2466575, bad estimate
## End(Not run)
 | 
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.