Targeted Maximum Likelihood Estimation (TMLE) of the average causal effect of community-based intervention(s) at a single time point on an individual-based outcome of interest (this can be extended to the additive treatment effect). In other words, the package estimates the marginal treatment effect of arbitrary single-time-point interventions on a continuous or binary outcome in data with independent communities, adjusting for both community-level and individual-level baseline covariates. The package also provides an Inverse-Probability-of-Treatment-Weighted (IPTW) estimator and a parametric G-computation formula estimator (GCOMP). Statistical inference (standard errors, t statistics, p-values and confidence intervals) for both TMLE and IPTW is based on the corresponding influence curve. Optional data-adaptive estimation of the exposure and outcome mechanisms using the SuperLearner package and the h2o package (the latter for large datasets) is strongly recommended, especially when the outcome and treatment mechanisms are unknown. The package also supports panel data transformations, such as random effects and fixed effects.
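To make the IPTW and G-computation estimators mentioned above concrete, here is a minimal base-R sketch for the simplest setting: one binary exposure, one covariate and no community structure. It does not use tmleCommunity and is not the package's implementation. The simulated data, the variable names and the logistic working models are all assumptions chosen for illustration, and the standard error treats the fitted exposure mechanism as known, which is a simplification.

```r
set.seed(2)
n <- 2000
W <- rnorm(n)                                # baseline covariate (hypothetical)
A <- rbinom(n, 1, plogis(0.5 * W))           # binary exposure depends on W
Y <- rbinom(n, 1, plogis(-1 + A + 0.7 * W))  # binary outcome depends on A and W

# Exposure mechanism g(1 | W) = P(A = 1 | W), fitted by logistic regression
g_fit <- glm(A ~ W, family = binomial())
g1 <- fitted(g_fit)

# IPTW estimate of E[Y(1)] - E[Y(0)]
D <- A * Y / g1 - (1 - A) * Y / (1 - g1)
iptw <- mean(D)

# Influence-curve-based inference: IC = D - iptw, so Var(iptw) is about var(IC) / n
se_iptw <- sd(D) / sqrt(n)
ci_iptw <- iptw + c(-1, 1) * qnorm(0.975) * se_iptw

# Parametric G-computation: predict the outcome under A = 1 and A = 0, then average
Q_fit <- glm(Y ~ A + W, family = binomial())
Q1 <- predict(Q_fit, newdata = data.frame(A = 1, W = W), type = "response")
Q0 <- predict(Q_fit, newdata = data.frame(A = 0, W = W), type = "response")
gcomp <- mean(Q1 - Q0)
```

Both `iptw` and `gcomp` target the same marginal risk difference. IPTW relies on the exposure model being correct and G-computation on the outcome model being correct, which is one motivation for estimating both mechanisms data-adaptively.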
The input dataset should be made up of rows of community-specific and individual-specific observations. For community j, each
row i includes random variables (W_{i,j}, E_{j}, A_{j}, Y_{i,j}), where E_j represents a vector of community
j's community-level (environmental) baseline covariates (individuals within the same community share the same values of
E_j), W_{i,j} represents a vector of individual i's individual-level baseline covariates, A_j is the
exposure(s) (univariate or multivariate; binary, categorical or continuous) assigned to or naturally occurring in
community j (individuals within the same community receive the same value of A_j) and Y_{i,j} is individual i's
outcome (either binary or continuous). Each individual's baseline covariates W_{i,j} depend on the environmental
baseline covariates E_j of the community j to which i belongs. Similarly, each community's exposure
A_j depends on its community-level baseline covariates E_j and the individual-level baseline covariates of all
individuals belonging to community j (all W_{i,j} such that i belongs to j). In addition, each outcome
Y_{i,j} could be affected by the individual's own baseline community-level and individual-level covariates (E_j, W_{i,j}), the baseline
covariates of other individuals within the same community (W_{s,j}: s\neq i, s\in j), and the community-based
intervention A_j. Note that input data with no hierarchical structure (i.e., no communities and only individuals)
is a special case of hierarchical data in which E_j is simply treated as NULL.
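As an illustration of the data structure described above, the following base-R sketch simulates one row per individual with columns for (W_{i,j}, E_j, A_j, Y_{i,j}). Everything here is hypothetical: the column names, the number of communities, the coefficients and the choice of the community mean of W as the summary through which other individuals' covariates enter. It is only meant to show the dependence pattern, not the format of the package's example datasets.

```r
set.seed(1)
J   <- 50                                   # number of communities (assumed)
n_j <- sample(10:30, J, replace = TRUE)     # community sizes
id  <- rep(seq_len(J), times = n_j)         # community id for each individual row
N   <- sum(n_j)

E <- rnorm(J)                               # community-level covariate E_j
W <- rnorm(N, mean = 0.5 * E[id])           # W_{i,j} depends on E_j
W_bar <- as.numeric(tapply(W, id, mean))    # summary of all W_{i,j} in community j

# A_j depends on E_j and on the covariates of all individuals in community j
A <- rbinom(J, 1, plogis(0.4 * E + 0.3 * W_bar))

# Y_{i,j} depends on A_j, E_j, the individual's own W_{i,j} and the others' covariates
Y <- rbinom(N, 1, plogis(-0.5 + 0.8 * A[id] + 0.3 * E[id] + 0.5 * W + 0.2 * W_bar[id]))

# One row per individual; E and A are repeated within each community
dat <- data.frame(id = id, E = E[id], W = W, A = A[id], Y = Y)
head(dat)
```

Dropping the `E` column and giving every row its own `id` would give the non-hierarchical special case.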
There are currently three approaches that can be used in hierarchical data analysis. The first, community-level TMLE, is developed under a non-parametric causal model that allows for arbitrary interactions between individuals within a community. It estimates the community-level causal effect by aggregating data to the community level and treating the community rather than the individual as the unit of analysis (i.e., both the outcome and treatment mechanisms are modeled at the community level). The second, individual-level TMLE, is developed under a submodel of the causal model used in the first approach, incorporating knowledge of the dependence structure between individuals within communities (i.e., both the outcome and treatment mechanisms are modeled at the individual level). The third, stratified TMLE, fits a separate outcome (and exposure) mechanism for each community and then combines those estimates into a user-specified average (by default, weighted by community size). Note that the stratified TMLE naturally controls for observed community-level covariates and unobserved community-level factors, because each mechanism is fitted within a single community. That is, E does not appear among the regressors for either the outcome or the treatment mechanism.
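Two of the ideas above, aggregating to the community level and combining community-specific quantities with size weights, can be sketched in base R as follows. This is a toy illustration on a hand-made data frame, not the package's code. The community mean outcome stands in for a community-specific estimate, and the column names are assumptions.

```r
# Toy individual-level data: three communities of sizes 3, 2 and 4
dat <- data.frame(
  id = c(1, 1, 1, 2, 2, 3, 3, 3, 3),
  A  = c(1, 1, 1, 0, 0, 1, 1, 1, 1),   # exposure is constant within a community
  Y  = c(1, 0, 1, 0, 0, 1, 1, 0, 1)
)

# Community-level view: one row per community, the community is the unit of analysis
com <- aggregate(cbind(A, Y) ~ id, data = dat, FUN = mean)
com$n <- as.vector(table(dat$id))      # community sizes, in the same id order

# Two ways to combine community-specific quantities into a single summary
equal_w <- mean(com$Y)                       # every community counts equally
size_w  <- weighted.mean(com$Y, w = com$n)   # weighted by community size
```

With size weights the combined summary equals the plain average over individuals (`mean(dat$Y)`). With equal weights small communities count as much as large ones, so the choice of weights changes the target quantity whenever community sizes differ.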
Balzer, L. B., Zheng, W., van der Laan, M. J., Petersen, M. L. and the SEARCH Collaboration (2017). A New Approach to Hierarchical Data Analysis: Targeted Maximum Likelihood Estimation of Cluster-Based Effects Under Interference. ArXiv e-prints. 1706.02675.
Muñoz, I. D. and van der Laan, M. (2012). Population Intervention Causal Effects Based on Stochastic Interventions. Biometrics, 68(2):541-549.
Sofrygin, O. and van der Laan, M. J. (2015). tmlenet: Targeted Maximum Likelihood Estimation for Network Data. R package version 0.1.9. https://github.com/osofr/tmlenet
van der Laan, M. (2014). Causal Inference for a Population of Causally Connected Units. Journal of Causal Inference, 2(1).
van der Laan, M. J. and Gruber, S. (2011). Targeted Minimum Loss Based Estimation of an Intervention Specific Mean Outcome. U.C. Berkeley Division of Biostatistics Working Paper Series. Working Paper 290. http://biostats.bepress.com/ucbbiostat/paper290
van der Laan, M. J. and Rose, S. (2011). Targeted Learning: Causal Inference for Observational and Experimental Data. New York: Springer.
To learn more about the type of data input required by tmleCommunity, see the following example datasets:
comSample.wmT.bA.bY_list
indSample.iid.cA.cY_list
indSample.iid.bA.bY.rareJ1_list
indSample.iid.bA.bY.rareJ2_list
For R code that simulates additional data with different structures, see
https://github.com/chizhangucb/tmleCommunity/tree/master/tests/dataGeneration
Check for updates and report bugs at https://github.com/chizhangucb/tmleCommunity.