comSample.wmT.bA.bY_list: An Example of a Hierarchical Data Containing a Cluster-Based...


Description

Simulated hierarchical dataset containing 1000 independent communities. Each community j contains a non-fixed number n_j of individuals, where n_j is drawn from a normal distribution with mean 50 and standard deviation 10 and rounded to the nearest integer. Each row (observation) includes 2 measured community-level baseline covariates (E1, E2), 3 dependent individual-level baseline covariates (W1, W2, W3), 1 dependent binary exposure (A) and 1 dependent binary outcome (Y), along with one unique community identifier (id). The community-level baseline covariates (E1, E2) were sampled i.i.d. across all communities, while the individual-level baseline covariates (W1, W2, W3) for each individual i within community j were generated conditionally on the values of j's community-level baseline covariates (E1[j], E2[j]). Then the community-level exposure (A) for each community j was sampled conditionally on the values of j's community-level baseline covariates (E1[j], E2[j]), together with all individuals' baseline covariates (W1[i], W2[i], W3[i]) within community j, where i = 1, ..., n_j. Similarly, the individual-level binary outcome Y for each individual i within community j was sampled conditionally on the values of community j's baseline covariates and exposure (E1[j], E2[j], A[j]), as well as the values of individual i's baseline covariates (W1[i], W2[i], W3[i]). The following section provides more details regarding individual variables in the simulated data.
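The generating steps described above can be sketched in base R as follows. This is only an illustration of the hierarchical structure, not the package's actual generator: every regression coefficient, the logistic links, the unit variances of W2 and W3, and the use of the community mean of W1 as the individual-level summary driving A are assumptions made for the example. The script listed under Source is the authoritative definition.

```r
set.seed(1)
J <- 1000                                         # number of communities

# Community sizes: normal(50, 10), rounded; pmax() guards against sizes < 1 (assumption)
n_j <- pmax(1L, as.integer(round(rnorm(J, mean = 50, sd = 10))))

# Community-level baseline covariates, i.i.d. across communities
E1 <- runif(J, min = 0, max = 1)
E2 <- sample(c(0, 0.2, 0.4, 0.8, 1), J, replace = TRUE)

id <- rep(seq_len(J), times = n_j)                # community identifier per row
N  <- length(id)

# Individual-level covariates, conditional on (E1[j], E2[j]); coefficients are hypothetical
W1 <- rbinom(N, 1, plogis(-0.4 + 1.2 * E1[id] - 1.3 * E2[id]))
mu <- 0.5 * E1[id] + 0.5 * E2[id]                 # hypothetical shared mean of W2 and W3
z1 <- rnorm(N)
z2 <- rnorm(N)
W2 <- mu + z1
W3 <- mu + 0.6 * z1 + sqrt(1 - 0.6^2) * z2        # gives cor(W2, W3 | E1, E2) = 0.6

# Community-level exposure: one draw per community, then copied to its members
W1bar <- as.numeric(tapply(W1, id, mean))         # community mean of W1
A_j   <- rbinom(J, 1, plogis(-1 + 0.5 * E1 + 0.5 * E2 + 0.8 * W1bar))

# Individual-level outcome, conditional on (E1[j], E2[j], A[j]) and the individual's covariates
Y <- rbinom(N, 1, plogis(-2 + 0.5 * A_j[id] + 0.3 * E1[id] + 0.3 * E2[id] +
                           0.2 * W2 + 0.2 * W3))

sim <- data.frame(id = id, E1 = E1[id], E2 = E2[id],
                  W1 = W1, W2 = W2, W3 = W3, A = A_j[id], Y = Y)
head(sim)
```

Drawing A once per community and indexing it by id is what makes the exposure community-level: every row in a community carries the same value of A, E1 and E2.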

Usage

data(comSample.wmT.bA.bY_list)

Format

A data frame with 1000 independent communities, each containing around 50 individuals, for a total of 50,457 observations (rows) and 8 variables (columns):

id

integer (unique) community identifier from 1 to 1000, identical within the same community

E1

continuous uniform community-level baseline covariate with min=0 and max=1 (independent across communities and identical for all individuals in the same community)

E2

discrete uniform community-level baseline covariate with 5 elements (0, 0.2, 0.4, 0.8, 1) (independent across communities and identical for all individuals in the same community)

W1

binary individual-level baseline covariate that depends on the values of the community-level baseline covariates (E1, E2)

W2

continuous individual-level baseline covariate; W2 and W3 are drawn jointly from a bivariate normal distribution with correlation 0.6 that depends on the values of the community's baseline covariates (E1, E2)

W3

continuous normal individual-level baseline covariate, correlated with W2; see the description of W2 above

A

binary exposure that depends on the community's baseline covariate values (E1, E2) and on the mean of all individuals' baseline covariate W1 within the same community

Y

binary outcome that depends on the community's baseline covariate and exposure values (E1, E2, A) and on the individual's baseline covariate values (W2, W3)
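One way to check the layout described above is to confirm that the community-level columns take a single value within each community, while individual-level columns vary. The helper below, is_constant_within, is a hypothetical name introduced for this sketch and is not part of tmleCommunity. It is demonstrated on a small hand-made data frame so that it runs without the package.

```r
# Hypothetical helper: for each variable in `vars`, TRUE if it takes a single
# value within every community defined by the `id` column
is_constant_within <- function(dat, vars, id = "id") {
  vapply(vars, function(v) {
    all(tapply(dat[[v]], dat[[id]], function(x) length(unique(x)) == 1L))
  }, logical(1))
}

# Toy data with two communities (values are made up for illustration)
toy <- data.frame(
  id = c(1, 1, 1, 2, 2),
  E1 = c(0.3, 0.3, 0.3, 0.7, 0.7),
  E2 = c(0.2, 0.2, 0.2, 1, 1),
  W1 = c(0, 1, 1, 0, 0),
  A  = c(1, 1, 1, 0, 0)
)

is_constant_within(toy, c("E1", "E2", "A", "W1"))
```

On the toy data, E1, E2 and A are constant within each community and W1 is not. The same call could be applied to comSample.wmT.bA.bY once the dataset has been loaded as shown under Examples.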

Source

https://github.com/chizhangucb/tmleCommunity/blob/master/tests/dataGeneration/get.cluster.dat.Abin.R

Examples

data(comSample.wmT.bA.bY_list)
comSample.wmT.bA.bY <- comSample.wmT.bA.bY_list$comSample.wmT.bA.bY
head(comSample.wmT.bA.bY)
comSample.wmT.bA.bY_list$psi0.Y  # 0.103716, True ATE
# summarize the number of individuals within each community
head(table(comSample.wmT.bA.bY$id))  

chizhangucb/tmleCommunity documentation built on May 20, 2019, 3:34 p.m.