The following three data generators are used in simulations that investigate the properties of GEM. Each one constructs a specific type of data set for the case of two treatment groups. For details, see E Petkova, T Tarpey, Z Su, and RT Ogden. Generated effect modifiers (GEMs) in randomized clinical trials. Biostatistics, (First published online: July 27, 2016). doi: 10.1093/biostatistics/kxw035.
data_generator1(d, R2, v2, n, co, beta1, inter)
data_generator2(n, co, R2, bet, inter)
data_generator3(n, co, bet, inter)
d
A scalar indicating the effect size of the GEM when the data is generated under a GEM model
R2
A scalar indicating the proportion of explained variance R^2 for the entire data set
v2
A scalar indicating the proportion of explained variance R^2 for the first treatment group
n
A scalar indicating the number of observations in each treatment group, assumed to be the same
co
A p by p positive semidefinite matrix indicating the covariance matrix of the covariates
beta1
A vector of length p giving the regression coefficients for the first treatment group
inter
A vector of length 2 recording the intercepts β_{10}, β_{20} for the two treatment groups respectively
bet
A list with two elements, each a vector of length p, giving the regression coefficients for the two treatment groups respectively
data_generator1
is used to create data where the outcome is a linear function of the covariates
y_j = β_{j0} + Xβ_j + ε, j = 1, 2,
and the coefficients of the covariates are proportional between the two treatment groups: β_2 = b * β_1. This type of data set matches the motivation of the GEM algorithm exactly. β_1 is supplied as an argument of the function, while β_2 = b * β_1 is derived by controlling the R^2 of the whole data set and the effect size. For details, see Kraemer, H. C. (2013). Discovering, comparing, and combining moderators of treatment on outcome after randomized clinical trials: a parametric approach. Statistics in medicine, 32(11), 1964-1973.
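The model above can be simulated by hand in a few lines. The following is a minimal sketch, not the package's implementation: the proportionality constant b and the error standard deviation sigma are hypothetical values fixed by hand, whereas data_generator1 derives them from d, R2 and v2. The helper draw_X and the object names are also illustrative only.

```r
# Minimal sketch of the proportional-coefficient model (not the package code).
# b and sigma are hypothetical, hand-picked values; data_generator1 instead
# derives them from d, R2 and v2.
set.seed(1)
p <- 5                       # number of covariates
n <- 200                     # observations per treatment group
co <- matrix(0.2, p, p)      # covariance matrix of the covariates
diag(co) <- 1
beta1 <- rep(1, p)           # coefficients for the first group
b <- -0.5                    # assumed proportionality constant
beta2 <- b * beta1           # coefficients for the second group
inter <- c(0, 0)             # intercepts beta_{10}, beta_{20}
sigma <- 1                   # assumed standard deviation of the error

# Draw n covariate vectors with covariance co, using the Cholesky factor
# R of co (t(R) %*% R equals co).
draw_X <- function(n, co) matrix(rnorm(n * ncol(co)), n) %*% chol(co)

X1 <- draw_X(n, co)
X2 <- draw_X(n, co)
y1 <- inter[1] + drop(X1 %*% beta1) + rnorm(n, sd = sigma)
y2 <- inter[2] + drop(X2 %*% beta2) + rnorm(n, sd = sigma)

# Same layout as described for dat: group index, outcome, then covariates.
sim <- data.frame(trt = rep(1:2, each = n), y = c(y1, y2), rbind(X1, X2))

# Population R^2 in the first group implied by beta1, co and sigma.
explained1 <- drop(t(beta1) %*% co %*% beta1)
r2_group1 <- explained1 / (explained1 + sigma^2)
```

With these hand-picked values the explained variance in the first group is 9 and the implied R^2 is 0.9; other choices of b and sigma change the R^2 of each group and of the pooled data.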
data_generator2
is similar to the first one except that the coefficients of the covariates are not necessarily proportional. Hence both coefficient vectors β_1 and β_2
must be specified as arguments of the function (through bet).
data_generator3
constructs a data set where the outcome under each treatment condition is given for all subjects. In addition, no error is added to the mean outcome.
This generator is useful for obtaining the "true" value of a treatment decision. Its model is similar to that of data_generator2, but without the error term:
y_j = β_{j0} + Xβ_j, j = 1,2.
The outputs of these functions differ:
For the function data_generator1
dat
A data frame with first and second column as treatment group index and outcome respectively,
and each of the remaining columns as a covariate.
bet
A list with two elements, each a vector of length p, giving the regression coefficients for the two treatment groups respectively
error_12
A vector of length three giving, in order, the standard deviation of ε and the variance explained by the linear part in the first
and in the second treatment group.
For the function data_generator2
dat
A data frame with first and second column as treatment group index and outcome respectively,
and each of the remaining columns as a covariate.
bet
A list with two elements, each a vector of length p, giving the regression coefficients for the two treatment groups respectively
error
A scalar representing the standard deviation of ε
For the function data_generator3
y0
Outcome vector under the first treatment assignment
y1
Outcome vector under the second treatment assignment
X
Design matrix for the covariates
oracle
Average of the outcome if each subject takes the optimal treatment assignment
invOracle
Average of the outcome if each subject does not take the optimal treatment assignment
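The quantities returned by data_generator3 can be illustrated with a small hand-rolled version. This is a sketch under stated assumptions, not the package's code: it assumes that a larger outcome is better, so the optimal assignment for a subject is the treatment with the larger error-free outcome, and the coefficient values are made up for illustration.

```r
# Sketch of error-free potential outcomes and the oracle values
# (not the package code; assumes larger outcomes are better).
set.seed(2)
p <- 3
n <- 1000
co <- diag(p)                               # covariance of the covariates
bet <- list(c(1, 0, -1), c(-1, 0.5, 1))     # hypothetical coefficients
inter <- c(0, 0)                            # intercepts for the two groups

X <- matrix(rnorm(n * p), n) %*% chol(co)   # covariates for all subjects

# Outcome of every subject under each treatment, with no error term added.
y0 <- inter[1] + drop(X %*% bet[[1]])
y1 <- inter[2] + drop(X %*% bet[[2]])

# Mean outcome when every subject gets the better / the worse treatment.
oracle <- mean(pmax(y0, y1))
invOracle <- mean(pmin(y0, y1))
```

Because both outcomes are known for every subject, the mean outcome of any treatment rule lies between invOracle and oracle, which is what makes this kind of data set a benchmark for a treatment decision.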
# construct the covariance matrix
co <- matrix(0.2, 30, 30)
diag(co) <- 1
dataEx <- data_generator1(d = 0.3, R2 = 0.5, v2 = 1, n = 3000,
                          co = co, beta1 = rep(1, 30), inter = c(0, 0))
# check the R squared of the simulated data set
dat <- dataEx[[1]]
summary(lm(V2~factor(trt)*(V3+V4+V5+V6+V7+V8+V9+V10+V11+V12+V13+V14+V15+V16+
  V17+V18+V19+V20+V21+V22+V23+V24+V25+V26+V27+V28+V29+V30+V31+V32), data = dat))
# generate error-free outcomes under both treatments, reusing the coefficients
bigData <- data_generator3(n = 10000, co = co, bet = dataEx[[2]], inter = c(0, 0))