A dataset simulated as in Tan (2020), Section 4.
A data matrix with 800 rows and 202 columns.
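As a minimal sketch of how a matrix with this layout might be split into its parts, the example below builds a stand-in matrix of the same shape (800 by 202) rather than loading the real data. The column order (outcome first, treatment second, then 200 covariates) is taken from the `cbind(y, tr, x)` call in the generating code; the name `demo` and its random contents are hypothetical.

```r
# Stand-in matrix with the documented shape (800 x 202); NOT the real data.
set.seed(1)
n <- 800
p <- 200
demo <- cbind(y  = rnorm(n),
              tr = rbinom(n, size = 1, prob = 0.5),
              matrix(rnorm(n * p), n, p))

# Column layout assumed from cbind(y, tr, x): outcome, treatment, covariates.
y  <- demo[, 1]
tr <- demo[, 2]
x  <- demo[, -(1:2)]

dim(x)      # 800 200
table(tr)   # counts of control (0) and treated (1) units
```

The same indexing should apply to the real matrix once it is loaded, provided the column order matches the one assumed here.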
The dataset is generated as follows, where y, tr, and x represent an outcome, a treatment, and covariates, respectively.
library(MASS)
### moments mtk = E[x^k * 1(x > -1)] for x ~ N(0,1), k = 0,...,4; used to standardize z below
mt0 <- 1-pnorm(-1)
mt1 <- dnorm(-1)
mt2 <- -(2*pnorm(-1)-1)/2 - dnorm(-1) +1/2
mt3 <- 3*dnorm(-1)
mt4 <- -3/2*(2*pnorm(-1)-1) - 4*dnorm(-1) +3/2
m.z1 <- mt0 + 2*mt1 + mt2
v.z1 <- mt0 + 4*mt1 + 6*mt2 + 4*mt3 + mt4
v.z1 <- v.z1 + 1 + 2*(mt1 + 2*mt2 + mt3)
sd.z1 <- sqrt(v.z1 -m.z1^2)
###
set.seed(123)
n <- 800
p <- 200
noise <- rnorm(n)
# covariance matrix with entries 2^(-|i1-i2|)
covm <- 2^(-abs(outer(1:p, 1:p, "-")))
x <- mvrnorm(n, mu=rep(0,p), Sigma=covm)
# transformation
z <- x
for (i in 1:4) {
z[,i] <- ifelse(x[,i]>-1,x[,i]+(x[,i]+1)^2,x[,i])
z[,i] <- (z[,i]-m.z1) /sd.z1 # standardized
}
# treatment
eta <- 1+ c( z[,1:4] %*% c(1, .5, .25, .125) )
tr <- rbinom(n, size=1, prob=plogis(eta)) # plogis is the logistic (expit) function
# outcome
eta.y <- c( z[,1:4] %*% c(1, .5, .25, .125) )
y <- eta.y + noise
# save; if using main effects of x, then both the propensity score
# and outcome regression models are misspecified
simu.data <- cbind(y, tr, x)
save(simu.data, file="simu.data.rda")
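The constants `m.z1` and `sd.z1` in the code above appear to be the mean and standard deviation of the transformed variable x + (x+1)^2 1(x > -1) for standard normal x, built from the truncated moments E[x^k 1(x > -1)]. The sketch below checks that reading numerically. The helper names `g` and `expect` are introduced here for illustration only, and the check uses base R's `integrate`, which is not part of the original code.

```r
# Closed-form constants, copied from the generating code.
mt0 <- 1 - pnorm(-1)
mt1 <- dnorm(-1)
mt2 <- -(2*pnorm(-1) - 1)/2 - dnorm(-1) + 1/2
mt3 <- 3*dnorm(-1)
mt4 <- -3/2*(2*pnorm(-1) - 1) - 4*dnorm(-1) + 3/2
m.z1 <- mt0 + 2*mt1 + mt2
v.z1 <- mt0 + 4*mt1 + 6*mt2 + 4*mt3 + mt4
v.z1 <- v.z1 + 1 + 2*(mt1 + 2*mt2 + mt3)
sd.z1 <- sqrt(v.z1 - m.z1^2)

# Transformation applied to each of the first four covariates (hypothetical helper).
g <- function(x) ifelse(x > -1, x + (x + 1)^2, x)

# E[h(x)] for x ~ N(0,1), integrating separately on each side of the kink at -1.
expect <- function(h) {
  f <- function(x) h(x) * dnorm(x)
  integrate(f, -Inf, -1)$value + integrate(f, -1, Inf)$value
}

m.num  <- expect(g)
sd.num <- sqrt(expect(function(x) g(x)^2) - m.num^2)

# Compare closed-form and numerical values side by side.
c(closed = m.z1,  numeric = m.num)
c(closed = sd.z1, numeric = sd.num)
```

If the two columns agree to several decimal places, the standardized columns of `z` have approximately zero mean and unit variance, which is what the `# standardized` comment in the generating code suggests.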
Tan, Z. (2020) Model-assisted inference for treatment effects using regularized calibrated estimation with high-dimensional data, Annals of Statistics, 48, 811–837.