Numerical simulation for treatment effect heterogeneity estimation as described in Tian et al. (2012)

| Argument | Description |
|---|---|
| `n` | number of observations. |
| `p` | number of predictors. |
| `rho` | covariance between predictors. |
| `sigma` | multiplier of error term. |
| `beta.den` | size of main effects relative to interaction effects. See details. |

`sim_pte` simulates data according to the following specification:

*Y = I(∑_{j=1}^p β_{j}X_{j} + ∑_{j=1}^p γ_{j}X_{j}T + σ_{0}ε > 0)*

where *γ = (1/2, -1/2, 1/2, -1/2, 0, ..., 0)*, *β_{j} = (-1)^{j+1} I(3 ≤ j ≤ 10) /* `beta.den`, *(X_{1}, …, X_{p})* follows a mean-zero multivariate normal distribution with a compound symmetric
variance-covariance matrix, *(1-ρ)I_{p} + ρ11^{T}* (with *1* the *p*-vector of ones), *T ∈ {-1, 1}* is the treatment indicator, and *ε* is *N(0,1)*. The arguments `rho` and `sigma` supply *ρ* and *σ_{0}*, respectively.
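
As an illustration of the specification above, the following base-R sketch generates data of the same form. It is not the package source: the function name `sim_sketch`, the Cholesky-based draw of the predictors, and the random 50/50 treatment assignment are assumptions made for the example.

```r
# Hypothetical sketch of the data-generating process; not the uplift source code.
sim_sketch <- function(n, p, rho = 0, sigma = 1, beta.den = 4) {
  stopifnot(p >= 4, rho >= 0, rho < 1)
  j <- seq_len(p)
  # Interaction coefficients: (1/2, -1/2, 1/2, -1/2, 0, ..., 0)
  gamma <- c(0.5, -0.5, 0.5, -0.5, rep(0, p - 4))
  # Main-effect coefficients: (-1)^(j+1) * I(3 <= j <= 10) / beta.den
  beta <- (-1)^(j + 1) * (j >= 3 & j <= 10) / beta.den
  # Compound symmetric covariance: (1 - rho) * I + rho * (matrix of ones)
  Sigma <- (1 - rho) * diag(p) + rho * matrix(1, p, p)
  # chol() returns upper-triangular R with t(R) %*% R == Sigma,
  # so the rows of Z %*% R have covariance Sigma
  X <- matrix(rnorm(n * p), n, p) %*% chol(Sigma)
  colnames(X) <- paste0("X", j)
  # Assumption: treatment assigned at random with equal probability
  treat <- sample(c(-1, 1), n, replace = TRUE)
  lin <- drop(X %*% beta) + drop(X %*% gamma) * treat + sigma * rnorm(n)
  data.frame(y = as.integer(lin > 0), treat = treat, X)
}

set.seed(1)
d <- sim_sketch(n = 200, p = 10, rho = 0.2, sigma = sqrt(2))
str(d[, 1:4])
```

Larger values of `beta.den` shrink the main effects *β* while leaving the interaction effects *γ* unchanged, which is how the relative size of the two is controlled.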

In this case, the "true" treatment effect score *(Prob(Y=1|T=1) - Prob(Y=1|T=-1))* is given by

*Φ((∑_{j=1}^p (β_{j} + γ_{j})X_{j}) / σ_{0}) - Φ((∑_{j=1}^p (β_{j} - γ_{j})X_{j}) / σ_{0})*
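
The score formula can be evaluated directly with `pnorm`, the standard normal CDF in base R. The sketch below is illustrative only; the helper name `true_score` is hypothetical and is not part of the package.

```r
# Hypothetical helper: evaluates the "true" treatment effect score for each row of X.
true_score <- function(X, sigma = 1, beta.den = 4) {
  X <- as.matrix(X)
  p <- ncol(X)
  stopifnot(p >= 4)
  j <- seq_len(p)
  gamma <- c(0.5, -0.5, 0.5, -0.5, rep(0, p - 4))
  beta <- (-1)^(j + 1) * (j >= 3 & j <= 10) / beta.den
  # Prob(Y = 1 | T = 1) - Prob(Y = 1 | T = -1)
  pnorm(drop(X %*% (beta + gamma)) / sigma) -
    pnorm(drop(X %*% (beta - gamma)) / sigma)
}

# One subject with X1 = 1 and all other predictors 0:
# beta_1 = 0 and gamma_1 = 1/2, so the score is pnorm(0.5) - pnorm(-0.5)
x <- matrix(c(1, rep(0, 9)), nrow = 1)
true_score(x)
```

A subject whose predictors are all zero has a score of exactly zero, since both terms reduce to *Φ(0)*.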

A data frame including the response variable (*Y*), the treatment (`treat=1`) and control (`treat=-1`) assignment, the predictor variables (*X*) and the "true" treatment effect score (`ts`).

Leo Guelman <leo.guelman@gmail.com>

Tian, L., Alizadeh, A., Gentles, A. and Tibshirani, R. (2012). A simple method for detecting interactions between a treatment and a large number of covariates. Submitted December 2012. arXiv:1212.2995 [stat.ME].

Guelman, L., Guillen, M. and Perez-Marin, A.M. (2013). Optimal personalized treatment rules for marketing interventions: A review of methods, a new proposal, and an insurance case study. *Submitted*.

library(uplift)
### Simulate training data
set.seed(12345)
dd <- sim_pte(n = 1000, p = 10, rho = 0, sigma = sqrt(2), beta.den = 4)
dd$treat <- ifelse(dd$treat == 1, 1, 0) # required coding for upliftRF
### Fit model
form <- as.formula(paste('y ~', 'trt(treat) +', paste('X', 1:10, sep = '', collapse = "+")))
fit1 <- upliftRF(formula = form,
                 data = dd,
                 ntree = 100,
                 split_method = "Int",
                 interaction.depth = 3,
                 minsplit = 100,
                 minbucket_ct0 = 50,
                 minbucket_ct1 = 50,
                 verbose = TRUE)
summary(fit1)
