rcate.ml: Robust estimation of treatment effect.
In rhli-Hannah/RCATE: Robust Estimation of Heterogeneous Treatment Effect


rcate.ml fits a machine learning algorithm for robust treatment effect estimation.

rcate.ml(
  x,
  y,
  d,
  method = "MCMEA",
  algorithm = "GBM",
  n.trees.p = 40000,
  shrinkage.p = 0.005,
  n.minobsinnode.p = 10,
  interaction.depth.p = 1,
  cv.p = 5,
  n.trees.mu = c(1:50) * 50,
  shrinkage.mu = 0.01,
  n.minobsinnode.mu = 5,
  interaction.depth.mu = 5,
  cv.mu = 5,
  n.trees.gbm = 1000,
  interaction.depth.gbm = 2,
  n.cells.nn = NA,
  dropout.nn = NA,
  epochs.nn = 100
)

`x`	matrix or a data frame of predictors.
`y`	vector of response values.
`d`	vector of binary treatment assignment (0 or 1).
`method`	character string naming the CATE estimation method: "MCMEA" - modified covariate method with efficiency augmentation, "RL" - R-learning, or "DR" - doubly robust method.
`algorithm`	character string naming the fitting algorithm: "GBM" - gradient boosting machine or "NN" - neural network. Random forests are available in rcate.rf.
`n.trees.p`	tuning parameter: the number of trees used for estimating the propensity score with GBM. The default value is 40000.
`shrinkage.p`	tuning parameter: the shrinkage level for estimating the propensity score with GBM. The default value is 0.005.
`n.minobsinnode.p`	tuning parameter: the minimum node size for estimating the propensity score with GBM. The default value is 10.
`interaction.depth.p`	tuning parameter: the interaction depth (maximum depth of each tree) for estimating the propensity score with GBM. The default value is 1.
`cv.p`	tuning parameter: the number of cross-validation folds for estimating the propensity score with GBM. The default value is 5.
`n.trees.mu`	scalar or vector giving the number of trees for estimating the mean function with GBM. The default is (1:50)*50.
`shrinkage.mu`	tuning parameter: the shrinkage level for estimating the mean function with GBM. The default value is 0.01.
`n.minobsinnode.mu`	tuning parameter: the minimum node size for estimating the mean function with GBM. The default value is 5.
`interaction.depth.mu`	tuning parameter: the interaction depth (maximum depth of each tree) for estimating the mean function with GBM. The default value is 5.
`cv.mu`	tuning parameter: the number of cross-validation folds for estimating the mean function with GBM. The default value is 5.
`n.trees.gbm`	tuning parameter: the number of trees used in GBM for estimating the treatment effect function if algorithm="GBM". The default value is 1000.
`interaction.depth.gbm`	tuning parameter: the interaction depth (maximum depth of each tree) for estimating the treatment effect function if algorithm="GBM". The default value is 2.
`n.cells.nn`	vector giving the number of neurons in each hidden layer if algorithm="NN". The default is two hidden layers, each half the size of the previous layer.
`dropout.nn`	vector giving the dropout rate of each hidden layer if algorithm="NN". The default is no dropout.
`epochs.nn`	scalar giving the number of epochs for the neural network if algorithm="NN". The default value is 100.
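
The "DR" option of `method` refers to a doubly robust approach, which combines a propensity score model with outcome (mean function) models. The following is a minimal base-R sketch of the textbook doubly robust pseudo-outcome, not the package's internal code: the simulated data, the glm/lm nuisance models, and all variable names are assumptions made for illustration.

```r
# Illustrative only: textbook doubly robust (AIPW) pseudo-outcome in base R.
# The data-generating process and model choices below are assumptions.
set.seed(42)
n <- 5000
x <- runif(n, -1, 1)
e_true <- plogis(0.5 * x)              # true propensity score
d <- rbinom(n, 1, e_true)              # binary treatment (0 or 1)
tau <- 1 + x                           # true treatment effect, linear in x
y <- 1 + x + tau * d + rnorm(n)

# Nuisance estimates: propensity score and arm-specific mean functions
e_hat <- fitted(glm(d ~ x, family = binomial))
out_fit <- lm(y ~ x * d)
mu1 <- predict(out_fit, newdata = data.frame(x = x, d = 1))
mu0 <- predict(out_fit, newdata = data.frame(x = x, d = 0))

# Doubly robust pseudo-outcome; its conditional mean given x is tau(x)
phi <- mu1 - mu0 + d * (y - mu1) / e_hat - (1 - d) * (y - mu0) / (1 - e_hat)

# Regress the pseudo-outcome on x to recover the effect function
dr_coef <- coef(lm(phi ~ x))
dr_coef
```

In this sketch the regression of `phi` on `x` should give coefficients close to the true intercept and slope of 1. rcate.ml replaces the linear final step with GBM or NN.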

Fits a GBM or NN to obtain a treatment effect estimate that is robust to outliers.
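
To see why robustness to outliers matters, the sketch below compares an ordinary least squares fit with a least absolute deviation (LAD) fit on contaminated data. This page does not state which loss rcate.ml uses internally, so the LAD loss, the iteratively reweighted least squares helper `lad_fit`, and the simulated data are all assumptions chosen only to illustrate the general idea.

```r
# Illustrative only: squared-error vs. absolute-error regression with outliers.
set.seed(1)
n <- 200
x <- runif(n, -3, 3)
y <- 1 + 2 * x + rnorm(n, 0, 0.5)      # true intercept 1, true slope 2
out <- sample(n, 10)
y[out] <- y[out] + 50                  # contaminate 5% of the responses

# Ordinary least squares is pulled toward the outliers
beta_l2 <- coef(lm(y ~ x))

# Hypothetical helper: LAD fit via iteratively reweighted least squares.
# Each pass weights observations by 1/|residual|, so large residuals count less.
lad_fit <- function(x, y, iters = 50, eps = 1e-6) {
  beta <- coef(lm(y ~ x))
  for (i in seq_len(iters)) {
    r <- y - (beta[1] + beta[2] * x)
    w <- 1 / pmax(abs(r), eps)
    beta <- coef(lm(y ~ x, weights = w))
  }
  beta
}
beta_l1 <- lad_fit(x, y)

rbind(least_squares = beta_l2, least_absolute = beta_l1)
```

The least squares intercept is shifted well away from 1 by the contaminated points, while the absolute-error fit stays close to the true line.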

A list with the following components:

model - the robust estimation model of CATE.
method - estimation method.
algorithm - fitting algorithm.
fitted.values - vector of fitted values.
x - matrix of predictors.
y - vector of response values.
d - vector of treatment assignment.
y.tr - vector of transformed outcome.
w.tr - vector of transformed weight.
n.trees.gbm - number of trees for estimating treatment effect function if algorithm='GBM'.
history - model fitting history.
importance - variable importance level.
param - required parameters for utility functions.
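
The `y.tr` and `w.tr` components hold a transformed outcome and a transformed weight. As a rough illustration of what such a pair can look like, the sketch below builds the standard R-learning transformation in base R and fits a weighted regression to it. The exact transformation used inside the package may differ: the formulas, the simulated data, and the names `y_tr` and `w_tr` are assumptions for illustration.

```r
# Illustrative only: a standard R-learning transformed outcome and weight.
set.seed(2223)
n <- 2000
x1 <- runif(n, -3, 3)
x2 <- runif(n, -3, 3)
d <- rbinom(n, 1, 0.5)                 # randomized binary treatment
tau <- 1 + 2 * x1                      # true treatment effect
y <- x1 + x2 + d * tau + rnorm(n)

# Nuisance estimates: propensity score e(x) and mean function m(x) = E[y | x]
e_hat <- fitted(glm(d ~ x1 + x2, family = binomial))
m_hat <- fitted(lm(y ~ x1 + x2))

# Minimizing sum(((y - m) - (d - e) * tau(x))^2) is the same as a weighted
# regression of y_tr on x with weights w_tr
y_tr <- (y - m_hat) / (d - e_hat)
w_tr <- (d - e_hat)^2

cate_coef <- coef(lm(y_tr ~ x1 + x2, weights = w_tr))
cate_coef
```

Here the weighted fit should return an intercept near 1, an `x1` coefficient near 2, and an `x2` coefficient near 0, matching the simulated effect.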

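library(RCATE)
# Simulate training data: 3 uniform predictors, a nonlinear treatment effect tau,
# and treatment assigned with a covariate-dependent propensity score ps
n <- 1000; p <- 3; set.seed(2223)
X <- as.data.frame(matrix(runif(n*p,-3,3),nrow=n,ncol=p))
tau <- 6*sin(2*X[,1])+3*(X[,2]+3)*X[,3]
ps <- 1/(1+exp(-X[,1]+X[,2]))
d <- rbinom(n,1,ps)
t <- 2*d-1
y <- 100+4*X[,1]+X[,2]-3*X[,3]+tau*t/2 + rnorm(n,0,1)
# Simulate a validation set and its true treatment effect
set.seed(2223)
x_val <- as.data.frame(matrix(rnorm(200*3,0,1),nrow=200,ncol=3))
tau_val <- 6*sin(2*x_val[,1])+3*(x_val[,2]+3)*x_val[,3]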
# Use R-learning method and GBM to estimate CATE
fit <- rcate.ml(X,y,d,method='RL',algorithm='GBM')
y_pred <- predict(fit,x_val)$predict
plot(tau_val,y_pred);abline(0,1)

# Use doubly robust method and neural network to estimate CATE
fit <- rcate.ml(X,y,d,method='DR',algorithm='NN',dropout.nn=c(0),n.cells.nn=c(3,3))