| ate | R Documentation | 
Use the g-formula or the IPW or the double robust estimator to estimate the average treatment effect (absolute risk difference or ratio) based on Cox regression with or without competing risks.
ate(
  event,
  treatment,
  censor = NULL,
  data,
  data.index = NULL,
  estimator = NULL,
  strata = NULL,
  contrasts = NULL,
  allContrasts = NULL,
  times,
  cause = NA,
  se = TRUE,
  iid = (B == 0) && (se || band),
  known.nuisance = FALSE,
  band = FALSE,
  B = 0,
  seed,
  handler = "foreach",
  mc.cores = 1,
  cl = NULL,
  verbose = TRUE,
  ...
)
| event | Outcome model which describes how the probability of experiencing a terminal event depends
on treatment and covariates. The object carry its own call and have a  | 
| treatment | Name of the treatment variable or treatment model which describes how the probability of being allocated to a treatment group depends on covariates. See details section and examples section. | 
| censor | Censoring model which describes how the probability of being censored depends on treatment and covariates. See details section and examples section. | 
| data | [data.frame or data.table] Data set in which to evaluate risk predictions based on the outcome model | 
| data.index | [numeric vector] Position of the observation in argument data relative to the dataset used to obtain the argument event, treatment, censor. Only necessary for the standard errors when computing the Average Treatment Effects on a subset of the data set. | 
| estimator | [character] The type of estimator used to compute the average treatment effect. 
Can be  | 
| strata | [character] Strata variable on which to compute the average risk. Incompatible with treatment. Experimental. | 
| contrasts | [character vector] levels of the treatment variable for which the risks should be assessed and compared. Default is to consider all levels. | 
| allContrasts | [2-row character matrix] levels of the treatment variable to be compared. Default is to consider all pairwise comparisons. | 
| times | [numeric vector] Time points at which to evaluate average treatment effects. | 
| cause | [integer/character] the cause of interest. | 
| se | [logical] If  | 
| iid | [logical] If  | 
| known.nuisance | [logical] If  | 
| band | [logical] If  | 
| B | [integer, >0] the number of bootstrap replications used to compute the confidence intervals. If it equals 0, then the influence function is used to compute Wald-type confidence intervals/bands. | 
| seed | [integer, >0] sed number used to generate seeds for bootstrap and to achieve reproducible results. | 
| handler | [character] Parallel handler for bootstrap. 
 | 
| mc.cores | [integer, >0] The number of cores to use,
i.e., the upper limit for the number of child processes that run simultaneously.
Passed to  | 
| cl | A parallel socket cluster used to perform cluster calculation in parallel (output by  | 
| verbose | [logical] If  | 
| ... | passed to predictRisk | 
Argument event:
IPTW estimator: a character vector indicating the event and status variables.
 G-formula or AIPTW estimator: a survival model (e.g. Cox "survival::coxph", Kaplan Meier "prodlim::prodlim"),
a competing risk model (e.g. cause-specific Cox "riskRegression::CSC"),
a logistic model (e.g. "stats::glm" when argument censor is NULL otherwise "riskRegression::wglm").
Other models can be specified, provided a suitable predictRisk method exists, but the standard error should be computed using non-parametric bootstrap,
as the influence function of the estimator will typically not be available.
Argument treatment:
G-formula estimator: a character indicating the treatment variable.
 IPTW or AIPTW estimator: a "stats::glm" model with family "binomial" (two treatment options) or a "nnet::multinom" (more than two treatment options).
Argument censor:
G-formula estimator: NULL
 IPTW or AIPTW estimator: NULL if no censoring and otherwise a survival model (e.g. Cox "survival::coxph", Kaplan Meier "prodlim::prodlim") 
Argument estimator: when set to "AIPTW" with argument event being IPCW logistic model ("riskRegression::wglm"), the integral term w.r.t. to the martingale of the censoring process is not computed,
i.e. using the notation of Ozenne et al. (2020) an AIPTW,IPCW estimator instead of an AIPTW,AIPCW is evaluated. 
In presence of censoring, the computation time and memory usage for the evaluation of the AIPTW estimator and its uncertainty do not scale well with the number of observations (n) or the number of unique timepoints (T). Point estimation involves n by T matrices, influence function involves n by T by n arrays.
for large datasets (e.g. n>5000), bootstrap is recommended as the memory need for the influence function is likely prohibitive.
 it is possible to decrease the memory usage for the point estimation by setting the (hidden) argument store=c(size.split=1000). The integral term of the AIPTW estimator is then evaluated for 1000 observations at a time, i.e. involing matrices of size 1000 by T instead of n by T. This may lead to increased computation time.
reducing the number of unique timepoints (e.g. by rounding them) will lead to an approximate estimation procedure that is less demanding both in term of computation and memory. The resulting estimator will be more variable than the one based on the original timepoints (i.e. wider confidence intervals).
Brice Ozenne broz@sund.ku.dk and Thomas Alexander Gerds tag@biostat.ku.dk
Brice Maxime Hugues Ozenne, Thomas Harder Scheike, Laila Staerk, and Thomas Alexander Gerds. On the estimation of average treatment effects with right- censored time to event outcome and competing risks. Biometrical Journal, 62 (3):751–763, 2020.
autoplot.ate for a graphical representation the standardized risks. 
coef.ate to output estimates for the the average risk, or difference in average risks, or ratio between average risks. 
confint.ate to output a list containing all estimates (average & difference & ratio) with their confidence intervals and p-values. 
model.tables.ate to output a data.frame containing one type of estimates (average or difference or ratio) with its confidence intervals and p-values.   
summary.ate for displaying in the console a summary of the results.
library(survival)
library(prodlim)
library(data.table)
############################
#### Survival settings  ####
#### ATE with Cox model ####
############################
#### generate data ####
n <- 100
set.seed(10)
dtS <- sampleData(n, outcome="survival")
dtS$time <- round(dtS$time,1)
dtS$X1 <- factor(rbinom(n, prob = c(0.3,0.4) , size = 2), labels = paste0("T",0:2))
##### estimate the survival model ####
## Cox proportional hazard model
fit.Cox <- coxph(Surv(time,event)~ X1+X2, data=dtS, y=TRUE, x=TRUE)
## Kaplan Meier - same as copxh(Surv(time,event)~ strata(X1)+strata(X2), ties = "breslow")
fit.KM <- prodlim(Hist(time,event)~ X1+X2, data=dtS)
##### compute the ATE ####
## at times 5, 6, 7, and 8 using X1 as the treatment variable
## standard error computed using the influence function
## confidence intervals / p-values based on asymptotic results
ateFit1a <- ate(fit.Cox, treatment = "X1", times = 5:8, data = dtS)
summary(ateFit1a)
model.tables(ateFit1a, type = "meanRisk")
model.tables(ateFit1a, type = "diffRisk")
model.tables(ateFit1a, type = "ratioRisk")
## relaxing PH assumption using a stratified model
ateFit1a.NPH <- ate(fit.KM, treatment = "X1", times = 5:8, data = dtS,
                    product.limit = FALSE)
model.tables(ateFit1a.NPH, times = 5) ## no PH assumption
model.tables(ateFit1a, times = 5)  ## PH assumption
## Not run: 
#### ATE with confidence bands ####
## same as before with in addition the confidence bands / adjusted p-values
## (argument band = TRUE)
ateFit1b <- ate(fit.Cox, treatment = "X1", times = 5:8, data = dtS, band = TRUE)
summary(ateFit1b)
## by default bands/adjuste p-values computed separately for each treatment modality
summary(ateFit1b, band = 1,
        se = FALSE, type = "diffRisk", short = TRUE, quantile = TRUE)
## adjustment over treatment and time using the band argument of confint
summary(ateFit1b, band = 2,
       se = FALSE, type = "diffRisk", short = TRUE, quantile = TRUE)
## End(Not run)
## Not run: 
#### ATE with non-parametric bootstrap ####
## confidence intervals / p-values computed using 1000 bootstrap samples
## (argument se = TRUE and B = 1000) 
ateFit1c <- ate(fit.Cox, treatment = "X1", times = 5:8, data = dtS,
                se = TRUE, B = 50, handler = "mclapply")
## NOTE: for real applications 50 bootstrap samples is not enough 
## same but using 2 cpus for generating and analyzing the bootstrap samples
## (parallel computation, argument mc.cores = 2) 
ateFit1d <- ate(fit.Cox, treatment = "X1", times = 5:8, data = dtS, 
                se = TRUE, B = 50, mc.cores = 2)
## manually defining the cluster to be used
## useful when specific packages need to be loaded in each cluster
fit.CoxNL <- coxph(Surv(time,event)~ X1+X2+rcs(X6),data=dtS,y=TRUE,x=TRUE)
cl <- parallel::makeCluster(2)
parallel::clusterEvalQ(cl, library(rms))
ateFit1e <- ate(fit.CoxNL, treatment = "X1", times = 5:8, data = dtS, 
                se = TRUE, B = 50, handler = "foreach", cl = cl)
parallel::stopCluster(cl)
## End(Not run)
################################################
#### Competing risks settings               ####
#### ATE with cause specific Cox regression ####
################################################
#### generate data ####
n <- 500
set.seed(10)
dt <- sampleData(n, outcome="competing.risks")
dt$X1 <- factor(rbinom(n, prob = c(0.2,0.3) , size = 2), labels = paste0("T",0:2))
#### estimate cause specific Cox model ####
fit.CR <-  CSC(Hist(time,event)~ X1+X8,data=dt,cause=1)
#### compute the ATE ####
## at times 1, 5, 10
## using X1 as the treatment variable
ateFit2a <- ate(fit.CR, treatment = "X1", times = c(1,5,10), data = dt, 
                cause = 1, se = TRUE, band = TRUE)
summary(ateFit2a)
data.table::as.data.table(ateFit2a)
#############################################
#### Survival settings without censoring ####
#### ATE with glm                        ####
#############################################
#### generate data ####
n <- 100
dtB <- sampleData(n, outcome="binary")
dtB[, X2 := as.numeric(X2)]
##### estimate a logistic regression model ####
fit.glm <- glm(formula = Y ~ X1+X2, data=dtB, family = "binomial")
#### compute the ATE ####
## using X1 as the treatment variable
## only point estimate (argument se = FALSE)
ateFit1a <- ate(fit.glm, treatment = "X1", data = dtB, se = FALSE)
ateFit1a
## Not run: 
## with confidence intervals
ateFit1b <- ate(fit.glm, data = dtB, treatment = "X1",
               times = 5) ## just for having a nice output not used in computations
summary(ateFit1b, short = TRUE)
## using the lava package
library(lava)
ateLava <- estimate(fit.glm, function(p, data){
a <- p["(Intercept)"] ; b <- p["X11"] ; c <- p["X2"] ;
R.X11 <- expit(a + b + c * data[["X2"]])
R.X10 <- expit(a + c * data[["X2"]])
list(risk0=R.X10,risk1=R.X11,riskdiff=R.X11-R.X10)},
average=TRUE)
ateLava
## End(Not run)
############################
#### Survival settings  ####
#### ATE with glm       ####
############################
## see wglm for handling right-censoring with glm
#################################
#### Double robust estimator ####
#################################
## Not run: 
## generate data
n <- 500
set.seed(10)
dt <- sampleData(n, outcome="competing.risks")
dt$X1 <- factor(rbinom(n, prob = c(0.4) , size = 1), labels = paste0("T",0:1))
## working models
m.event <-  CSC(Hist(time,event)~ X1+X2+X3+X5+X8,data=dt)
m.censor <-  coxph(Surv(time,event==0)~ X1+X2+X3+X5+X8,data=dt, x = TRUE, y = TRUE)
m.treatment <-  glm(X1~X2+X3+X5+X8,data=dt,family=binomial(link="logit"))
## prediction + average
ateRobust <- ate(event = m.event,
                 treatment = m.treatment,
                 censor = m.censor,
                 data = dt, times = 5:10, 
                 cause = 1)
summary(ateRobust)
## compare various estimators
system.time( ## about 1.5s
ateRobust2 <- ate(event = m.event,
                 treatment = m.treatment,
                 censor = m.censor,
                 estimator = c("GFORMULA","IPTW","AIPTW"),
                 data = dt, times = c(5:10), 
                 cause = 1, se = TRUE)
)
data.table::as.data.table(ateRobust2, type = "meanRisk")
data.table::as.data.table(ateRobust2, type = "diffRisk")
## reduce memory load by computing the integral term
## on only 100 observations at a time (only relevant when iid = FALSE)
ateRobust3 <- ate(event = m.event,
                 treatment = m.treatment,
                 censor = m.censor,
                 data = dt, times = 5:10, 
                 cause = 1, se = FALSE,
                 store = c("size.split" = 100), verbose = 2)
coef(ateRobust3) - coef(ateRobust) ## same
## approximation to speed up calculations
dt$time.round <- round(dt$time/0.5)*0.5 ## round to the nearest half
dt$time.round[dt$time.round==0] <- 0.1 ## ensure strictly positive event times
mRound.event <-  CSC(Hist(time.round,event)~ X1+X2+X3+X5+X8,data=dt)
mRound.censor <-  coxph(Surv(time.round,event==0)~ X1+X2+X3+X5+X8,data=dt, x = TRUE, y = TRUE)
system.time( ## about 0.4s
ateRobustFast <- ate(event = mRound.event, treatment = m.treatment,
                    censor = mRound.censor,
                 estimator = c("GFORMULA","IPTW","AIPTW"), product.limit = FALSE,
                 data = dt, times = c(5:10), cause = 1, se = TRUE)
)
coef(ateRobustFast) - coef(ateRobust) ## inaccuracy
## End(Not run)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.