ate: Compute the Average Treatment Effects Via the G-formula
In bozenne/riskRegressionLight: Light version of the package riskRegression

Description Usage Arguments Author(s) See Also Examples

Use the g-formula to estimate the average treatment effect based on Cox regression with or without competing risks.

ate(object, data, formula, treatment, strata = NULL, contrasts = NULL,
  times, cause, landmark, se = TRUE, iid = FALSE, band = FALSE,
  B = 0, confint = (se + band) > 0, seed, handler = "foreach",
  mc.cores = 1, cl = NULL, verbose = TRUE, store.iid = "full", ...)

`object`	Outcome model which describes how event risk depends on treatment and covariates. The object carry its own call and have a `predictRisk` method. See examples.
`data`	[data.frame or data.table] Data set in which to evaluate risk predictions based on the outcome model
`formula`	For analyses with time-dependent covariates, the response formula. See examples.
`treatment`	[character] Name of the treatment variable.
`strata`	[character] Strata variable on which to compute the average risk. Incompatible with treatment. Experimental.
`contrasts`	[character] The levels of the treatment variable to be compared.
`times`	[numeric vector] Time points at which to evaluate average treatment effects.
`cause`	[integer/character] the cause of interest.
`landmark`	for models with time-dependent covariates the landmark time(s) of evaluation. In this case, argument `time` may only be one value and for the prediction of risks it is assumed that that the covariates do not change between landmark and landmark+time.
`se`	[logical] If `TRUE` compute and add the standard errors to the output.
`iid`	[logical] If `TRUE` compute and add the influence function to the output.
`band`	[logical] If `TRUE` compute and add the quantiles for the confidence bands to the output.
`B`	[integer, >0] the number of bootstrap replications used to compute the confidence intervals. If it equals 0, then the influence function is used to compute Wald-type confidence intervals/bands.
`confint`	[logical] If `TRUE` compute and add the confidence intervals/bands to the output. They are computed applying the `confint` function to the output.
`seed`	[integer, >0] sed number used to generate seeds for bootstrap and to achieve reproducible results.
`handler`	[character] Parallel handler for bootstrap. either `"mclapply"` or `"foreach"`. if "foreach" use `doparallel` to create a cluster.
`mc.cores`	[integer, >0] The number of cores to use, i.e., the upper limit for the number of child processes that run simultaneously. Passed to `parallel::mclapply` or `doparallel::registerdoparallel`. The option is initialized from environment variable mc_cores if set.
`cl`	A parallel socket cluster used to perform cluster calculation in parallel. Output by `parallel::makeCluster`. The packages necessary to run the computations (e.g. riskRegression) must already be loaded on each worker.
`verbose`	[logical] If `TRUE` inform about estimated run time.
`store.iid`	[character] Implementation used to estimate the standard error. Can be `"full"` or `"minimal"`. `"minimal"` requires less memory but can only estimate the standard for the difference between treatment effects (and not for the ratio).
`...`	passed to predictRisk

Brice Ozenne broz@sund.ku.dk and Thomas Alexander Gerds tag@biostat.ku.dk

confint.ate to compute confidence intervals/bands. autoplot.ate to display the average risk. ateRobust to make the estimator doubly robust

library(survival)
library(rms)
library(prodlim)
set.seed(10)

#### Survival settings  ####
#### ATE with Cox model ####

## generate data
n <- 100
dtS <- sampleData(n, outcome="survival")
dtS$time <- round(dtS$time,1)
dtS$X1 <- factor(rbinom(n, prob = c(0.3,0.4) , size = 2), labels = paste0("T",0:2))

## estimate the Cox model
fit <- cph(formula = Surv(time,event)~ X1+X2,data=dtS,y=TRUE,x=TRUE)

## compute the ATE at times 5, 6, 7, and 8 using X1 as the treatment variable
## Not run: 
## only punctual estimate (argument se = FALSE)
ateFit1a <- ate(fit, data = dtS, treatment = "X1", times = 5:8,
               se = TRUE)

## standard error / confidence intervals computed using the influence function
## (argument se = TRUE and B = 0)
ateFit1b <- ate(fit, data = dtS, treatment = "X1", times = 5:8,
               se = TRUE, B = 0)

## same as before with in addition the confidence bands for the ATE
## (argument band = TRUE)
ateFit1c <- ate(fit, data = dtS, treatment = "X1", times = 5:8,
               se = TRUE, band = TRUE, B = 0)

## bootstrap confidence intervals
ateFit1c <- ate(fit, data = dtS, treatment = "X1", times = 5,
               seed = 3, se = TRUE, B = 100)

## standard error / confidence intervals computed using 100 boostrap samples
## (argument se = TRUE and B = 100) 
ateFit1d <- ate(fit, data = dtS, treatment = "X1",
                times = 5:8, se = TRUE, B = 100)
## NOTE: for real applications 100 bootstrap samples is not enougth 

## same but using 2 cpus for generating and analyzing the boostrap samples
## (parallel computation, argument mc.cores = 2) 
ateFit1e <- ate(fit, data = dtS, treatment = "X1",
                times = 5:8, se = TRUE, B = 100, mc.cores = 2)

## End(Not run)

#### Survival settings without censoring ####
#### ATE with glm                        ####

## generate data
n <- 100
dtB <- sampleData(n, outcome="binary")
dtB[, X2 := as.numeric(X2)]

## estimate a logistic regression model
fit <- glm(formula = Y ~ X1+X2, data=dtB, family = "binomial")

## compute the ATE using X1 as the treatment variable
## only point estimate (argument se = FALSE)
ateFit1a <- ate(fit, data = dtB, treatment = "X1", se = FALSE)
ateFit1a

## Not run: 
## standard error / confidence intervals computed using the influence function
ateFit1b <- ate(fit, data = dtB, treatment = "X1",
               times = 5, ## just for having a nice output not used in computations
               se = TRUE, B = 0)
ateFit1b

## standard error / confidence intervals computed using 100 boostrap samples
ateFit1d <- ate(fit, data = dtB, treatment = "X1",
                times = 5, se = TRUE, B = 100)
ateFit1d

## using lava
ateLava <- estimate(fit, function(p, data){
a <- p["(Intercept)"] ; b <- p["X11"] ; c <- p["X2"] ;
R.X11 <- expit(a + b + c * data[["X2"]])
R.X10 <- expit(a + c * data[["X2"]])
list(risk0=R.X10,risk1=R.X11,riskdiff=R.X10-R.X11)},
average=TRUE)
ateLava

## End(Not run)

#### Competing risks settings               ####
#### ATE with cause specific Cox regression ####

## Not run: 
## generate data
n <- 500
dt <- sampleData(n, outcome="competing.risks")
dt$time <- round(dt$time,1)
dt$X1 <- factor(rbinom(n, prob = c(0.2,0.3,0.2) , size = 3), labels = paste0("T",0:3))

## estimate cause specific Cox model
fitCR <-  CSC(Hist(time,event)~ X1+X8,data=dt,cause=1)

## compute the ATE at times 5, 6, 7, and 8 using X1 as the treatment variable
ateFit2a <- ate(fitCR, data = dt, treatment = "X1", times = 5:8, cause = 1,
               se = FALSE)

## standard error / confidence intervals computed using the influence function
## (argument se = TRUE and B = 0)
ateFit2b <- ate(fitCR, data = dt, treatment = "X1", times = 5:8, cause = 1,
               se = TRUE, B = 0)

## same as before with in addition the confidence bands for the ATE
## (argument band = TRUE)
ateFit2c <- ate(fitCR, data = dt, treatment = "X1", times = 5:8, cause = 1,
               se = TRUE, band = TRUE, B = 0)

## standard error / confidence intervals computed using 100 boostrap samples
## (argument se = TRUE and B = 100) 
ateFit2d <- ate(fitCR, data = dt, treatment = "X1", times = 5:8, cause = 1,
                se = TRUE, B = 100)
## NOTE: for real applications 100 bootstrap samples is not enougth 

## same but using 2 cpus for generating and analyzing the boostrap samples
## (parallel computation, argument mc.cores = 2) 
ateFit2e <- ate(fitCR, data = dt, treatment = "X1", times = 5:8, cause = 1,
                se = TRUE, B = 100, mc.cores = 2)

## End(Not run)

#### time-dependent covariates ###
## Not run: 
library(survival)
fit <- coxph(Surv(time, status) ~ celltype+karno + age + trt, veteran)
vet2 <- survSplit(Surv(time, status) ~., veteran,
                       cut=c(60, 120), episode ="timegroup")
fitTD <- coxph(Surv(tstart, time, status) ~ celltype+karno + age + trt,
               data= vet2,x=1)
set.seed(16)
resVet <- ate(fitTD,formula=Hist(entry=tstart,time=time,event=status)~1,
          data = vet2, treatment = "celltype", contrasts = NULL,
        times=5,verbose=1,
        landmark = c(0,30,60,90), cause = 1, B = 4, se = 1,
        band = FALSE, mc.cores=1)
resVet

## End(Not run)

## Not run: 
set.seed(137)
d=sampleDataTD(127)
library(survival)
d[,status:=1*(event==1)]
## ignore competing risks
cox1TD <- coxph(Surv(start,time, status,type="counting") ~ X3+X5+X6+X8, data=d)
resTD1 <- ate(cox1TD,formula=Hist(entry=start,time=time,event=status)~1,
        data = d, treatment = "X3", contrasts = NULL,
        times=.5,verbose=1,
        landmark = c(0,0.5,1), B = 20, se = 1,
        band = FALSE, mc.cores=1)
resTD1
## adjust for competing risks
cscTD <- CSC(Hist(time=time, event=event,entry=start) ~ X3+X5+X6+X8, data=d)
set.seed(16)
resTD <- ate(cscTD,formula=Hist(entry=start,time=time,event=event)~1,
        data = d, treatment = "X3", contrasts = NULL,
        times=.5,verbose=1,
        landmark = c(0,0.5,1), cause = 1, B = 20, se = 1,
        band = FALSE, mc.cores=1)
resTD

## End(Not run)