ate: Compute the Average Treatment Effects Via the G-formula

Description Usage Arguments Author(s) See Also Examples

Description

Use the g-formula to estimate the average treatment effect based on Cox regression with or without competing risks.

Usage

1
2
3
4
ate(object, data, formula, treatment, strata = NULL, contrasts = NULL,
  times, cause, landmark, se = TRUE, iid = FALSE, band = FALSE,
  B = 0, confint = (se + band) > 0, seed, handler = "foreach",
  mc.cores = 1, cl = NULL, verbose = TRUE, store.iid = "full", ...)

Arguments

object

Outcome model which describes how event risk depends on treatment and covariates. The object carry its own call and have a predictRisk method. See examples.

data

[data.frame or data.table] Data set in which to evaluate risk predictions based on the outcome model

formula

For analyses with time-dependent covariates, the response formula. See examples.

treatment

[character] Name of the treatment variable.

strata

[character] Strata variable on which to compute the average risk. Incompatible with treatment. Experimental.

contrasts

[character] The levels of the treatment variable to be compared.

times

[numeric vector] Time points at which to evaluate average treatment effects.

cause

[integer/character] the cause of interest.

landmark

for models with time-dependent covariates the landmark time(s) of evaluation. In this case, argument time may only be one value and for the prediction of risks it is assumed that that the covariates do not change between landmark and landmark+time.

se

[logical] If TRUE compute and add the standard errors to the output.

iid

[logical] If TRUE compute and add the influence function to the output.

band

[logical] If TRUE compute and add the quantiles for the confidence bands to the output.

B

[integer, >0] the number of bootstrap replications used to compute the confidence intervals. If it equals 0, then the influence function is used to compute Wald-type confidence intervals/bands.

confint

[logical] If TRUE compute and add the confidence intervals/bands to the output. They are computed applying the confint function to the output.

seed

[integer, >0] sed number used to generate seeds for bootstrap and to achieve reproducible results.

handler

[character] Parallel handler for bootstrap. either "mclapply" or "foreach". if "foreach" use doparallel to create a cluster.

mc.cores

[integer, >0] The number of cores to use, i.e., the upper limit for the number of child processes that run simultaneously. Passed to parallel::mclapply or doparallel::registerdoparallel. The option is initialized from environment variable mc_cores if set.

cl

A parallel socket cluster used to perform cluster calculation in parallel. Output by parallel::makeCluster. The packages necessary to run the computations (e.g. riskRegression) must already be loaded on each worker.

verbose

[logical] If TRUE inform about estimated run time.

store.iid

[character] Implementation used to estimate the standard error. Can be "full" or "minimal". "minimal" requires less memory but can only estimate the standard for the difference between treatment effects (and not for the ratio).

...

passed to predictRisk

Author(s)

Brice Ozenne broz@sund.ku.dk and Thomas Alexander Gerds tag@biostat.ku.dk

See Also

confint.ate to compute confidence intervals/bands. autoplot.ate to display the average risk. ateRobust to make the estimator doubly robust

Examples

  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
library(survival)
library(rms)
library(prodlim)
set.seed(10)

#### Survival settings  ####
#### ATE with Cox model ####

## generate data
n <- 100
dtS <- sampleData(n, outcome="survival")
dtS$time <- round(dtS$time,1)
dtS$X1 <- factor(rbinom(n, prob = c(0.3,0.4) , size = 2), labels = paste0("T",0:2))

## estimate the Cox model
fit <- cph(formula = Surv(time,event)~ X1+X2,data=dtS,y=TRUE,x=TRUE)

## compute the ATE at times 5, 6, 7, and 8 using X1 as the treatment variable
## Not run: 
## only punctual estimate (argument se = FALSE)
ateFit1a <- ate(fit, data = dtS, treatment = "X1", times = 5:8,
               se = TRUE)

## standard error / confidence intervals computed using the influence function
## (argument se = TRUE and B = 0)
ateFit1b <- ate(fit, data = dtS, treatment = "X1", times = 5:8,
               se = TRUE, B = 0)

## same as before with in addition the confidence bands for the ATE
## (argument band = TRUE)
ateFit1c <- ate(fit, data = dtS, treatment = "X1", times = 5:8,
               se = TRUE, band = TRUE, B = 0)

## bootstrap confidence intervals
ateFit1c <- ate(fit, data = dtS, treatment = "X1", times = 5,
               seed = 3, se = TRUE, B = 100)

## standard error / confidence intervals computed using 100 boostrap samples
## (argument se = TRUE and B = 100) 
ateFit1d <- ate(fit, data = dtS, treatment = "X1",
                times = 5:8, se = TRUE, B = 100)
## NOTE: for real applications 100 bootstrap samples is not enougth 

## same but using 2 cpus for generating and analyzing the boostrap samples
## (parallel computation, argument mc.cores = 2) 
ateFit1e <- ate(fit, data = dtS, treatment = "X1",
                times = 5:8, se = TRUE, B = 100, mc.cores = 2)

## End(Not run)

#### Survival settings without censoring ####
#### ATE with glm                        ####

## generate data
n <- 100
dtB <- sampleData(n, outcome="binary")
dtB[, X2 := as.numeric(X2)]

## estimate a logistic regression model
fit <- glm(formula = Y ~ X1+X2, data=dtB, family = "binomial")

## compute the ATE using X1 as the treatment variable
## only point estimate (argument se = FALSE)
ateFit1a <- ate(fit, data = dtB, treatment = "X1", se = FALSE)
ateFit1a

## Not run: 
## standard error / confidence intervals computed using the influence function
ateFit1b <- ate(fit, data = dtB, treatment = "X1",
               times = 5, ## just for having a nice output not used in computations
               se = TRUE, B = 0)
ateFit1b

## standard error / confidence intervals computed using 100 boostrap samples
ateFit1d <- ate(fit, data = dtB, treatment = "X1",
                times = 5, se = TRUE, B = 100)
ateFit1d

## using lava
ateLava <- estimate(fit, function(p, data){
a <- p["(Intercept)"] ; b <- p["X11"] ; c <- p["X2"] ;
R.X11 <- expit(a + b + c * data[["X2"]])
R.X10 <- expit(a + c * data[["X2"]])
list(risk0=R.X10,risk1=R.X11,riskdiff=R.X10-R.X11)},
average=TRUE)
ateLava

## End(Not run)

#### Competing risks settings               ####
#### ATE with cause specific Cox regression ####

## Not run: 
## generate data
n <- 500
dt <- sampleData(n, outcome="competing.risks")
dt$time <- round(dt$time,1)
dt$X1 <- factor(rbinom(n, prob = c(0.2,0.3,0.2) , size = 3), labels = paste0("T",0:3))

## estimate cause specific Cox model
fitCR <-  CSC(Hist(time,event)~ X1+X8,data=dt,cause=1)

## compute the ATE at times 5, 6, 7, and 8 using X1 as the treatment variable
ateFit2a <- ate(fitCR, data = dt, treatment = "X1", times = 5:8, cause = 1,
               se = FALSE)

## standard error / confidence intervals computed using the influence function
## (argument se = TRUE and B = 0)
ateFit2b <- ate(fitCR, data = dt, treatment = "X1", times = 5:8, cause = 1,
               se = TRUE, B = 0)

## same as before with in addition the confidence bands for the ATE
## (argument band = TRUE)
ateFit2c <- ate(fitCR, data = dt, treatment = "X1", times = 5:8, cause = 1,
               se = TRUE, band = TRUE, B = 0)

## standard error / confidence intervals computed using 100 boostrap samples
## (argument se = TRUE and B = 100) 
ateFit2d <- ate(fitCR, data = dt, treatment = "X1", times = 5:8, cause = 1,
                se = TRUE, B = 100)
## NOTE: for real applications 100 bootstrap samples is not enougth 

## same but using 2 cpus for generating and analyzing the boostrap samples
## (parallel computation, argument mc.cores = 2) 
ateFit2e <- ate(fitCR, data = dt, treatment = "X1", times = 5:8, cause = 1,
                se = TRUE, B = 100, mc.cores = 2)

## End(Not run)

#### time-dependent covariates ###
## Not run: 
library(survival)
fit <- coxph(Surv(time, status) ~ celltype+karno + age + trt, veteran)
vet2 <- survSplit(Surv(time, status) ~., veteran,
                       cut=c(60, 120), episode ="timegroup")
fitTD <- coxph(Surv(tstart, time, status) ~ celltype+karno + age + trt,
               data= vet2,x=1)
set.seed(16)
resVet <- ate(fitTD,formula=Hist(entry=tstart,time=time,event=status)~1,
          data = vet2, treatment = "celltype", contrasts = NULL,
        times=5,verbose=1,
        landmark = c(0,30,60,90), cause = 1, B = 4, se = 1,
        band = FALSE, mc.cores=1)
resVet

## End(Not run)

## Not run: 
set.seed(137)
d=sampleDataTD(127)
library(survival)
d[,status:=1*(event==1)]
## ignore competing risks
cox1TD <- coxph(Surv(start,time, status,type="counting") ~ X3+X5+X6+X8, data=d)
resTD1 <- ate(cox1TD,formula=Hist(entry=start,time=time,event=status)~1,
        data = d, treatment = "X3", contrasts = NULL,
        times=.5,verbose=1,
        landmark = c(0,0.5,1), B = 20, se = 1,
        band = FALSE, mc.cores=1)
resTD1
## adjust for competing risks
cscTD <- CSC(Hist(time=time, event=event,entry=start) ~ X3+X5+X6+X8, data=d)
set.seed(16)
resTD <- ate(cscTD,formula=Hist(entry=start,time=time,event=event)~1,
        data = d, treatment = "X3", contrasts = NULL,
        times=.5,verbose=1,
        landmark = c(0,0.5,1), cause = 1, B = 20, se = 1,
        band = FALSE, mc.cores=1)
resTD

## End(Not run)

bozenne/riskRegressionLight documentation built on May 7, 2019, 12:52 a.m.