CFsurvival | R Documentation |
This function estimates counterfactual survival functions and contrasts from right-censored data subject to potential confounding.
CFsurvival(
time,
event,
treat,
confounders,
fit.times = sort(unique(time[time > 0 & time < max(time[event == 1])])),
fit.treat = c(0, 1),
nuisance.options = list(),
conf.band = TRUE,
conf.level = 0.95,
contrasts = c("surv.diff", "surv.ratio"),
verbose = FALSE
)
time |
|
event |
|
treat |
|
confounders |
|
fit.times |
|
fit.treat |
Optional subset of |
nuisance.options |
List of options for nuisance parameter specification. See |
conf.band |
Logical indicating whether to compute simultaneous confidence bands. |
conf.level |
Desired coverage of confidence intervals/bands. |
contrasts |
Character vector indicating which (if any) contrasts of the survival functions are desired. Can include |
verbose |
Logical indicating whether progress should be printed to the command line. |
For a binary treatment measured at baseline, the counterfactual survival through time t
for treatment level a
is defined as the probability of survival up to time t
were all units in the population assigned to treatment a
. This quantity is estimated using an augmented inverse probability of treatment/censoring weighted (AIPTW) estimator. Our estimator of the counterfactual survival function requires further estimation of three nuisance parameters: the conditional probability of receiving treatment given a set of potential confounders (i.e. the treatment propensity), the conditional survival function of the event given confounders, and the conditional survival function of censoring given confounders. At the moment, we provide two ways of estimating each of these nuisance parameters. The treatment propensity can be estimated either using a logistic regression or SuperLearner (van der Laan and Polley, 2006). The conditional survival functions of event and censoring can be estimated using Cox proportional hazard models or survival random forests (Ishwaran et al.). Additionally, the estimates of any of these parameters can be passed in directly if another method is desired.
CFsurvival
by default returns 95% pointwise confidence intervals and 95% uniform confidence bands, both of which are based on the influence function of the estimator. Contrasts of the counterfactual survival under treatment and control can also be obtained, including the survival difference, survival ratio, risk ratio, and number needed to treat. Pointwise confidence intervals and uniform confidence bands for these contrasts will also be computed.
surv.df$ptwise.lower
and surv.df$ptwise.upper
are lower and upper endpoints of back-transformed pointwise confidence intervals on the logit scale. Pointwise confidence intervals are only provided for values of fit.times
that are at least as large as the smallest observed event time in the corresponding treatment cohort. surv.df$unif.ew.lower
and surv.df$unif.ew.upper
are lower and upper endpoints of an equal-width confidence band. surv.df$unif.logit.lower
and surv.df$unif.logit.upper
are lower and upper endpoints of an equi-precision confidence band on the logit scale, and trasformed back to the survival scale. This confidence band only covers values of fit.times
such that the corresponding estimated survival function is <= .99
and >= .01
.
CFsurvival
returns a named list with the following elements:
fit.times |
The time points at which the counterfactual survival curves (and contrasts) were fit. |
fit.treat |
The treatment values for which the counterfactual survival curves were fit. |
surv.df |
A data frame with the estimated counterfactual survival functions, estimated standard errors, confidence interval (ptwise.lower/ptwise.upper is constructed on the survival scale, and ptwise.logit.lower/ptwise.logit.upper is constructed on the logit scale), confidence bands (unif.ew.lower/unif.ew.upper is an equal-width confidence band on the survival scale, and unif.logit.lower/unif.logit.upper is an equi-precision confidence band on the logit scale). |
IF.vals.0, IF.vals.1 |
n x k matrices with the influence values for the counterfactual survival functions. Rows index observations, columns index time points. |
surv.diff.df |
A data frame with the estimated counterfactual survival difference (treatment survival minus control survival), as well as confidence intervals and tests of the null hypothesis that the difference equals zero. Only returned if |
surv.ratio.df |
A data frame with the estimated counterfactual survival ratio (treatment survival divided by control survival), as well as confidence intervals and tests of the null hypothesis that the ratio equals one. Note that standard errors and confidence intervals are first computed on the log scale, then exponentiated. Only returned if |
risk.ratio.df |
A data frame with the estimated counterfactual risk ratio (treatment risk divided by control risk), as well as confidence intervals and tests of the null hypothesis that the ratio equals one. Note that standard errors and confidence intervals are first computed on the log scale, then exponentiated. Only returned if |
nnt.df |
A data frame with the estimated counterfactual number needed to treat (one divided by the survival difference), as well as confidence intervals and tests of the null hypothesis that the ratio equals one. Note that standard errors and confidence intervals are first computed on the log scale, then exponentiated. Only returned if |
surv.0.unif.ew.sim.maxes, surv.0.logit.sim.maxes, surv.1.unif.ew.sim.maxes, surv.1.logit.sim.maxes |
Samples from the approximate distribution of the maximum of the limiting Gaussian processes of the counterfactual survival functions. |
surv.0.unif.ew.quant, surv.0.logit.quant, surv.1.unif.ew.quant, surv.1.logit.quant, surv.diff.unif.quant, log.risk.ratio.unif.quant, log.surv.ratio.unif.quant, nnt.unif.quant |
Estimated quantile of the approximate distribution of the maximum of the limiting Gaussian processes of the counterfactual survival functions or contrast function. |
prop.fits |
The estimated treatment propensities (if |
surv.fits |
The estimated conditional survival functions (if |
nuisance |
Predicted nuisance function values and details of the nuisance function fits. |
data |
The original time, event, and treatment data supplied to the function. |
# Define parameters
n <- 300
expit <- function(x) 1/(1 + exp(-x))
betaT <- 2; lambdaT <- 20; betaC <- 2; lambdaC <- 15
# Define the true survival functions of the control (S0) and treatment (S1) groups
S0 <- function(t) sapply(t, function(t0) integrate(function(w) .5 * pweibull(t0, shape=betaT, scale=lambdaT * exp(-w-1), lower.tail = FALSE), lower=-1, upper=1)$value)
S1 <- function(t) sapply(t, function(t0) integrate(function(w) .5 * pweibull(t0, shape=betaT, scale=lambdaT * exp(-w), lower.tail = FALSE), lower=-1, upper=1)$value)
# Simulate data
covar <- runif(n, min=-1, max=1)
g0s <- expit(.2 - covar)
rx <- rbinom(n, size=1, prob=g0s)
event.time <- rweibull(n, shape = betaT, scale = lambdaT * exp(-covar - 1 + rx))
cens.time <- rweibull(n, shape = betaC, scale = lambdaC * exp(-covar/5 - rx/5))
cens.time[cens.time > 15] <- 15
obs.time <- pmin(event.time, cens.time)
obs.event <- as.numeric(event.time <= cens.time)
# Estimate the CF survivals
fit <- CFsurvival(time = obs.time, event = obs.event, treat = rx, confounders = data.frame(covar), contrasts = c("surv.diff","surv.ratio", "risk.ratio","nnt"), verbose = TRUE, fit.times = seq(0, 14.5, by=.1), fit.treat = c(0,1),
nuisance.options = list(prop.SL.library = c("SL.mean", "SL.bayesglm"),
event.SL.library = c("survSL.km", "survSL.coxph", "survSL.weibreg", "survSL.expreg"),
cens.SL.library = c("survSL.km", "survSL.coxph", "survSL.weibreg", "survSL.expreg"),
cross.fit =TRUE, V = 5, save.nuis.fits = TRUE), conf.band = TRUE, conf.level = .95)
# It is a good idea to check the min/max of the propensity estimates and the min of the censoring estimates
# If they are very small, there may be positivity violations, which may result in invalid inference.
min(fit$nuisance$prop.pred)
max(fit$nuisance$prop.pred)
min(fit$nuisance$cens.pred.0)
min(fit$nuisance$cens.pred.1)
# Plot the output
## Not run:
library(ggplot2)
# First plot the survival curves + conf intervals + conf bands
fit$surv.df$true.surv <- c(S1(c(0, fit$fit.times)), S0(c(0, fit$fit.times)))
ggplot(fit$surv.df) +
geom_line(aes(time, true.surv, group=trt), color='black') +
geom_step(aes(time, surv, color=as.factor(trt), group=trt)) +
geom_step(aes(time, ptwise.lower, color=as.factor(trt), group=trt), linetype=2) +
geom_step(aes(time, ptwise.upper, color=as.factor(trt), group=trt), linetype=2) +
geom_step(aes(time, unif.logit.lower, color=as.factor(trt), group=trt), linetype=3) +
geom_step(aes(time, unif.logit.upper, color=as.factor(trt), group=trt), linetype=3) +
scale_color_discrete("Treatment") +
xlab("Time") +
ylab("Survival") +
coord_cartesian(xlim=c(0,15), ylim=c(0,1))
# Next plot the survival difference
fit$surv.diff.df$true.surv.diff <- c(S1(fit$surv.diff.df$time) - S0(fit$surv.diff.df$time))
ggplot(fit$surv.diff.df) +
geom_line(aes(time, true.surv.diff), color='red') +
geom_line(aes(time, surv.diff)) +
geom_line(aes(time, ptwise.lower), linetype=2) +
geom_line(aes(time, ptwise.upper), linetype=2) +
geom_line(aes(time, unif.lower), linetype=3) +
geom_line(aes(time, unif.upper), linetype=3) +
xlab("Time") +
ylab("Survival difference (treatment - control)") +
coord_cartesian(xlim=c(0,15))
fit$surv.ratio.df$true.surv.ratio <- c(S1(fit$surv.ratio.df$time) / S0(fit$surv.ratio.df$time))
ggplot(fit$surv.ratio.df) +
geom_line(aes(time, true.surv.ratio), color='red') +
geom_line(aes(time, surv.ratio)) +
geom_line(aes(time, ptwise.lower), linetype=2) +
geom_line(aes(time, ptwise.upper), linetype=2) +
geom_line(aes(time, unif.lower), linetype=3) +
geom_line(aes(time, unif.upper), linetype=3) +
xlab("Time") +
ylab("Survival ratio (treatment / control)") +
geom_hline(yintercept=0, color='blue', linetype=4) +
coord_cartesian(xlim=c(0,15), ylim=c(1,10))
# Risk ratio
fit$risk.ratio.df$true.risk.ratio <- (1 - S1(fit$risk.ratio.df$time)) / (1 - S0(fit$risk.ratio.df$time))
ggplot(fit$risk.ratio.df) +
geom_line(aes(time, true.risk.ratio), color='red') +
geom_line(aes(time, risk.ratio)) +
geom_line(aes(time, ptwise.lower), linetype=2) +
geom_line(aes(time, ptwise.upper), linetype=2) +
geom_line(aes(time, unif.lower), linetype=3) +
geom_line(aes(time, unif.upper), linetype=3) +
coord_cartesian(xlim=c(0,15), ylim=c(0,1)) +
xlab("Time") +
ylab("Risk ratio (treatment / control)")
# Number needed to treat
fit$nnt.df$true.nnt <- 1/(S1(fit$nnt.df$time) - S0(fit$nnt.df$time))
ggplot(fit$nnt.df) +
geom_line(aes(time, true.nnt), color='red') +
geom_line(aes(time, nnt)) +
geom_line(aes(time, ptwise.lower), linetype=2) +
geom_line(aes(time, ptwise.upper), linetype=2) +
geom_line(aes(time, unif.lower), linetype=3) +
geom_line(aes(time, unif.upper), linetype=3) +
coord_cartesian(xlim=c(0,15), ylim=c(0, 10)) +
xlab("Time") +
ylab("Number needed to treat (NNT)")
## End(Not run)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.