predictCox | R Documentation |
Routine to get baseline hazards and subject specific hazards
as well as survival probabilities from a survival::coxph
or rms::cph
object.
predictCox(
object,
times,
newdata = NULL,
centered = TRUE,
type = c("cumhazard", "survival"),
keep.strata = TRUE,
keep.times = TRUE,
keep.newdata = FALSE,
keep.infoVar = FALSE,
se = FALSE,
band = FALSE,
iid = FALSE,
confint = (se + band) > 0,
diag = FALSE,
average.iid = FALSE,
product.limit = FALSE,
store = NULL
)
object |
The fitted Cox regression model object either
obtained with |
times |
[numeric vector] Time points at which to return the estimated hazard/cumulative hazard/survival. |
newdata |
[data.frame or data.table] Contain the values of the predictor variables
defining subject specific predictions.
Should have the same structure as the data set used to fit the |
centered |
[logical] If |
type |
[character vector] the type of predicted value. Choices are
Several choices can be
combined in a vector of strings that match (no matter the case)
strings |
keep.strata |
[logical] If |
keep.times |
[logical] If |
keep.newdata |
[logical] If |
keep.infoVar |
[logical] For internal use. |
se |
[logical] If |
band |
[logical] If |
iid |
[logical] If |
confint |
[logical] If |
diag |
[logical] when |
average.iid |
[logical] If |
product.limit |
[logical]. If |
store |
[vector of length 2] Whether prediction should only be computed for unique covariate sets and mapped back to the original dataset ( |
When the argument newdata
is not specified, the function computes the baseline hazard estimate.
See (Ozenne et al., 2017) section "Handling of tied event times".
Otherwise the function computes survival probabilities with confidence intervals/bands. See (Ozenne et al., 2017) section "Confidence intervals and confidence bands for survival probabilities". The survival is computed using the exponential approximation (equation 3).
A detailed explanation about the meaning of the argument store = c(iid="full")
can be found
in (Ozenne et al., 2017) Appendix B "Saving the influence functions".
The function is not compatible with time varying predictor variables.
The centered argument enables us to reproduce the results obtained with the basehaz
function from the survival package but should not be modified by the user.
The iid decomposition is output in an attribute called "iid"
,
using an array containing the value of the influence of each subject used to fit the object (dim 1), for each subject in newdata (dim 3), and each time (dim 2).
Brice Ozenne broz@sund.ku.dk, Thomas A. Gerds tag@biostat.ku.dk
Brice Ozenne, Anne Lyngholm Sorensen, Thomas Scheike, Christian Torp-Pedersen and Thomas Alexander Gerds. riskRegression: Predicting the Risk of an Event using Cox Regression Models. The R Journal (2017) 9:2, pages 440-460.
confint.predictCox
to compute confidence intervals/bands.
autoplot.predictCox
to display the predictions.
library(survival)
library(data.table)
#######################
#### generate data ####
#######################
set.seed(10)
d <- sampleData(50,outcome="survival") ## training dataset
nd <- sampleData(5,outcome="survival") ## validation dataset
## add ties
d$time.round <- round(d$time,1)
any(duplicated(d$time.round))
## add categorical variables
d$U <- sample(letters[1:3],replace=TRUE,size=NROW(d))
d$V <- sample(letters[5:8],replace=TRUE,size=NROW(d))
nd <- cbind(nd,d[sample(NROW(d),replace=TRUE,size=NROW(nd)),c("U","V")])
######################
#### Kaplan Meier ####
######################
#### no ties
fit.KM <- coxph(Surv(time,event)~ 1, data=d, x = TRUE, y = TRUE)
predictCox(fit.KM) ## exponential approximation
ePL.KM <- predictCox(fit.KM, product.limit = TRUE) ## product-limit
as.data.table(ePL.KM)
range(survfit(Surv(time,event)~1, data = d)$surv - ePL.KM$survival) ## same
#### with ties (exponential approximation, Efron estimator for the baseline hazard)
fit.KM.ties <- coxph(Surv(time.round,event)~ 1, data=d, x = TRUE, y = TRUE)
predictCox(fit.KM)
## retrieving survfit results with ties
fit.KM.ties <- coxph(Surv(time.round,event)~ 1, data=d,
x = TRUE, y = TRUE, ties = "breslow")
ePL.KM.ties <- predictCox(fit.KM, product.limit = TRUE)
range(survfit(Surv(time,event)~1, data = d)$surv - ePL.KM.ties$survival) ## same
#################################
#### Stratified Kaplan Meier ####
#################################
fit.SKM <- coxph(Surv(time,event)~strata(X2), data=d, x = TRUE, y = TRUE)
ePL.SKM <- predictCox(fit.SKM, product.limit = TRUE)
ePL.SKM
range(survfit(Surv(time,event)~X2, data = d)$surv - ePL.SKM$survival) ## same
###################
#### Cox model ####
###################
fit.Cox <- coxph(Surv(time,event)~X1 + X2 + X6, data=d, x = TRUE, y = TRUE)
#### compute the baseline cumulative hazard
Cox.haz0 <- predictCox(fit.Cox)
Cox.haz0
range(survival::basehaz(fit.Cox)$hazard-Cox.haz0$cumhazard) ## same
#### compute individual specific cumulative hazard and survival probabilities
## exponential approximation
fit.predCox <- predictCox(fit.Cox, newdata=nd, times=c(3,8),
se = TRUE, band = TRUE)
fit.predCox
## product-limit
fitPL.predCox <- predictCox(fit.Cox, newdata=nd, times=c(3,8),
product.limit = TRUE, se = TRUE)
fitPL.predCox
## Note: product limit vs. exponential does not affect uncertainty quantification
range(fitPL.predCox$cumhazard.se - fit.predCox$cumhazard.se) ## same
range(fitPL.predCox$survival.se - fit.predCox$survival.se) ## different
## except through a multiplicative factor (multiply by survival in the delta method)
fitPL.predCox$survival.se - fitPL.predCox$cumhazard.se * fitPL.predCox$survival
#### left truncation
test2 <- list(start=c(1,2,5,2,1,7,3,4,8,8),
stop=c(2,3,6,7,8,9,9,9,14,17),
event=c(1,1,1,1,1,1,1,0,0,0),
x=c(1,0,0,1,0,1,1,1,0,0))
m.cph <- coxph(Surv(start, stop, event) ~ 1, test2, x = TRUE)
as.data.table(predictCox(m.cph))
basehaz(m.cph)
##############################
#### Stratified Cox model ####
##############################
#### one strata variable
fit.SCox <- coxph(Surv(time,event)~X1+strata(X2)+X6, data=d, x = TRUE, y = TRUE)
predictCox(fit.SCox, newdata=nd, times = c(3,12))
predictCox(fit.SCox, newdata=nd, times = c(3,12), product.limit = TRUE)
## NA: timepoint is beyond the last observation in the strata and censoring
tapply(d$time,d$X2,max)
#### two strata variables
fit.S2Cox <- coxph(Surv(time,event)~X1+strata(U)+strata(V)+X2,
data=d, x = TRUE, y = TRUE)
base.S2Cox <- predictCox(fit.S2Cox)
range(survival::basehaz(fit.S2Cox)$hazard - base.S2Cox$cumhazard) ## same
predictCox(fit.S2Cox, newdata=nd, times = 3)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.