cindex: Concordance index for right censored survival time data
In pec: Prediction Error Curves for Risk Prediction Models in Survival Analysis

cindex

R Documentation

Concordance index for right censored survival time data

Description

In survival analysis, a pair of patients is called concordant if the risk of the event predicted by a model is lower for the patient who experiences the event at a later timepoint. The concordance probability (C-index) is the frequency of concordant pairs among all pairs of subjects. It can be used to measure and compare the discriminative power of a risk prediction models. The function provides an inverse of the probability of censoring weigthed estimate of the concordance probability to adjust for right censoring. Cross-validation based on bootstrap resampling or bootstrap subsampling can be applied to assess and compare the discriminative power of various regression modelling strategies on the same set of data.

Usage

cindex(
  object,
  formula,
  data,
  eval.times,
  pred.times,
  cause,
  lyl = FALSE,
  cens.model = "marginal",
  ipcw.refit = FALSE,
  ipcw.args = NULL,
  ipcw.limit,
  tiedPredictionsIn = TRUE,
  tiedOutcomeIn = TRUE,
  tiedMatchIn = TRUE,
  splitMethod = "noPlan",
  B,
  M,
  model.args = NULL,
  model.parms = NULL,
  keep.models = FALSE,
  keep.residuals = FALSE,
  keep.pvalues = FALSE,
  keep.weights = FALSE,
  keep.index = FALSE,
  keep.matrix = FALSE,
  multiSplitTest = FALSE,
  testTimes,
  confInt = FALSE,
  confLevel = 0.95,
  verbose = TRUE,
  savePath = NULL,
  slaveseed = NULL,
  na.action = na.fail,
  ...
)

Arguments

`object`	A named list of prediction models, where allowed entries are (1) R-objects for which a predictSurvProb method exists (see details), (2) a `call` that evaluates to such an R-object (see examples), (3) a matrix with predicted probabilities having as many rows as `data` and as many columns as `times`. For cross-validation all objects in this list must include their `call`.
`formula`	A survival formula. The left hand side is used to finde the status response variable in `data`. For right censored data, the right hand side of the formula is used to specify conditional censoring models. For example, set `Surv(time,status)~x1+x2` and `cens.model="cox"`. Then the weights are based on a Cox regression model for the censoring times with predictors x1 and x2. Note that the usual coding is assumed: `status=0` for censored times and that each variable name that appears in `formula` must be the column name in `data`. If there are no covariates, i.e. `formula=Surv(time,status)~1` the `cens.model` is coerced to `"marginal"` and the Kaplan-Meier estimator for the censoring times is used to calculate the weights. If `formula` is `missing`, try to extract a formula from the first element in object.
`data`	A data frame in which to validate the prediction models and to fit the censoring model. If `data` is missing, try to extract a data set from the first element in object.
`eval.times`	A vector of timepoints for evaluating the discriminative ability. At each timepoint the c-index is computed using only those pairs where one of the event times is known to be earlier than this timepoint. If `eval.times` is `missing` then the largest uncensored event time is used.
`pred.times`	A vector of timepoints for evaluating the prediction models. This should either be exactly one timepoint used for all `eval.times`, or be as long as `eval.times`, in which case the predicted order of risk for the jth entry of `eval.times` is based on the jth entry of `pred.times` corresponding
`cause`	For competing risks, the event of interest. Defaults to the first state of the response, which is obtained by evaluating the left hand side of `formula` in `data`.
`lyl`	If `TRUE` rank subjects accoring to predicted life-years-lost (See Andersen due to this cause instead of predicted risk.
`cens.model`	Method for estimating inverse probability of censoring weigths: `cox`: A semi-parametric Cox proportional hazard model is fitted to the censoring times `marginal`: The Kaplan-Meier estimator for the censoring times `nonpar`: Nonparametric extension of the Kaplan-Meier for the censoring times using symmetric nearest neighborhoods – available for arbitrary many strata variables on the right hand side of argument `formula` but at most one continuous variable. See the documentation of the functions `prodlim` and `neighborhood` from the prodlim package. `aalen`: The nonparametric Aalen additive model fitted to the censoring times. Requires the timereg package maintained by Thomas Scheike.
`ipcw.refit`	If `TRUE` the inverse probability of censoring weigths are estimated separately in each training set during cross-validation.
`ipcw.args`	List of arguments passed to function specified by argument `cens.model`.
`ipcw.limit`	Value between 0 and 1 (but no equal to 0!) used to cut for small weights in order to stabilize the estimate at late times were few individuals are observed.
`tiedPredictionsIn`	If `FALSE` pairs with identical predictions are excluded, unless also the event times are identical and uncensored and `tiedMatchIn` is set to `TRUE`.
`tiedOutcomeIn`	If `TRUE` pairs with identical and uncensored event times are excluded, unless also the predictions are identical and `tiedMatchIn` is set to `TRUE`.
`tiedMatchIn`	If `TRUE` then pairs with identical predictions and identical and uncensored event times are counted as concordant pairs.
`splitMethod`	Defines the internal validation design: `none/noPlan`: Assess the models in the give `data`, usually either in the same data where they are fitted, or in independent test data. `BootCv`: Bootstrap cross validation. The prediction models are trained on `B` bootstrap samples, that are either drawn with replacement of the same size as the original data or without replacement from `data` of the size `M`. The models are assessed in the observations that are NOT in the bootstrap sample. `Boot632`: Linear combination of AppCindex and OutOfBagCindex using the constant weight .632.
`B`	Number of bootstrap samples. The default depends on argument `splitMethod`. When `splitMethod` in c("BootCv","Boot632") the default is 100. For `splitMethod="none"` `B` is the number of bootstrap simulations e.g. to obtain bootstrap confidence limits – default is 0.
`M`	The size of the bootstrap samples for resampling without replacement. Ignored for resampling with replacement.
`model.args`	List of extra arguments that can be passed to the `predictSurvProb` methods. The list must have an entry for each entry in `object`.
`model.parms`	Experimental. List of with exactly one entry for each entry in `object`. Each entry names parts of the value of the fitted models that should be extracted and added to the value.
`keep.models`	Logical. If `TRUE` keep the models in object. Since fitted models can be large objects the default is `FALSE`.
`keep.residuals`	Experimental.
`keep.pvalues`	Experimental.
`keep.weights`	Experimental.
`keep.index`	Logical. If `FALSE` remove the bootstrap or cross-validation index from the output list which otherwise is included in the method part of the output list.
`keep.matrix`	Logical. If `TRUE` add all `B` prediction error curves from bootstrapping or cross-validation to the output.
`multiSplitTest`	Experimental.
`testTimes`	A vector of time points for testing differences between models in the time-point specific Brier scores.
`confInt`	Experimental.
`confLevel`	Experimental.
`verbose`	if `TRUE` report details of the progress, e.g. count the steps in cross-validation.
`savePath`	Place in your filesystem (directory) where training models fitted during cross-validation are saved. If `missing` training models are not saved.
`slaveseed`	Vector of seeds, as long as `B`, to be given to the slaves in parallel computing.
`na.action`	Passed immediately to model.frame. Defaults to na.fail. If set otherwise most prediction models will not work.
`...`	Not used.

Details

Pairs with identical observed times, where one is uncensored and one is censored, are always considered usuable (independent of the value of tiedOutcomeIn), as it can be assumed that the event occurs at a later timepoint for the censored observation.

For uncensored response the result equals the one obtained with the functions rcorr.cens and rcorrcens from the Hmisc package (see examples).

Value

Estimates of the C-index.

Author(s)

Thomas A Gerds tag@biostat.ku.dk

References

TA Gerds, MW Kattan, M Schumacher, and C Yu. Estimating a time-dependent concordance index for survival prediction models with covariate dependent censoring. Statistics in Medicine, Ahead of print:to appear, 2013. DOI = 10.1002/sim.5681

Wolbers, M and Koller, MT and Witteman, JCM and Gerds, TA (2013) Concordance for prognostic models with competing risks Research report 13/3. Department of Biostatistics, University of Copenhagen

Andersen, PK (2012) A note on the decomposition of number of life years lost according to causes of death Research report 12/2. Department of Biostatistics, University of Copenhagen

Paul Blanche, Michael W Kattan, and Thomas A Gerds. The c-index is not proper for the evaluation of-year predicted risks. Biostatistics, 20(2): 347–357, 2018.

Examples


 # simulate data based on Weibull regression  
library(prodlim)
 set.seed(13)
 dat <- SimSurv(100)
 # fit three different Cox models and a random survival forest
 # note: low number of trees for the purpose of illustration 
 library(survival)
 cox12 <- coxph(Surv(time,status)~X1+X2,data=dat,x=TRUE,y=TRUE)
 cox1 <- coxph(Surv(time,status)~X1,data=dat,x=TRUE,y=TRUE)
 cox2 <- coxph(Surv(time,status)~X2,data=dat,x=TRUE,y=TRUE)
 #
 # compute the apparent estimate of the C-index at a single time point
 #
A1  <- pec::cindex(list("Cox X1"=cox1),
		  formula=Surv(time,status)~X1+X2,
		  data=dat,
		  eval.times=10)
 #
 # compute the apparent estimate of the C-index at different time points
 #
ApparrentCindex  <- pec::cindex(list("Cox X1"=cox1,
		       "Cox X2"=cox2,
		       "Cox X1+X2"=cox12),
		  formula=Surv(time,status)~X1+X2,
		  data=dat,
		  eval.times=seq(1,15,1))
  print(ApparrentCindex)
  plot(ApparrentCindex)
 #
 # compute the bootstrap-crossvalidation estimate of
 # the C-index at different time points
 #
set.seed(142)
bcvCindex  <- pec::cindex(list("Cox X1"=cox1,
		       "Cox X2"=cox2,
		       "Cox X1+X2"=cox12),
		  formula=Surv(time,status)~X1+X2,
		  data=dat,
                  splitMethod="bootcv",
                  B=5,
 		  eval.times=seq(1,15,1))
  print(bcvCindex)
  plot(bcvCindex)
 # for uncensored data the results are the same
 # as those obtained with the function rcorr.cens from Hmisc

set.seed(16)
dat <- SimSurv(30)
dat$staus=1
fit12 <- coxph(Surv(time,status)~X1+X2,data=dat,x=TRUE,y=TRUE)
fit1 <- coxph(Surv(time,status)~X1,data=dat,x=TRUE,y=TRUE)
fit2 <- coxph(Surv(time,status)~X2,data=dat,x=TRUE,y=TRUE)
Cpec <- pec::cindex(list("Cox X1+X2"=fit12,"Cox X1"=fit1,"Cox X2"=fit2),
	       formula=Surv(time,status)~1,
	       data=dat) 	       
p1 <- predictSurvProb(fit1,newdata=dat,times=10)
p2 <- predictSurvProb(fit2,newdata=dat,times=10)
p12 <- predictSurvProb(fit12,newdata=dat,times=10)
if (requireNamespace("Hmisc",quietly=TRUE)){
library(Hmisc)
harrelC1 <- rcorr.cens(p1,with(dat,Surv(time,status)))
harrelC2 <- rcorr.cens(p2,with(dat,Surv(time,status)))
harrelC12 <- rcorr.cens(p12,with(dat,Surv(time,status)))
harrelC1[["C Index"]]==Cpec$AppCindex[["Cox.X1"]]
harrelC2[["C Index"]]==Cpec$AppCindex[["Cox.X2"]]
harrelC12[["C Index"]]==Cpec$AppCindex[["Cox.X1.X2"]]
}
 #
 # competing risks 
 #
library(riskRegression)
library(prodlim)
set.seed(30)
dcr.learn <- SimCompRisk(30)
dcr.val <- SimCompRisk(30)
pec::cindex(CSC(Hist(time,event)~X1+X2,data=dcr.learn),data=dcr.val)
fit <- CSC(Hist(time,event)~X1+X2,data=dcr.learn)
cif <- predictRisk(fit,newdata=dcr.val,times=3,cause=1)
pec::cindex(list(fit),data=dcr.val,times=3)

pec documentation built on April 11, 2023, 5:55 p.m.