Description
Methods to score the predictive performance of risk markers and risk prediction models
Usage

## S3 method for class 'list'
Score(
object,
formula,
data,
metrics = c("auc", "brier"),
summary = NULL,
plots = NULL,
cause,
times,
landmarks,
use.event.times = FALSE,
null.model = TRUE,
se.fit = TRUE,
conservative = FALSE,
multi.split.test = FALSE,
conf.int = 0.95,
contrasts = TRUE,
probs = c(0, 0.25, 0.5, 0.75, 1),
cens.method = "ipcw",
cens.model = "cox",
split.method,
B,
M,
seed,
trainseeds,
parallel = c("no", "multicore", "snow", "as.registered"),
ncpus = 1,
cl = NULL,
progress.bar = 3,
keep,
predictRisk.args,
debug = 0L,
useEventTimes,
nullModel,
censMethod,
censModel,
splitMethod,
...
)

Arguments

object
List of risk predictions (see details and examples). 
formula
A formula which identifies the outcome (left hand side). E.g., Y ~ 1 for binary outcome and Surv(time, event) ~ 1 or Hist(time, event) ~ 1 for time-to-event outcome. In right censored data, the right hand side of the formula is used to fit the model for the inverse probability of censoring weights.
data
data.frame or data.table in which the formula can be interpreted.
metrics
Character vector specifying which metrics to apply. Case does not matter. Choices are "AUC" and "Brier".
summary
Character vector specifying which summary statistics to apply to the predicted risks. Choices include "ibs" (integrated Brier score, see Examples) and "IPA" (index of prediction accuracy, see Kattan & Gerds 2018). Set to NULL (the default) to skip summary statistics.
plots
Character vector specifying for which plots to put data into the result. Currently implemented are "ROC", "Calibration" and "boxplot" (see Details).
cause
Event of interest. Used for binary outcome and for competing risks to identify the cause of interest.
times 
For survival and competing risks outcome: list of prediction horizons. All times which are greater than the maximal observed time in the data set are automatically removed. Note that the object returned by the function may become huge when the prediction performance is estimated at many prediction horizons. 
landmarks 
Not yet implemented. 
use.event.times
If TRUE, merge the unique event times observed in the data with the prediction horizons given by times.
null.model
If TRUE, fit and score a null model which ignores the covariates and predicts the same risk for all subjects.
se.fit
Logical or 0/1. If FALSE (0), standard errors are not calculated.
conservative
Logical, only relevant in right censored data. If TRUE, ignore the variability of the estimated inverse probability of censoring weights when calculating standard errors.
multi.split.test
Logical or 0/1. If TRUE (1), compute multi-split tests for differences in performance across the cross-validation splits (see van de Wiel et al. 2009).
conf.int
Either logical or a numeric value between 0 and 1. In right censored data, confidence intervals are based on Blanche et al. (see references). Setting FALSE suppresses confidence intervals, TRUE uses the default level 0.95, and a numeric value sets the confidence level directly.
contrasts
Either logical or a list of contrasts. A list of contrasts defines which risk prediction models (markers) should be contrasted with respect to their prediction performance. If TRUE, all possible pairwise comparisons are performed.
probs
Quantiles for retrospective summary statistics of the predicted risks. This affects the result of the function boxplot.Score.
cens.method
Method for dealing with right censored data. Either "ipcw" (inverse probability of censoring weights, the default) or "pseudo" (jackknife pseudo-values).
cens.model
Model for estimating inverse probability of censoring weights. Implemented are the Kaplan-Meier method ("km") and Cox regression ("cox", the default), both applied to the censoring times.
split.method
Method for cross-validation. Choices are "bootcv" (bootstrap cross-validation) and "loob" (leave-one-out bootstrap); see Details and Examples.
B
Number of bootstrap sets for cross-validation.
M
Size of subsamples for bootstrap cross-validation. If specified, it has to be an integer smaller than the number of rows in data.
seed
Super seed for setting training data seeds when randomly splitting (bootstrapping) the data during cross-validation.
trainseeds
Seeds for training models during cross-validation.
parallel
The type of parallel operation to be used (if any). If missing, the default is "no".
ncpus 
integer: number of processes to be used in parallel operation. 
cl
An optional parallel or snow cluster for use if parallel = "snow".
progress.bar
Style for txtProgressBar (see help("txtProgressBar")). Can be 1, 2 or 3; the default is 3.
keep
List of character strings (not case sensitive) which determines additional output to keep in the result.
predictRisk.args
A list of argument lists to control how risks are predicted. The names of the lists should be the S3 classes of the elements of object. A more flexible approach is to write a new predictRisk S3 method. See Details.
debug
Logical (or 0/1). If TRUE, print messages that trace the progress of the program.
useEventTimes
Obsolete; use use.event.times instead.
nullModel
Obsolete; use null.model instead.
censMethod
Obsolete; use cens.method instead.
censModel
Obsolete; use cens.model instead.
splitMethod
Obsolete; use split.method instead.
...
Named list containing additional arguments that are passed on to the predictRisk methods.
Details

The function implements a toolbox for the risk prediction modeller: all tools work for three types of outcome: (1) binary (uncensored), (2) right censored time to event without competing risks, (3) right censored time to event with competing risks.
Computed are the (time-dependent) Brier score and the (time-dependent) area under the ROC curve for a list of risk prediction models, either in external validation data or in the learning data using bootstrap cross-validation. The function optionally provides results for plotting (time-point specific) ROC curves, (time-point specific) calibration curves and (time-point specific) retrospective boxplots.
For uncensored binary outcome, the Delong-Delong test is used to contrast the AUC of rival models. In right censored survival data (with and without competing risks), the p-values correspond to Wald tests based on standard errors obtained with an estimate of the influence function, as described in detail in the appendix of Blanche et al. (2015).
This function works with one or multiple models that predict the risk of an event R(t|X) for a subject characterized by predictors X at time t. With binary endpoints (outcome 0/1 without time component) the risk is simply R(X). In case of a survival object without competing risks the function still works with predicted event probabilities, i.e., R(t|X) = 1 - S(t|X) where S(t|X) is the predicted survival chance for subject X at time t.
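A minimal sketch of this relation (assuming a Cox model fitted to the survival::lung data, which is not part of this documentation): predictRisk returns R(t|X), i.e., one minus the predicted survival chance.

library(survival)
library(riskRegression)
fit <- coxph(Surv(time, status) ~ age + sex, data = lung, x = TRUE, y = TRUE)
## predicted risk R(t|X) at horizon t = 365 days for the first 5 subjects
risk <- predictRisk(fit, newdata = lung[1:5, ], times = 365)
surv <- 1 - risk   # corresponding predicted survival chance S(t|X)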
The already existing predictRisk methods (see methods(predictRisk)) may not cover all models and methods for predicting risks. But users can quickly extend the package as explained in detail in Mogensen et al. (2012) for the predecessors pec::predictSurvProb and pec::predictEventProb, which have been unified as riskRegression::predictRisk.
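A hedged sketch of such an extension (the class name "myclass" and the internal predict() call are assumptions for illustration, not part of the package): a predictRisk method has to return a matrix with one row per subject in newdata and one column per prediction horizon in times.

## Hypothetical extension for a model of class "myclass" whose predict()
## method is assumed to return event probabilities at a single horizon.
predictRisk.myclass <- function(object, newdata, times, cause, ...) {
    p <- sapply(times, function(t) {
        predict(object, newdata = newdata, horizon = t)  # assumed interface
    })
    ## one row per subject, one column per prediction horizon
    matrix(p, nrow = NROW(newdata), ncol = length(times))
}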
Bootstrap cross-validation (see also Gerds & Schumacher 2007 and Mogensen et al. 2012)
B=10, M not specified (i.e., M=NROW(data)):
Each of the models is trained in each of 10 bootstrap data sets (learning data sets). Learning data sets are obtained by sampling NROW(data) subjects of the data set with replacement. There are roughly .632*NROW(data) subjects in the learning data (in-bag) and .368*NROW(data) subjects not in the learning data (out-of-bag); the latter form the validation data sets which are used to estimate the scores: AUC, Brier, etc. Reported are averages across the 10 splits.
## Bootstrap with replacement
set.seed(13)
N=17
data = data.frame(id=1:N, y=rbinom(N,1,.3), x=rnorm(N))
boot.index = sample(1:N, size=N, replace=TRUE)
boot.index
inbag = 1:N %in% boot.index      # subjects drawn at least once
outofbag = !inbag                # subjects never drawn
learn.data = data[boot.index,]   # learning (in-bag) data
val.data = data[outofbag,]       # validation (out-of-bag) data
riskRegression:::getSplitMethod("bootcv",B=10,N=17)$index
NOTE: the number .632 is the expected probability to draw any given subject (for example subject 1) at least once when sampling with replacement from the data; it is nearly independent of the sample size, as the following simulation illustrates:
B=10000
N=137
## proportion of bootstrap samples in which subject 1 is drawn at least once
mean(sapply(1:B, function(b){1 %in% sample(1:N,size=N,replace=TRUE)}))
N=30
mean(sapply(1:B, function(b){1 %in% sample(1:N,size=N,replace=TRUE)}))
N=300
mean(sapply(1:B, function(b){1 %in% sample(1:N,size=N,replace=TRUE)}))
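For comparison, the exact inclusion probability is 1 - (1 - 1/N)^N, which tends to 1 - exp(-1) as N grows:

sapply(c(30, 137, 300), function(N) 1 - (1 - 1/N)^N)
1 - exp(-1)   # limit for large N: 0.6321206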
## Bootstrap without replacement (training size set to be 70 percent of data), B=10, M=.7
Each of the models is trained in each of 10 learning data sets. Learning data sets are obtained by sampling round(.7*NROW(data)) subjects of the data set without replacement. There are NROW(data) - round(.7*NROW(data)) subjects not in the learning data sets; these form the validation data sets which are used to estimate the scores: AUC, Brier, etc. Reported are averages across the 10 splits.
set.seed(13)
N=17
M=round(.7*N)   # training size: 70 percent of the data
data = data.frame(id=1:N, y=rbinom(N,1,.3), x=rnorm(N))
boot.index = sample(1:N, size=M, replace=FALSE)
boot.index
inbag = 1:N %in% boot.index
outofbag = !inbag
learn.data = data[boot.index,]   # learning data
val.data = data[outofbag,]       # validation data
riskRegression:::getSplitMethod("bootcv",B=10,N=17,M=.7)$index
Value

List with scores and assessments of contrasts, i.e., tests and confidence limits for performance and difference in performance (AUC and Brier), summaries and plots. Most elements are in data.table format.
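A minimal sketch of how the result can be inspected (the element names assume the default metrics AUC and Brier; data and models are simulated for illustration):

library(riskRegression)
set.seed(8)
d <- sampleData(100, outcome="binary")
fit1 <- glm(Y ~ X1 + X2, data=d, family=binomial)
fit2 <- glm(Y ~ X3, data=d, family=binomial)
x <- Score(list("LR1"=fit1, "LR2"=fit2), formula=Y~1, data=d)
x$AUC$score       # data.table with AUC estimates per model
x$Brier$score     # data.table with Brier scores (includes the null model)
x$AUC$contrasts   # data.table with differences in AUC between models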
Author(s)

Thomas A. Gerds <tag@biostat.ku.dk> and Paul Blanche <paul.blanche@univ-ubs.fr>
References

Ulla B. Mogensen, Hemant Ishwaran, Thomas A. Gerds (2012). Evaluating Random Forests for Survival Analysis Using Prediction Error Curves. Journal of Statistical Software, 50(11), 1-23. URL http://www.jstatsoft.org/v50/i11/.

Paul Blanche, Cecile Proust-Lima, Lucie Loubere, Claudine Berr, Jean-Francois Dartigues, and Helene Jacqmin-Gadda. Quantifying and comparing dynamic predictive accuracy of joint models for longitudinal marker and time-to-event in presence of censoring and competing risks. Biometrics, 71(1):102-113, 2015.

P. Blanche, J-F. Dartigues, and H. Jacqmin-Gadda. Estimating and comparing time-dependent areas under receiver operating characteristic curves for censored event times with competing risks. Statistics in Medicine, 32(30):5381-5397, 2013.

E. Graf et al. (1999). Assessment and comparison of prognostic classification schemes for survival data. Statistics in Medicine, 18:2529-2545.

Efron, Tibshirani (1997). Improvements on Cross-Validation: The .632+ Bootstrap Method. Journal of the American Statistical Association, 92:548-560.

Gerds, Schumacher (2006). Consistent estimation of the expected Brier score in general survival models with right-censored event times. Biometrical Journal, 48:1029-1040.

Thomas A. Gerds, Martin Schumacher (2007). Efron-Type Measures of Prediction Error for Survival Analysis. Biometrics, 63(4):1283-1287. doi:10.1111/j.1541-0420.2007.00832.x

Martin Schumacher, Harald Binder, and Thomas Gerds. Assessment of survival prediction models based on microarray data. Bioinformatics, 23(14):1768-74, 2007.

Mark A. van de Wiel, Johannes Berkhof, and Wessel N. van Wieringen. Testing the prediction error difference between 2 predictors. Biostatistics, 10(3):550-560, 2009. doi:10.1093/biostatistics/kxp011

Michael W. Kattan and Thomas A. Gerds. The index of prediction accuracy: an intuitive measure useful for evaluating risk prediction models. Diagnostic and Prognostic Research, 2(1):7, 2018.
Examples

# binary outcome
library(lava)
set.seed(18)
learndat <- sampleData(48,outcome="binary")
testdat <- sampleData(40,outcome="binary")
## score logistic regression models
lr1 = glm(Y~X1+X2+X7+X9,data=learndat,family=binomial)
lr2 = glm(Y~X3+X5,data=learndat,family=binomial)
Score(list("LR(X1+X2+X7+X9)"=lr1,"LR(X3+X5)"=lr2),formula=Y~1,data=testdat)
## ROC curve and calibration plot
xb=Score(list("LR(X1+X2+X7+X9)"=lr1,"LR(X3+X5)"=lr2),formula=Y~1,
         data=testdat,plots=c("calibration","ROC"))
## Not run: plotROC(xb)
plotCalibration(xb)
## End(Not run)
## compute AUC for a list of continuous markers
markers = as.list(testdat[,.(X6,X7,X8,X9,X10)])
Score(markers,formula=Y~1,data=testdat,metrics=c("auc"))
# cross-validation
## Not run:
learndat=sampleData(400,outcome="binary")
lr1a = glm(Y~X6,data=learndat,family=binomial)
lr2a = glm(Y~X7+X8+X9,data=learndat,family=binomial)
## bootstrap crossvalidation
x1=Score(list("LR1"=lr1a,"LR2"=lr2a),formula=Y~1,data=learndat,split.method="bootcv",B=100)
x1
## leave-one-out and leave-pair-out bootstrap
x2=Score(list("LR1"=lr1a,"LR2"=lr2a),formula=Y~1,data=learndat,
split.method="loob",
B=100,plots="calibration")
x2
## End(Not run)
# survival outcome
# Score Cox regression models
## Not run: library(survival)
library(rms)
library(prodlim)
set.seed(18)
trainSurv <- sampleData(100,outcome="survival")
testSurv <- sampleData(40,outcome="survival")
cox1 = coxph(Surv(time,event)~X1+X2+X7+X9,data=trainSurv, y=TRUE, x = TRUE)
cox2 = coxph(Surv(time,event)~X3+X5+X6,data=trainSurv, y=TRUE, x = TRUE)
xs=Score(list("Cox(X1+X2+X7+X9)"=cox1,"Cox(X3+X5+X6)"=cox2),
formula=Surv(time,event)~1,data=testSurv,conf.int=FALSE,times=c(5,8))
xs
## End(Not run)
# Integrated Brier score
## Not run:
xs=Score(list("Cox(X1+X2+X7+X9)"=cox1,"Cox(X3+X5+X6)"=cox2),
formula=Surv(time,event)~1,data=testSurv,conf.int=FALSE,
summary="ibs",
times=sort(unique(testSurv$time)))
## End(Not run)
# time-dependent AUC for a list of markers
## Not run: survmarkers = as.list(testSurv[,.(X6,X7,X8,X9,X10)])
Score(survmarkers,
formula=Surv(time,event)~1,metrics="auc",data=testSurv,
conf.int=TRUE,times=c(5,8))
# compare models on test data
Score(list("Cox(X1+X2+X7+X9)"=cox1,"Cox(X3+X5+X6)"=cox2),
formula=Surv(time,event)~1,data=testSurv,conf.int=TRUE,times=c(5,8))
## End(Not run)
# cross-validate models in the training data
## Not run:
library(survival)
set.seed(18)
trainSurv <- sampleData(400,outcome="survival")
cox1 = coxph(Surv(time,event)~X1+X2+X7+X9,data=trainSurv, y=TRUE, x = TRUE)
cox2 = coxph(Surv(time,event)~X3+X5+X6,data=trainSurv, y=TRUE, x = TRUE)
x1 = Score(list("Cox(X1+X2+X7+X9)"=cox1,"Cox(X3+X5+X6)"=cox2),
formula=Surv(time,event)~1,data=trainSurv,conf.int=TRUE,times=c(5,8),
split.method="loob",B=100,plots="calibration")
x2= Score(list("Cox(X1+X2+X7+X9)"=cox1,"Cox(X3+X5+X6)"=cox2),
formula=Surv(time,event)~1,data=trainSurv,conf.int=TRUE,times=c(5,8),
split.method="bootcv",B=100)
## End(Not run)
# restrict number of comparisons
## Not run:
Score(list("Cox(X1+X2+X7+X9)"=cox1,"Cox(X3+X5+X6)"=cox2),
formula=Surv(time,event)~1,data=trainSurv,contrasts=TRUE,
null.model=FALSE,conf.int=TRUE,times=c(5,8),split.method="bootcv",B=3)
# competing risks outcome
set.seed(18)
trainCR <- sampleData(40,outcome="competing.risks")
testCR <- sampleData(40,outcome="competing.risks")
library(riskRegression)
library(cmprsk)
# Cause-specific Cox regression
csc1 = CSC(Hist(time,event)~X1+X2+X7+X9,data=trainCR)
csc2 = CSC(Hist(time,event)~X3+X5+X6,data=trainCR)
# Fine-Gray regression
fgr1 = FGR(Hist(time,event)~X1+X2+X7+X9,data=trainCR,cause=1)
fgr2 = FGR(Hist(time,event)~X3+X5+X6,data=trainCR,cause=1)
Score(list("CSC(X1+X2+X7+X9)"=csc1,"CSC(X3+X5+X6)"=csc2,
"FGR(X1+X2+X7+X9)"=fgr1,"FGR(X3+X5+X6)"=fgr2),
formula=Hist(time,event)~1,data=testCR,se.fit=1L,times=c(5,8))
## End(Not run)
## Not run:
# reproduce some results of Table IV of Blanche et al. Stat Med 2013
data(Paquid)
ResPaquid <- Score(list("DSST"=Paquid$DSST,"MMSE"=Paquid$MMSE),
formula=Hist(time,status)~1,
data=Paquid,
null.model = FALSE,
conf.int=TRUE,
metrics=c("auc"),
times=c(3,5,10),
plots="ROC")
ResPaquid
plotROC(ResPaquid,time=5)
## End(Not run)
## Not run:
# parallel options
# by erikvona: Here is a generic example of using future
# and doFuture, works great with the current version:
library(riskRegression)
library(future)
library(foreach)
library(doFuture)
library(survival)
# Register all available cores for parallel operation
# (note: plan(multiprocess) is deprecated in recent versions of future;
# multisession is the documented replacement)
plan(multisession, workers = availableCores())
registerDoFuture()
trainSurv <- sampleData(400,outcome="survival")
cox1 = coxph(Surv(time,event)~X1+X2+X7+X9,data=trainSurv,
y=TRUE, x = TRUE)
# Bootstrapping on multiple cores
x1 = Score(list("Cox(X1+X2+X7+X9)"=cox1),
formula=Surv(time,event)~1,data=trainSurv, times=c(5,8),
parallel = "as.registered", split.method="bootcv",B=100)
## End(Not run)
