intPredErr | R Documentation |
Computes the integrated prediction error curve for discrete survival models.
intPredErr( hazards, testTime, testEvent, trainTime, trainEvent, testDataLong, tmax = NULL )
hazards | Predicted discrete hazards in the test data ("numeric vector"). |
testTime | Discrete time intervals in short format of the test set ("integer vector"). |
testEvent | Events in short format in the test set ("binary vector"). |
trainTime | Discrete time intervals in short format of the training set ("integer vector"). |
trainEvent | Events in short format in the training set ("binary vector"). |
testDataLong | Test data in long format (class "data.frame"). The discrete survival function is calculated from the predicted hazards. The data are assumed to have been preprocessed with a function whose name has the prefix "dataLong", such as dataLong, which is used in the example. |
tmax | Maximum time interval for which prediction errors are calculated ("integer vector"). It must be smaller than the maximum observed time in the training data. The default, NULL, means that all observed intervals are used. |
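The link between the short-format arguments, the long-format data and the survival function derived from hazards can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's internal code: the toy data and the hazard values are made up, and the column names obj, timeInt and y are chosen to mirror those in the example and should be treated as assumptions. The only fact relied on is the standard identity S(t) = prod_{s <= t} (1 - hazard(s)).

```r
# Toy short-format data: one row per subject (hypothetical values)
shortData <- data.frame(obj = 1:3,
                        timeDisc = c(2, 3, 1),
                        status = c(1, 0, 1))

# Long format: one row per subject and interval at risk
longData <- data.frame(
  obj = rep(shortData$obj, times = shortData$timeDisc),
  timeInt = sequence(shortData$timeDisc)
)

# Binary response: 1 only in the last interval of a subject with an event
lastInt <- longData$timeInt == shortData$timeDisc[longData$obj]
longData$y <- as.integer(lastInt & shortData$status[longData$obj] == 1)

# Hypothetical predicted hazards, one per long-format row
longData$hazard <- c(0.10, 0.20, 0.10, 0.20, 0.30, 0.10)

# Discrete survival function: cumulative product of (1 - hazard),
# computed separately within each subject
longData$surv <- ave(1 - longData$hazard, longData$obj, FUN = cumprod)

print(longData)
```

The hazards vector passed to intPredErr has one entry per row of the long-format test data, which is why the example predicts on the long-format test set rather than on the short-format one.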
Integrated prediction error ("numeric vector").
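To give a rough idea of what "integrated" can mean here, the sketch below collapses a prediction error curve into a single number. It assumes that the integrated error is a weighted sum of the per-interval prediction errors, with weights estimating the marginal probability P(T = t). The weighting actually used by intPredErr, in particular how censoring is handled, may differ. All numbers and the names predErr, trainTimeToy and margProb are hypothetical.

```r
# Hypothetical prediction error for intervals 1 to 4
predErr <- c(0.05, 0.10, 0.15, 0.20)

# Hypothetical discrete training times (assumed uncensored for simplicity)
trainTimeToy <- c(1, 1, 2, 2, 2, 3, 4, 4, 4, 4)

# Naive estimate of P(T = t) by relative frequencies; ignores censoring
margProb <- as.numeric(
  prop.table(table(factor(trainTimeToy, levels = seq_along(predErr))))
)

# Weighted sum of the per-interval errors
intErr <- sum(predErr * margProb)
print(intErr)
```

Under this assumption, intervals in which events are more likely contribute more to the summary value.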
Thomas Welchowski welchow@imbie.meb.uni-bonn.de
\insertRef{tutzModelDisc}{discSurv}
\insertRef{gneitingPropScore}{discSurv}
predErrCurve, aggregate
##########################
# Example with cancer data

library(survival)
library(discSurv) # provides contToDisc, dataLong and intPredErr
head(cancer)

# Data preparation and conversion to 30 intervals
cancerPrep <- cancer
cancerPrep$status <- cancerPrep$status - 1
intLim <- quantile(cancerPrep$time, prob = seq(0, 1, length.out = 30))
intLim[length(intLim)] <- intLim[length(intLim)] + 1

# Cut discrete time in smaller number of intervals
cancerPrep <- contToDisc(dataShort = cancerPrep, timeColumn = "time",
                         intervalLimits = intLim)

# Generate training and test data
set.seed(753)
TrainIndices <- sample(x = 1:dim(cancerPrep)[1],
                       size = dim(cancerPrep)[1] * 0.75)
TrainCancer <- cancerPrep[TrainIndices, ]
TestCancer <- cancerPrep[-TrainIndices, ]
TrainCancer$timeDisc <- as.numeric(as.character(TrainCancer$timeDisc))
TestCancer$timeDisc <- as.numeric(as.character(TestCancer$timeDisc))

# Convert to long format
LongTrain <- dataLong(dataShort = TrainCancer, timeColumn = "timeDisc",
                      eventColumn = "status", timeAsFactor = FALSE)
LongTest <- dataLong(dataShort = TestCancer, timeColumn = "timeDisc",
                     eventColumn = "status", timeAsFactor = FALSE)

# Convert factors
LongTrain$timeInt <- as.numeric(as.character(LongTrain$timeInt))
LongTest$timeInt <- as.numeric(as.character(LongTest$timeInt))
LongTrain$sex <- factor(LongTrain$sex)
LongTest$sex <- factor(LongTest$sex)

# Estimate, for example, a generalized additive model in discrete survival analysis
library(mgcv)
gamFit <- gam(formula = y ~ s(timeInt) + s(age) + sex + ph.ecog,
              data = LongTrain, family = binomial())
summary(gamFit)

# 1. Specification of predicted discrete hazards
# Predict the discrete hazard for each row of the long-format test data
testPredHaz <- predict(gamFit, newdata = LongTest, type = "response")

# 2. Calculate integrated prediction error
intPredErr(hazards = testPredHaz,
           testTime = TestCancer$timeDisc, testEvent = TestCancer$status,
           trainTime = TrainCancer$timeDisc, trainEvent = TrainCancer$status,
           testDataLong = LongTest)