intPredErr: Integrated Prediction Error

intPredErr R Documentation

Integrated Prediction Error

Description

Computes the integrated prediction error for discrete survival models.

Usage

intPredErr(
  hazards,
  testTime,
  testEvent,
  trainTime,
  trainEvent,
  testObjLong,
  tmax = NULL
)

Arguments

hazards

Predicted discrete hazards in the test data in long format (class "numeric"), with one value per entry of testObjLong.

testTime

Discrete time intervals in short format of the test set (class "integer").

testEvent

Events in short format in the test set (binary vector).

trainTime

Discrete time intervals in short format of the training data set (class "integer").

trainEvent

Events in short format in the training set (binary vector).

testObjLong

Identification numbers of the independent observations in the test data in long format (integer vector). In medicine, for example, these would be patient IDs. Each patient should have a unique identifier.

tmax

Maximum time interval for which prediction errors are calculated (class "integer"). It must be smaller than the maximum observed time in the training data. The default NULL means that all observed intervals are used.

Value

Integrated prediction error (class "numeric").
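
To give an intuition for the returned value, the sketch below computes a simplified integrated prediction error in base R. It assumes that there is no censoring, so it omits the censoring weights a full implementation would need, and it weights each interval by the empirical probability of an event in that interval. It is an illustration under these assumptions, not the code used by intPredErr, and all names (toyTime, toySurv, predErr, intErrSimp) are hypothetical.

```r
# Hypothetical fully observed event times for four persons (no censoring)
toyTime <- c(1L, 2L, 2L, 3L)
tGrid   <- 1:3

# Hypothetical predicted survival probabilities S_i(t):
# rows = persons, columns = intervals in tGrid
toySurv <- rbind(
  c(0.4, 0.2, 0.1),
  c(0.8, 0.3, 0.1),
  c(0.7, 0.4, 0.2),
  c(0.9, 0.7, 0.3)
)

# Observed survival status I(T_i > t) on the same grid
obsSurv <- outer(toyTime, tGrid, FUN = ">") * 1

# Prediction error per interval: mean squared difference (Brier-type score)
predErr <- colMeans((obsSurv - toySurv)^2)

# Weight each interval by the empirical probability P(T = t) and sum up
margProb   <- as.numeric(table(factor(toyTime, levels = tGrid))) / length(toyTime)
intErrSimp <- sum(predErr * margProb)
intErrSimp
```

Smaller values indicate predicted survival curves that lie closer to the observed survival status.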

Author(s)

Thomas Welchowski t.welchowski@psychologie.uzh.ch

References

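Gneiting T, Raftery AE (2007). "Strictly Proper Scoring Rules, Prediction, and Estimation." Journal of the American Statistical Association, 102(477), 359-378.

Tutz G, Schmid M (2016). Modeling Discrete Time-to-Event Data. Springer Series in Statistics.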

See Also

predErrCurve, aggregate

Examples


##########################
# Example with cancer data

library(discSurv)
library(survival)
head(cancer)

# Data preparation and conversion to 30 intervals
cancerPrep <- cancer
cancerPrep$status <- cancerPrep$status-1
intLim <- quantile(cancerPrep$time, prob = seq(0, 1, length.out = 30))
intLim [length(intLim)] <- intLim [length(intLim)] + 1

# Cut continuous time into a smaller number of discrete intervals
cancerPrep <- contToDisc(dataShort = cancerPrep, timeColumn = "time", intervalLimits = intLim)

# Generate training and test data
set.seed(753)
TrainIndices <- sample (x = 1:dim(cancerPrep) [1], size = dim(cancerPrep) [1] * 0.75)
TrainCancer <- cancerPrep [TrainIndices, ]
TestCancer <- cancerPrep [-TrainIndices, ]
TrainCancer$timeDisc <- as.numeric(as.character(TrainCancer$timeDisc))
TestCancer$timeDisc <- as.numeric(as.character(TestCancer$timeDisc))

# Convert to long format
LongTrain <- dataLong(dataShort = TrainCancer, timeColumn = "timeDisc", eventColumn = "status",
                      timeAsFactor = FALSE)
LongTest <- dataLong(dataShort = TestCancer, timeColumn = "timeDisc", eventColumn = "status",
                     timeAsFactor = FALSE)
# Convert factors
LongTrain$timeInt <- as.numeric(as.character(LongTrain$timeInt))
LongTest$timeInt <- as.numeric(as.character(LongTest$timeInt))
LongTrain$sex <- factor(LongTrain$sex)
LongTest$sex <- factor(LongTest$sex)

# Estimate, for example, a generalized additive model in discrete survival analysis
library(mgcv)
gamFit <- gam(formula = y ~ s(timeInt) + s(age) + sex + ph.ecog, data = LongTrain,
              family = binomial())
summary(gamFit)

# 1. Specification of predicted discrete hazards
# Predict the discrete hazard for each person-interval row of the test data
testPredHaz <- predict(gamFit, newdata = LongTest, type = "response")

# 2. Calculate integrated prediction error
intPredErr(hazards = testPredHaz,
           testTime = TestCancer$timeDisc, testEvent = TestCancer$status,
           trainTime = TrainCancer$timeDisc, trainEvent = TrainCancer$status,
           testObjLong = LongTest$obj)
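
The call above relies on hazards and testObjLong being in long format, with one entry per test observation and interval at risk. A minimal base R sketch of that layout, using made-up toy values (shortTime, shortEvent and toyHaz are hypothetical names, not part of discSurv), might look as follows. It also shows how per-person survival probabilities follow from the hazards as cumulative products of 1 - hazard.

```r
# Toy short-format test data (hypothetical values, one entry per person)
shortTime  <- c(3L, 1L, 2L)   # last observed discrete interval
shortEvent <- c(1L, 0L, 1L)   # 1 = event, 0 = censored

# Long format: person i contributes one row for each interval 1..shortTime[i]
obj     <- rep(seq_along(shortTime), times = shortTime)
timeInt <- sequence(shortTime)
y       <- as.integer(timeInt == shortTime[obj] & shortEvent[obj] == 1L)

# Hypothetical predicted hazards, one per long-format row
toyHaz <- c(0.1, 0.2, 0.5, 0.3, 0.2, 0.4)
stopifnot(length(toyHaz) == length(obj))

# Survival S(t) = prod_{s <= t} (1 - hazard_s), computed within each person
toySurv <- ave(1 - toyHaz, obj, FUN = cumprod)

data.frame(obj, timeInt, y, hazard = toyHaz, survival = toySurv)
```

In the cancer example, LongTest$obj plays the role of obj and testPredHaz the role of toyHaz, which is why both are taken from the same long-format data set.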


discSurv documentation built on April 29, 2026, 9:07 a.m.