predErrCurve: Prediction Error Curve

print.discSurvPredErrDisc R Documentation

Prediction Error Curve

Description

Estimates prediction error curves for arbitrary discrete survival prediction models. A prediction error curve compares the estimated and observed survival functions at the given time points, adjusted by weights.

Usage

## S3 method for class 'discSurvPredErrDisc'
print(x, ...)

predErrCurve(
  hazards,
  timepoints,
  testTime,
  testEvent,
  trainTime,
  trainEvent,
  testObjLong
)

## S3 method for class 'discSurvPredErrDisc'
plot(x, ...)

Arguments

x

Object of class "discSurvPredErrDisc"

...

Additional arguments passed to the plot function.

hazards

Predicted discrete hazards in the test data (class "numeric").

timepoints

Vector of discrete time intervals on which the prediction error curve is calculated (class "integer").

testTime

Discrete survival times in the test data (class "numeric").

testEvent

Univariate event indicator in the test data (binary vector).

trainTime

Discrete survival times in the training data (class "numeric").

trainEvent

Univariate event indicator in the training data (class "integer").

testObjLong

Identification numbers of the independent observations in the test data in long format (integer vector). In a medical study, for example, these would be patient IDs. Each observation should have a unique identifier.
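The relationship between long-format hazards and observation IDs can be illustrated with a small base R sketch. It assumes the usual discrete-time identity S(t) = prod over s <= t of (1 - hazard at s); the toy vectors obj, timeInt and hazards are hypothetical, and the code is not taken from the package internals.

```r
# Hypothetical toy data in long format: one row per observation and interval.
obj     <- c(1, 1, 1, 2, 2)            # observation IDs (the role of testObjLong)
timeInt <- c(1, 2, 3, 1, 2)            # discrete interval of each row
hazards <- c(0.1, 0.2, 0.3, 0.5, 0.5)  # predicted hazard of each row

# hazards and obj must have the same length so rows can be grouped by ID
stopifnot(length(hazards) == length(obj))

# Survival function per observation: cumulative product of (1 - hazard)
survByObj <- tapply(hazards, obj, function(h) cumprod(1 - h))

survByObj[["1"]]  # 0.9, 0.72, 0.504
survByObj[["2"]]  # 0.5, 0.25
```

In the Examples section the same pairing appears as hazards = predHaz (predicted on LongTest) together with testObjLong = LongTest$obj.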

Details

Prediction error values should stay below 0.25 at all time points, because 0.25 is the error of an uninformative prediction (a predicted survival probability of 0.5 for every observation), which is equivalent to random assignment.
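The 0.25 benchmark can be checked with a short base R sketch. It assumes a Brier-type score, i.e. a weighted mean squared difference between predicted and observed survival status at one time point, and uses hypothetical uncensored toy data so that every weight equals 1. It is a simplified illustration, not the estimator implemented in predErrCurve.

```r
# Hypothetical uncensored toy data, so every weight equals 1.
obsTime <- c(1, 2, 3, 3, 5)            # observed discrete event times
t0      <- 2                           # evaluation time point
obsSurv <- as.numeric(obsTime > t0)    # observed survival status at t0: 0 0 1 1 1
weights <- rep(1, length(obsTime))

# Weighted mean squared difference between predicted and observed survival
predErrAt <- function(predSurv) mean(weights * (predSurv - obsSurv)^2)

predErrAt(rep(0.5, 5))                 # uninformative prediction: 0.25
predErrAt(c(0.1, 0.2, 0.8, 0.9, 0.9))  # informative prediction: about 0.022
```

Because (0.5 - 0)^2 and (0.5 - 1)^2 both equal 0.25, a constant prediction of 0.5 yields 0.25 regardless of the observed outcomes.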

Value

  • List with objects:

    • Output: List with two components

      • predErr: Numeric vector with estimated prediction error values. Names give the evaluation time point.

      • weights: List of weights used in the estimation. Each list component gives the weights of a person in the test data.

    • Input: A list of given argument input values (saved for reference)
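A minimal sketch of how the returned structure might be inspected follows. The object res is a hypothetical stand-in built by hand to mirror the documented components; its values are made up, and the element name timepoints inside Input is an assumption.

```r
# Hypothetical stand-in mirroring the documented structure (values are made up)
res <- list(
  Output = list(
    predErr = c("2" = 0.12, "3" = 0.15, "4" = 0.17),  # named by time point
    weights = list(c(1, 1, 1), c(1, 1.2, 0))          # one component per person
  ),
  Input = list(timepoints = 2:4)                      # assumed element name
)

pe <- res$Output$predErr
as.integer(names(pe))   # evaluation time points stored in the names
pe[pe >= 0.25]          # time points at or above the 0.25 benchmark (none here)
mean(pe)                # simple average over the evaluated time points
```

With a real result from predErrCurve, the same accessors would apply to the object in place of res.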

Author(s)

Thomas Welchowski t.welchowski@psychologie.uzh.ch

References

\insertRef{gerdsConsisEst}{discSurv}

\insertRef{laanUniCensor}{discSurv}

See Also

intPredErr, predErrCurveCompRisks

Examples


# Example with a training/test split of the unemployment data
library(discSurv)  # provides dataLong() and predErrCurve()
library(Ecdat)
library(mgcv)
data(UnempDur)
summary(UnempDur$spell)

# Extract subset of data
set.seed(635)
IDsample <- sample(1:dim(UnempDur)[1], 100)
UnempDurSubset <- UnempDur[IDsample, ]
head(UnempDurSubset)
range(UnempDurSubset$spell)

# Generate training and test data
set.seed(7550)
TrainIndices <- sample(x = 1:dim(UnempDurSubset)[1], size = 75)
TrainUnempDur <- UnempDurSubset[TrainIndices, ]
TestUnempDur <- UnempDurSubset[-TrainIndices, ]

# Convert to long format
LongTrain <- dataLong(dataShort = TrainUnempDur, timeColumn = "spell", eventColumn = "censor1")
LongTest <- dataLong(dataShort = TestUnempDur, timeColumn = "spell", eventColumn = "censor1")
# Convert factor to numeric for smoothing
LongTrain$timeInt <- as.numeric(as.character(LongTrain$timeInt))
LongTest$timeInt <- as.numeric(as.character(LongTest$timeInt))

######################################################################
# Estimate a generalized additive model in discrete survival analysis

gamFit <- gam(formula = y ~ s(timeInt) + age + logwage, data = LongTrain, family = binomial())

# Predict hazard rates on test data
predHaz <- predict(gamFit, newdata = LongTest, type = "response")

# Prediction error in first interval
tryPredErrDisc1 <- predErrCurve(hazards = predHaz, timepoints = 1,
  testTime = TestUnempDur$spell,
  testEvent = TestUnempDur$censor1, trainTime = TrainUnempDur$spell,
  trainEvent = TrainUnempDur$censor1, testObjLong = LongTest$obj)
tryPredErrDisc1

# Prediction error in the 2nd to 10th intervals
tryPredErrDisc2 <- predErrCurve(hazards = predHaz, timepoints = 2:10,
  testTime = TestUnempDur$spell,
  testEvent = TestUnempDur$censor1, trainTime = TrainUnempDur$spell,
  trainEvent = TrainUnempDur$censor1, testObjLong = LongTest$obj)
tryPredErrDisc2
plot(tryPredErrDisc2)

########################################
# Fit a random discrete survival forest

library(ranger)
LongTrainRF <- LongTrain
LongTrainRF$y <- factor(LongTrainRF$y)
rfFit <- ranger(formula = y ~ timeInt + age + logwage, data = LongTrainRF,
  probability = TRUE)

# Predict hazards on test data
# (second column holds the probability of the second factor level, y = 1)
predHaz <- predict(rfFit, data = LongTest)$predictions[, 2]

# Prediction error in first interval
tryPredErrDisc1 <- predErrCurve(hazards = predHaz, timepoints = 1,
  testTime = TestUnempDur$spell,
  testEvent = TestUnempDur$censor1, trainTime = TrainUnempDur$spell,
  trainEvent = TrainUnempDur$censor1, testObjLong = LongTest$obj)
tryPredErrDisc1

# Prediction error in the 2nd to 10th intervals
tryPredErrDisc2 <- predErrCurve(hazards = predHaz, timepoints = 2:10,
  testTime = TestUnempDur$spell,
  testEvent = TestUnempDur$censor1, trainTime = TrainUnempDur$spell,
  trainEvent = TrainUnempDur$censor1, testObjLong = LongTest$obj)
tryPredErrDisc2
plot(tryPredErrDisc2)


discSurv documentation built on April 29, 2026, 9:07 a.m.