calPlot: Calibration Plot
In discSurv: Discrete Time Survival Analysis

Description

Generates a plot to visually assess the calibration of discrete survival models. Estimated hazard rates are grouped by their empirical quantiles and compared with the observed hazard rates in each group. For a well-calibrated model, the weighted mean of observed events in each group should match the weighted mean of estimated hazards.
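
The grouping idea described above can be sketched in base R without discSurv. This is only an illustration on simulated data, not the internal implementation of `calPlot`; the objects `estHaz`, `obsEvent`, `w` and the helper `wMean` are hypothetical names introduced for this sketch, and the number of groups is fixed at 10 rather than chosen by a heuristic.

```r
set.seed(1)
n <- 2000

# Simulated (hypothetical) estimated hazards and binary event indicators.
# Events are drawn with probability equal to the estimated hazard,
# so the "model" is well calibrated by construction.
estHaz <- runif(n, min = 0.01, max = 0.4)
obsEvent <- rbinom(n, size = 1, prob = estHaz)
w <- rep(1, n)  # equal weights of one for each observation

# Partition the estimated hazards into groups by empirical quantiles
nGroups <- 10
breaks <- quantile(estHaz, probs = seq(0, 1, length.out = nGroups + 1))
grp <- cut(estHaz, breaks = breaks, include.lowest = TRUE)

# Weighted mean of estimated hazards and of observed events per group
wMean <- function(x, w) sum(w * x) / sum(w)
idx <- split(seq_len(n), grp)
predMean <- sapply(idx, function(i) wMean(estHaz[i], w[i]))
obsMean <- sapply(idx, function(i) wMean(obsEvent[i], w[i]))

# Points close to the diagonal indicate good calibration
plot(predMean, obsMean,
     xlab = "Mean estimated hazard", ylab = "Observed event rate")
abline(0, 1, lty = 2)
```

Because the events are simulated from the estimated hazards themselves, the points should scatter closely around the dashed diagonal.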

Usage

calPlot(
  estHazards,
  testDataLong,
  nGroups = "auto",
  weights = NULL,
  laplaceSmoothPrior = 0,
  ...
)

Arguments

`estHazards`	Estimated hazards for all observations of the data set in long format (class "numeric"). In the case of cause-specific competing risks, each column corresponds to the estimated hazards for one cause (class c("matrix", "array")).
`testDataLong`	Test data in long format used to assess calibration (class "data.frame").
`nGroups`	Number of groups for partitioning of estimated hazards (class "character" or class "numeric"). Default value "auto" uses a heuristic to determine the number of groups.
`weights`	Optional weights for each observation in long format (class "integer"). Default value corresponds to equal weights of one for each observation.
`laplaceSmoothPrior`	Specifies the prior weight used for additive Laplace smoothing of the estimated hazards (class "numeric"). A prior Bernoulli distribution with probability 1/2 is assumed to smooth both the estimated hazards in argument estHazards and the observed events obsEvents. The smoothed hazard rates lie between the estimated values and the theoretical prior values. The default value of zero corresponds to no smoothing. Higher values give more weight to the prior distribution.
`...`	Specification of further arguments passed to function `plot`.
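
To make the role of `laplaceSmoothPrior` more concrete, the following base R sketch shows one common form of additive smoothing towards a Bernoulli prior with probability 1/2. The exact formula used inside `calPlot` is not stated here, so the pseudo-count form below is an assumption, and `smoothRate` is a hypothetical helper that is not part of discSurv.

```r
# Hypothetical helper (assumption, not the discSurv implementation):
# shrinks a weighted event rate towards 1/2, where `prior` acts as the
# number of pseudo-observations contributed by the prior.
smoothRate <- function(events, weights = rep(1, length(events)), prior = 0) {
  (sum(weights * events) + prior * 0.5) / (sum(weights) + prior)
}

events <- c(0, 0, 0, 1)

rawRate <- smoothRate(events, prior = 0)      # no smoothing: 1 / 4
midRate <- smoothRate(events, prior = 4)      # (1 + 2) / (4 + 4)
bigRate <- smoothRate(events, prior = 1e6)    # dominated by the prior
```

Under this assumed form, a prior of zero returns the raw rate, and increasing the prior moves the smoothed rate from the observed value towards 1/2.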

Details

The calibration plot can be computed for training or test data. The number of groups used to compare the hazard rates is determined by a heuristic: Sturges' rule, modified to account for skewness and motivated by information coding theory.
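
A well-known skewness-adjusted variant of Sturges' rule is Doane's formula. The sketch below implements it in base R for illustration; whether `nGroups = "auto"` uses exactly this formula, and how it rounds the result, is an assumption. The function names `sturgesGroups` and `doaneGroups` are hypothetical.

```r
# Sturges' rule: number of groups grows with log2 of the sample size
sturgesGroups <- function(x) {
  ceiling(1 + log2(length(x)))
}

# Doane's formula (assumed variant): adds a term that grows with the
# absolute sample skewness g1 relative to its standard error sigmaG1
doaneGroups <- function(x) {
  n <- length(x)
  centered <- x - mean(x)
  g1 <- mean(centered^3) / mean(centered^2)^(3 / 2)
  sigmaG1 <- sqrt(6 * (n - 2) / ((n + 1) * (n + 3)))
  ceiling(1 + log2(n) + log2(1 + abs(g1) / sigmaG1))
}

set.seed(2)
skewed <- rexp(1000)            # strongly right-skewed sample
symmetric <- c(-2, -1, 0, 1, 2) # sample skewness of exactly zero

kSturges <- sturgesGroups(skewed)
kDoane <- doaneGroups(skewed)
```

For a symmetric sample the skewness term vanishes and both rules agree; for skewed data the adjusted rule suggests more groups.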

Note

The calibration plot assumes that the data have been preprocessed into long format. For subdistribution hazard models, the group means are weighted by the supplied weights. For cause-specific competing risks, each event type is plotted separately.
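
As a reminder of what long format means for a single event type, the following base R sketch expands a tiny short-format data set by hand. It is a simplified illustration only; in practice the discSurv helpers used in the examples (such as `dataLong`) perform this step. The data set `shortData` and the column names `obj` and `y` are illustrative assumptions.

```r
# Hypothetical short-format data: one row per subject
shortData <- data.frame(
  id = 1:3,
  time = c(2, 3, 1),   # observed discrete time
  event = c(1, 0, 1)   # 1 = event observed, 0 = censored
)

# Long format: one row per subject and time interval at risk.
# The response y is 0 in all intervals except possibly the last one,
# where it equals the event indicator.
longData <- do.call(rbind, lapply(seq_len(nrow(shortData)), function(i) {
  t <- shortData$time[i]
  data.frame(
    obj = shortData$id[i],
    timeInt = seq_len(t),
    y = c(rep(0, t - 1), shortData$event[i])
  )
}))

longData
```

Each row of the long data corresponds to one binary trial, which is why estimated hazards and observed events can be compared row by row.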

Author(s)

Thomas Welchowski and Moritz Berger

References

\insertRef{bergerAssessing}{discSurv}

\insertRef{calPlotHeuristic}{discSurv}

Examples


###################################
# Data preprocessing

# Example with unemployment data
library(discSurv)
library(Ecdat)
data(UnempDur)

# Select subsample
SubUnempDurTrain <- UnempDur[1:250, ]
SubUnempDurTest <- UnempDur[251:500, ]

# Transformation to long format
SubUnempDurTest_Long_TimeFactor <- dataLong(
 dataShort=SubUnempDurTest, timeColumn="spell", 
 eventColumn="censor1", timeAsFactor=TRUE)
SubUnempDurTest_LongSubDist <- dataLongSubDist(
 dataShort=SubUnempDurTest, timeColumn="spell", 
 eventColumns=c("censor1", "censor4"), eventFocus="censor1",
 timeAsFactor=TRUE)
# Use the same event columns as in the cause-specific model estimated below
SubUnempDurTest_LongCompRisks <- dataLongCompRisks(
 dataShort=SubUnempDurTest, timeColumn="spell", 
 eventColumns=c("censor1", "censor4"))

###############################################
# Calibration plot of basic regression model

# Estimate discrete survival continuation ratio model
estRegModel <- estReg(dataShort = SubUnempDurTrain, 
                     dataTransform = "dataLong", 
                     formulaVariable =~ timeInt + age + ui + logwage * ui, 
                     eventColumn = "censor1", timeColumn = "spell", timeAsFactor=TRUE)

preds <- predict(estRegModel, 
                newdata = SubUnempDurTest_Long_TimeFactor, type="response")
calPlot(estHazards=preds, 
       testDataLong=SubUnempDurTest_Long_TimeFactor)

###################################################
# Calibration plot of subdistribution hazards model

# Estimation of subdistribution hazard model
estRegModel <- estRegSubDist(dataShort = SubUnempDurTrain, 
                            formulaVariable =~ timeInt + age + ui + logwage * ui, 
                            eventColumns = c("censor1", "censor2", "censor3", "censor4"), 
                            eventFocus="censor1", timeColumn = "spell", timeAsFactor=TRUE)

# Visualization of calibration with test data
preds <- predict(estRegModel, 
               newdata = SubUnempDurTest_LongSubDist, type="response")
subDistWtest <- dataLongSubDist(dataShort=SubUnempDurTest, 
                               timeColumn="spell", eventColumns=c("censor1", "censor4"), 
                              eventFocus="censor1")$subDistWeights
calPlot(estHazards=preds, 
               testDataLong=SubUnempDurTest_LongSubDist, 
               weights=subDistWtest)

#################################################
# Calibration plot cause-specific competing risks

# Estimation
estRegModel <- estRegSmoothCompRisks(dataShort=SubUnempDurTrain, 
                                    dataTransform = "dataLongCompRisks",
                                    formulaVariable =~ s(timeInt) + age + ui + logwage * ui, 
                                    timeColumn="spell", eventColumns=c("censor1", "censor4"))

# Visualization of calibration with test data
preds <- predict(estRegModel, 
                newdata = SubUnempDurTest_LongCompRisks, type="response")
calPlot(estHazards=preds, 
       testDataLong=SubUnempDurTest_LongCompRisks)

discSurv documentation built on April 29, 2026, 9:07 a.m.