calibrationPlot: Calibration Plots
In discSurv: Discrete Time Survival Analysis

calibrationPlot

R Documentation

Calibration Plots

Description

Calibration plot based on predictions. Overall root mean squared error (RMSE) of predicted and observed discrete hazards is calculated.

Usage

calibrationPlot(
  testPreds,
  testDataLong,
  weights = NULL,
  K = 10,
  event = "e1",
  ...
)

Arguments

`testPreds`	Predictions on the validation data with model fitted on training data ("numeric vector").
`testDataLong`	Validation data set in long format ("class data.frame").
`weights`	optional vector of weights ("numeric vector"). The length of weights must be equal to the number of observations of the validation data set.
`K`	Number of subsets for plotting ("integer vector").
`event`	Column names of the event to be considered for plotting (only in case of cause-specific hazards) ("character vector").
`...`	Additional arguments passed to `plot`.

Value

Calibration plot

Author(s)

Moritz Berger moritz.berger@imbie.uni-bonn.de
https://www.imbie.uni-bonn.de/personen/dr-moritz-berger/

References

\insertRef

bergerTutorialdiscSurv

\insertRefheyardValCompRisksdiscSurv

\insertRefbergerAssessingdiscSurv

Examples

####################
# Data preprocessing

# Example unemployment data
library(Ecdat)
data(UnempDur)

# Select subsample
selectInd1 <- 1:100
selectInd2 <- 101:200
trainSet <- UnempDur[which(UnempDur$spell %in% (1:10))[selectInd1], ]
valSet <- UnempDur[which(UnempDur$spell %in% (1:10))[selectInd2], ]

####################
# One event

# Convert to long format
trainSet_long <- dataLong(dataShort = trainSet, timeColumn = "spell", eventColumn = "censor1")
valSet_long <- dataLong(dataShort = valSet, timeColumn = "spell", eventColumn = "censor1")

# Estimate continuation ratio model with logit link
glmFit <- glm(formula = y ~ timeInt + age + logwage, data = trainSet_long, family = binomial())

# Calculate predicted hazards
predHazards <- predict(glmFit, newdata = valSet_long, type = "response")

# Calibration plot
calibrationPlot(predHazards, testDataLong = valSet_long)

############################
# Two cause specific hazards 

# Convert to long format
trainSet_long <- dataLongCompRisks(dataShort = trainSet, timeColumn = "spell", 
eventColumns = c("censor1", "censor4"))
valSet_long <- dataLongCompRisks(dataShort = valSet, timeColumn = "spell", 
eventColumns = c("censor1", "censor4"))

# Estimate continuation ratio model with logit link
vglmFit <- VGAM::vglm(formula = cbind(e0, e1, e2) ~ timeInt + age + logwage, data = trainSet_long, 
family = VGAM::multinomial(refLevel = "e0"))

# Calculate predicted hazards
predHazards <- VGAM::predictvglm(vglmFit, newdata = valSet_long, type = "response")

# Calibration plots
calibrationPlot(predHazards, testDataLong = valSet_long)
calibrationPlot(predHazards, testDataLong = valSet_long, event = "e2")

###############################
# Subdistribution hazards model

# Convert to long format
trainSet_long <- dataLongSubDist(dataShort = trainSet, timeColumn = "spell", 
eventColumns = c("censor1", "censor4"), eventFocus = "censor1")
valSet_long <- dataLongSubDist(dataShort = valSet, timeColumn = "spell", 
eventColumns = c("censor1", "censor4"), eventFocus = "censor1")

# Estimate continuation ratio model with logit link
glmFit <- glm(formula = y ~ timeInt + age + logwage, data = trainSet_long, 
family = binomial(), weights = trainSet_long$subDistWeights)

# Calculate predicted  hazards 
predHazards <- predict(glmFit, newdata = valSet_long, type = "response")

# Calibration plot 
calibrationPlot(predHazards, testDataLong = valSet_long, weights = valSet_long$subDistWeights)

discSurv documentation built on March 18, 2022, 7:12 p.m.