weightsLtoT: Compute Subdistribution Weights

View source: R/DiscSurvAuxiliary.R

weightsLtoT    R Documentation

Compute Subdistribution Weights

Description

Computes new subdistribution weights for a test data set, based on the censoring survival function estimated from a learning data set.
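The idea of estimating a censoring survival function on learning data and then evaluating it at the test times can be illustrated with a minimal base R sketch. It uses a simple life-table estimator on toy data; this is an assumption for illustration, not necessarily the estimator that weightsLtoT uses internally. The data frames `train` and `test` and the columns `e1` and `e2` are hypothetical.

```r
# Toy learning data: discrete times and two 0-1 event indicators.
# A row whose indicators sum to zero is treated as censored.
train <- data.frame(
  time = c(1L, 2L, 2L, 3L, 3L, 4L),
  e1   = c(1, 0, 0, 1, 0, 0),
  e2   = c(0, 0, 1, 0, 0, 0)
)
censored <- rowSums(train[, c("e1", "e2")]) == 0

# Life-table estimate of the censoring hazard at each time point:
# number censored at t divided by number still under observation at t.
# (Simplification: no special handling of ties between events and censoring.)
tGrid <- seq_len(max(train$time))
censHazard <- sapply(tGrid, function(t) {
  atRisk <- sum(train$time >= t)
  if (atRisk == 0) 0 else sum(train$time == t & censored) / atRisk
})

# Censoring survival function G(t) = prod_{s <= t} (1 - hazard(s)).
G <- cumprod(1 - censHazard)

# Evaluate the learned G at the observed times of (hypothetical) test data.
test <- data.frame(time = c(1L, 3L))
GTest <- G[test$time]
```

The point of the sketch is only that `G` is learned from the learning data, while the lookup happens at test-data times. The actual weights additionally depend on the event type of each observation.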

Usage

weightsLtoT(
  dataShortTrain,
  dataShortTest,
  timeColumn,
  eventColumns,
  eventFocus,
  eventColumnsAsFactor = FALSE
)

Arguments

dataShortTrain

Learning data in short format (class "data.frame").

dataShortTest

Test data in short format (class "data.frame").

timeColumn

Character specifying the column name of the observed event times (class "character"). It is required that the observed times are discrete (class "integer").
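A quick way to check the time column before calling the function is shown below. This is a base R sketch with a hypothetical vector `times`; whether weightsLtoT itself enforces the integer class is not stated here, so the check is a precaution, not a documented requirement of the implementation.

```r
# Times read from a file are often stored as double, not integer.
times <- c(1, 2, 5)
is.integer(times)

# Convert only if every value is a whole number, so nothing is truncated.
wholeNumbers <- all(times == round(times))
timesInt <- if (wholeNumbers) as.integer(times) else stop("non-discrete times")
```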

eventColumns

Character vector specifying the column names of the event indicators, excluding censoring events (class "logical"). All events must use 0-1 coding. Rows in which all event columns sum to zero are treated as censored.
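The coding rule can be checked on a small data frame before passing it on. The sketch below uses base R only; the data frame `dat` is made up, with column names borrowed from the example data. The check that at most one event occurs per row is an added assumption based on the competing risks setting, not something stated above.

```r
# Hypothetical short-format data with two competing event indicators.
dat <- data.frame(
  spell   = c(3L, 5L, 2L),
  censor1 = c(1, 0, 0),
  censor4 = c(0, 1, 0)
)
eventColumns <- c("censor1", "censor4")
eventMatrix <- as.matrix(dat[, eventColumns])

# All indicators must be coded 0 or 1.
validCoding <- all(eventMatrix %in% c(0, 1))

# Assumption: competing events are mutually exclusive within a row.
atMostOneEvent <- all(rowSums(eventMatrix) <= 1)

# Rows with no event in any column count as censored.
isCensored <- rowSums(eventMatrix) == 0
```

Here the third row has zeros in both indicator columns, so it is the only one flagged in `isCensored`.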

eventFocus

Column name of the event of interest, which corresponds to the type 1 event (class "character").

eventColumnsAsFactor

Should the argument eventColumns be interpreted as the column name of a factor variable (class "logical")? Default is FALSE.

Value

Subdistribution weights for the test data in long format, computed from the censoring survival function estimated on the learning data (class "numeric"). The length of the vector equals the number of rows of the long-format test data.

Author(s)

Moritz Berger moritz.berger@zi-mannheim.de

References

\insertRef{bergerSubdist}{discSurv}

\insertRef{bergerAssessing}{discSurv}

\insertRef{zadehImpSubDist}{discSurv}

See Also

dataLongSubDist, calPlot

Examples

####################
# Data preprocessing

# Load discSurv (dataLongSubDist, weightsLtoT, calPlot) and the example data
library(discSurv)
library(Ecdat)
data(UnempDur)

# Select subsample
selectInd1 <- 1:100
selectInd2 <- 101:200
trainSet <- UnempDur[which(UnempDur$spell %in% (1:10))[selectInd1], ]
valSet <- UnempDur[which(UnempDur$spell %in% (1:10))[selectInd2], ]

# Convert to long format
trainSet_long <- dataLongSubDist(dataShort = trainSet, timeColumn = "spell",
                                 eventColumns = c("censor1", "censor4"),
                                 eventFocus = "censor1")
valSet_long <- dataLongSubDist(dataShort = valSet, timeColumn = "spell",
                               eventColumns = c("censor1", "censor4"),
                               eventFocus = "censor1")

# Compute new weights for the validation data set
valSet_long$subDistWeights <- weightsLtoT(trainSet, valSet, timeColumn = "spell",
                                          eventColumns = c("censor1", "censor4"),
                                          eventFocus = "censor1")

# Estimate continuation ratio model with logit link
glmFit <- glm(formula = y ~ timeInt + age + logwage, data = trainSet_long,
              family = binomial(), weights = trainSet_long$subDistWeights)

# Calculate predicted discrete hazards
predHazards <- predict(glmFit, newdata = valSet_long, type = "response")

# Calibration plot
calPlot(predHazards, testDataLong = valSet_long, weights = valSet_long$subDistWeights)


discSurv documentation built on April 29, 2026, 9:07 a.m.