auct.rhf: Time-varying AUC(t) and iAUC for Random Hazard Forests (TDC)

auct.rhf    R Documentation

Description

Compute the time-varying AUC(t) and its time-aggregated summaries for Random Hazard Forests (RHF) with time-dependent covariates (TDC).

Usage

auct.rhf(
  object,
  marker = c("cumhaz", "hazard", "chf", "haz"),
  method = c("cumulative", "incident"),
  tau = NULL,
  riskset = c("subject", "record"),
  min.controls = 25,
  nfrac.controls = 0.10,
  min.cases = 1,
  g.floor = 0.10,
  g.floor.q = NULL,
  power = 2,
  ydata = NULL,
  winsor.q = NULL,
  eps = 1e-12,
  bootstrap.rep = 0L,
  bootstrap.refit = FALSE,
  bootstrap.conf = 0.95,
  bootstrap.seed = NULL,
  verbose = TRUE
)

## S3 method for class 'auct.rhf'
print(x, digits = 4, max.rows = 8, ...)
## S3 method for class 'auct.rhf'
plot(x, bass = 10, xlab = "Time", ylab = NULL,
     main = NULL, ylim = NULL, pch = 16, alpha = .05, ...)

Arguments

object

An RHF object. For grow/restore objects (class includes "rhf" and "grow"), OOB prediction matrices are used. For predict objects (class includes "rhf" and "predict"), test data is used (supply ydata= if the object does not include id/yvar).

marker

Which time-varying score to evaluate: "cumhaz" (alias "chf") or "hazard" (alias "haz").

method

Target estimand. "incident" (incident/dynamic) uses cases who fail at time t and controls who are at risk at t; this targets instantaneous hazard discrimination. "cumulative" (cumulative/dynamic) uses cases who have failed by t and controls who are event-free at t; this matches the more commonly used time-dependent AUC (Heagerty and Zheng, 2005).

tau

Optional time horizon. Evaluation is restricted to times t <= tau.

riskset

Risk-set definition for method = "incident": "subject" treats a subject as at risk when entry < t <= stop; "record" uses counting-process rows. Ignored when method = "cumulative".

min.controls

Minimum number of controls required at each evaluated time.

nfrac.controls

Minimum fraction of the total sample to be available as controls at each evaluated time.

min.cases

Minimum number of cases required at each evaluated time.

g.floor

Lower bound applied to the censoring survival estimate G(t) (global Kaplan-Meier) to stabilize the inverse-probability-of-censoring weights.

g.floor.q

Optional quantile-based floor (computed over the evaluation grid) to further protect tails.

power

Exponent in the Uno-style time weight, effectively using weights proportional to 1 / G(t)^{power}. The classical choice is power = 2.

ydata

Counting-process data frame (id, start, stop, event, ...). Required when object is a predict object that does not carry id/yvar.

winsor.q

Optional winsorization quantile (e.g., 0.99) applied to the time weights to limit the influence of extreme weights.

eps

Small positive constant for numerical stability in weight calculations.

bootstrap.rep

Number of bootstrap replicates for standard errors. Default is 0 (no bootstrap).

bootstrap.refit

Logical. If FALSE (default), uses a subject-level pairs bootstrap with the fitted marker held fixed ("plug-in"). If TRUE, performs a full refit at each replicate, using the original RHF tuning parameters.

bootstrap.conf

Confidence level for pointwise normal bands on AUC(t) when bootstrap.rep > 0 (e.g., 0.95). Set to NA to suppress bands.

verbose

Logical; if TRUE, prints bootstrap progress messages. Messages can be silenced with suppressMessages().

bootstrap.seed

Optional integer seed used before bootstrap resampling for reproducibility.

x, digits, max.rows, bass, xlab, ylab, main, ylim, pch, alpha, ...

Standard arguments for the print and plot methods.

Details

What is estimated. At each evaluation time t, the function computes an AUC for the chosen marker (either cumulative hazard or hazard), using the specified case/control definition:

  • Incident/dynamic (method = "incident"): cases fail at t; controls are at risk at t.

  • Cumulative/dynamic (method = "cumulative"): cases have failed by t; controls are event-free at t.
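The two case/control definitions can be illustrated on a toy data set. The following is a minimal base-R sketch, not the package's internal code: the data, the score vector, and the helpers auc.pairs() and auc.at() are hypothetical, the AUC is unweighted (no IPCW), and controls for the incident definition are taken to be subjects still under observation after t, which is an assumption about tie handling.

```r
## Toy data: one row per subject (no time-dependent covariates here).
time  <- c(2, 3, 3, 5, 6, 8, 9, 10)                     # follow-up time
event <- c(1, 1, 0, 1, 0, 1, 0, 0)                      # 1 = failure, 0 = censored
score <- c(0.9, 0.8, 0.3, 0.45, 0.4, 0.5, 0.2, 0.1)     # hypothetical marker

## Wilcoxon-type AUC: P(case score > control score), ties count 1/2.
auc.pairs <- function(cases, ctrls) {
  if (length(cases) == 0 || length(ctrls) == 0) return(NA_real_)
  d <- outer(cases, ctrls, "-")
  mean((d > 0) + 0.5 * (d == 0))
}

## Unweighted AUC at time t under either definition (illustrative only).
auc.at <- function(t, method = c("cumulative", "incident")) {
  method <- match.arg(method)
  if (method == "cumulative") {
    is.case <- time <= t & event == 1   # failed by t
  } else {
    is.case <- time == t & event == 1   # fail exactly at t
  }
  is.ctrl <- time > t                   # assumed control set: still followed after t
  auc.pairs(score[is.case], score[is.ctrl])
}

auc.at(5, "cumulative")   # 11/12: three cases against four controls
auc.at(5, "incident")     # 0.75: one case against four controls
```

At t = 5 the cumulative definition pools all earlier failures as cases, whereas the incident definition uses only the subject failing at t = 5, so the two values generally differ.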

The per-time AUC(t) values are combined in two ways:

  • iAUC.uno: an inverse-probability-of-censoring weighted (IPCW) average of AUC(t) over time, using a global KM estimate G(t) and a weight proportional to 1 / G(t)^{power} (Uno-style time weighting). Stabilization parameters include g.floor, g.floor.q, and winsor.q.

  • iAUC.std: a simple time-standardized mean of AUC(t) over the evaluation grid (trapezoidal rule divided by span). This quantity is sensitive to the time window and the shape of AUC(t) across time.
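As a rough illustration of the two summaries, the sketch below aggregates a hypothetical AUC(t) curve in base R. The vectors tm, auc, and G and the helper uno.weights() are invented for illustration; the exact form of the weights used internally may differ, so this should be read as a sketch of the idea (floored G, weights proportional to 1 / G(t)^power, optional winsorization), not as the package's implementation.

```r
## Hypothetical per-time values; in practice these would correspond to the
## time, AUC, and G columns of the AUC.by.time element.
tm  <- c(1, 2, 4, 7, 10)
auc <- c(0.80, 0.78, 0.74, 0.70, 0.66)
G   <- c(0.98, 0.90, 0.60, 0.25, 0.05)   # censoring survival estimate

## iAUC.std-style summary: trapezoidal area divided by the time span.
trap     <- sum(diff(tm) * (head(auc, -1) + tail(auc, -1)) / 2)
iauc.std <- trap / (max(tm) - min(tm))

## Uno-style weights: proportional to 1 / G(t)^power, with G floored at
## g.floor and (optionally) the weights capped at a quantile.
uno.weights <- function(G, power = 2, g.floor = 0.10, winsor.q = NULL) {
  w <- 1 / pmax(G, g.floor)^power
  if (!is.null(winsor.q)) w <- pmin(w, quantile(w, winsor.q, names = FALSE))
  w / sum(w)   # normalize so the weights sum to one
}

w        <- uno.weights(G)
iauc.uno <- sum(w * auc)   # weighted average of AUC(t)

c(iAUC.std = iauc.std, iAUC.uno = iauc.uno)
```

Because G(t) decreases over time, the Uno-style weights emphasize late times; with a declining AUC(t) as in this toy curve, the weighted summary falls below the time-standardized mean. The floor keeps the last weight at 1 / 0.10^2 rather than 1 / 0.05^2.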

Markers. For method = "cumulative", using marker = "cumhaz" matches standard time-dependent AUCs. Using marker = "hazard" with "cumulative" often yields smaller AUC(t), as it targets instantaneous risk rather than cumulative risk.

Risk-set choice. When method = "incident" there are two ways to define controls at time t:

  • riskset = "subject": at risk when entry < t <= Tstop, where Tstop is the subject's final stop time (one interval per subject). Fast and appropriate when subject rows tile continuous follow-up (no gaps).

  • riskset = "record": at risk when a counting-process row satisfies start < t <= stop. Exact under gaps but may yield fewer controls at sparse times.

If rows tile the entire follow-up, the two definitions coincide (up to ties). With gaps, "subject" can over-include controls (lower variance but a conceptual mismatch), while "record" is exact but can be sparser. For method = "cumulative" the risk-set choice is ignored.
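The difference between the two risk-set definitions shows up only when follow-up has gaps. The base-R sketch below uses a hypothetical counting-process data frame cp in which subject 2 is unobserved on (3, 5]; the helpers at.risk.subject() and at.risk.record() are illustrative names, not package functions.

```r
## Toy counting-process data; subject 2 has a gap in follow-up on (3, 5].
cp <- data.frame(
  id    = c(1, 1, 2, 2, 3),
  start = c(0, 4, 0, 5, 0),
  stop  = c(4, 9, 3, 8, 6),
  event = c(0, 1, 0, 0, 1)
)

## "subject": one interval per subject, from first start to last stop.
entry <- tapply(cp$start, cp$id, min)
tstop <- tapply(cp$stop,  cp$id, max)
at.risk.subject <- function(t) {
  names(entry)[as.vector(entry < t & t <= tstop)]
}

## "record": a subject is at risk only if one of its rows covers t.
at.risk.record <- function(t) {
  unique(as.character(cp$id[cp$start < t & t <= cp$stop]))
}

at.risk.subject(4)   # "1" "2" "3": subject 2 counted despite the gap
at.risk.record(4)    # "1" "3":     subject 2 excluded inside the gap
```

At a time outside the gap, such as t = 2, both helpers return the same set, which mirrors the statement that the definitions coincide when rows tile the follow-up.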

Bootstrap. When bootstrap.rep > 0, two modes are available:

  • Plug-in (bootstrap.refit = FALSE): subject-level pairs bootstrap holding the fitted marker matrix fixed. Each replicate weights the within-time AUC(t) by the resampled subject multiplicities and recomputes Uno's time weights via a weighted KM.

  • Refit (bootstrap.refit = TRUE): for each replicate, resample subjects, refit the RHF using the original tuning parameters, and re-evaluate AUC(t). The returned per-time SEs are matched to the original evaluation grid. For predict objects, the refit is performed using the data carried in the object; predictions are then recomputed for those data. This is computationally heavier but captures variability from model fitting.

The plot method draws a base-R shaded band for AUC(t) when bootstrap SEs are present.
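Two ingredients of the bootstrap can be sketched in base R: drawing subject-level multiplicities, and turning per-time standard errors into a pointwise normal band. The ids, AUC values, and SEs below are invented for illustration, and whether the package truncates the band to [0, 1] is not stated here, so no truncation is applied.

```r
set.seed(123)
ids <- c("a", "b", "c", "d", "e")

## Subject-level resampling: draw ids with replacement. A subject's
## multiplicity is the number of times it was drawn, so all of its
## counting-process rows enter (or leave) a replicate together.
draw <- sample(ids, length(ids), replace = TRUE)
mult <- table(factor(draw, levels = ids))

## Pointwise normal band from hypothetical AUC(t) values and bootstrap SEs.
auc  <- c(0.80, 0.76, 0.71)
se   <- c(0.02, 0.03, 0.05)
conf <- 0.95
z    <- qnorm(1 - (1 - conf) / 2)          # two-sided normal quantile
band <- cbind(lower = auc - z * se, upper = auc + z * se)

mult
band
```

The multiplicities always sum to the number of subjects, and subjects that were not drawn receive multiplicity zero.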

Value

An object of class "auct.rhf" with elements:

  • AUC.by.time: data frame with columns time, AUC, n.cases, n.ctrl, G, W.

  • iAUC.uno: IPCW (Uno-style) time-averaged AUC.

  • iAUC.std: time-standardized mean of AUC(t).

  • marker, method, riskset, power, g.floor, g.floor.q, winsor.q, times: metadata.

  • diag.riskset: Quick diagnostic comparing control counts under riskset = "subject" vs "record" on a small subset of times (list with times, n.ctrl.subject, n.ctrl.record, n.diff, prop.times.different).

  • boot: present when bootstrap.rep > 0; a list containing AUC.se (per-time SE matched to the reporting grid), iAUC.uno.se, iAUC.std.se, conf.level, AUC.lower, AUC.upper, rep, and mode ("plug-in" or "refit").

Generic methods print and plot are provided for compact summaries and visualization.

Author(s)

Hemant Ishwaran and Udaya B. Kogalur

References

Heagerty, P. J., Lumley, T., and Pepe, M. S. (2000). Time-dependent ROC curves for censored survival data and a diagnostic marker. Biometrics, 56(2), 337–344.

Heagerty, P. J., and Zheng, Y. (2005). Survival model predictive accuracy and ROC curves. Biometrics, 61(1), 92–105.

Uno, H., Tian, L., Cai, T., Kohane, I. S., and Wei, L.-J. (2013). A unified inference procedure for a class of measures to assess improvement in risk prediction systems with survival data. Statistics in Medicine, 32(14), 2430–2442.

See Also

rhf, predict.rhf

Examples



## ------------------------------------------------------------
## Peak VO2 example
## ------------------------------------------------------------

data(peakVO2, package = "randomForestSRC")
d <- convert.counting(Surv(ttodead, died) ~ ., peakVO2)
f <- "Surv(id, start, stop, event) ~ ."

o <- rhf(f, d, ntree = 25, nodesize = 5)

## AUC(t) with cumulative/dynamic definition and cumhaz marker
a.chf <- auct.rhf(o, marker = "cumhaz", method = "cumulative")

## AUC(t) with incident/dynamic definition and hazard marker
a.haz <- auct.rhf(o, marker = "hazard", method = "incident")

print(a.chf)
print(a.haz)

oldpar <- par(mfrow = c(1, 2))
plot(a.chf, main = "AUC(t): cumulative + cumhaz")
plot(a.haz, main = "AUC(t): incident + hazard")
par(oldpar)

## ------------------------------------------------------------
##  TDC illustration with training/testing
## ------------------------------------------------------------

trn <- hazard.simulation(1)$dta
tst <- hazard.simulation(1)$dta
f <- "Surv(id, start, stop, event) ~ ."
o <- rhf(f, trn, ntree = 25)
p <- predict(o, tst)
a.trn <- auct.rhf(o)
a.tst <- auct.rhf(p)

oldpar <- par(mfrow = c(1, 2))
plot(a.trn, main = "AUC(t): chf marker, train", ylim = c(0.5,1))
plot(a.tst, main = "AUC(t): chf marker, test",  ylim = c(0.5,1))
par(oldpar)

## ------------------------------------------------------------
## Bootstrap SEs and shaded band
## ------------------------------------------------------------
d <- hazard.simulation(1)$dta
f <- "Surv(id, start, stop, event) ~ ."
o <- rhf(f, d)


oldpar <- par(mfrow = c(1, 2))


## plug-in bootstrap
a.bs1 <- auct.rhf(o, marker = "cumhaz", method = "cumulative",
                  bootstrap.rep = 20, bootstrap.seed = 123)
plot(a.bs1, main = "AUC(t) with bootstrap band (plug-in)")

## refit bootstrap (can be slow)
a.bs2 <- auct.rhf(o, marker = "cumhaz", method = "cumulative",
                  bootstrap.rep = 10, bootstrap.refit = TRUE, bootstrap.seed = 7)
plot(a.bs2, main = "AUC(t) with bootstrap band (refit)")


par(oldpar)


randomForestRHF documentation built on April 24, 2026, 1:07 a.m.