ExampleData_highdim: Example high-dimensional survival data

ExampleData_highdim    R Documentation

Example high-dimensional survival data

Description

A simulated survival dataset in a high-dimensional linear setting with 50 covariates (6 signal covariates and 44 AR(1)-correlated noise covariates), a Weibull baseline hazard, and controlled censoring. It includes internal training and test sets and a coefficient vector estimated from external data.

Usage

data(ExampleData_highdim)

Format

A list containing the following elements:

train

A list with components:

z

Data frame of size n_\mathrm{train}\times 50 with covariates Z1, ..., Z50.

status

Vector of event indicators (1=event, 0=censored).

time

Numeric vector of observed times \min(T, C).

stratum

Vector of stratum labels (here all 1).

test

A list with the same structure as train, with size n_\mathrm{test}\times 50 for z.

beta_external

Numeric vector (length 50, named Z1, ..., Z50) of Cox coefficients estimated on an external dataset using only Z1, ..., Z6 and expanded to length 50 (zeros for Z7, ..., Z50).
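
As a rough illustration of how the elements above fit together, the sketch below builds a small stand-in object with the documented layout and computes a linear predictor on the test split by matching covariate columns to coefficient names. The object mock, the helper make_split, the random values, and the coefficient values are all hypothetical; with the package installed, ExampleData_highdim would take the place of mock.

```r
# Stand-in object with the documented layout (random values, 5 rows per split).
# Everything here is hypothetical and only mirrors the structure described above.
set.seed(42)
covariate_names <- paste0("Z", 1:50)

make_split <- function(n) {
  z <- as.data.frame(matrix(rnorm(n * 50), n, 50))
  names(z) <- covariate_names
  list(z = z, status = rbinom(n, 1, 0.3), time = rexp(n), stratum = rep(1, n))
}

# Hypothetical coefficient values: nonzero for Z1, ..., Z6, zero elsewhere
beta_external <- setNames(c(0.3, -0.3, 0.3, -0.3, 0.3, -0.3, rep(0, 44)),
                          covariate_names)

mock <- list(train = make_split(5), test = make_split(5),
             beta_external = beta_external)

# Linear predictor on the test split; columns are selected by coefficient name
# so the result does not depend on the column order of z
z_test <- as.matrix(mock$test$z[, names(mock$beta_external)])
lp_test <- as.vector(z_test %*% mock$beta_external)
```

Selecting columns by name rather than by position is a defensive choice here, since beta_external is documented as a named vector.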

Details

Data-generating mechanism:

  • Covariates: 50 variables with signals Z1, ..., Z6 and noise Z7, ..., Z50.

    • Z1, Z2 ~ bivariate normal with AR(1) correlation \rho=0.5.

    • Z3, Z4 ~ independent Bernoulli(0.5).

    • Z5 ~ N(2,1), Z6 ~ N(-2,1) (group indicator fixed at 1).

    • Z7, ..., Z50 ~ multivariate normal with AR(1) correlation \rho=0.5.

  • True coefficients: \beta = (0.3,-0.3,0.3,-0.3,0.3,-0.3,0,\ldots,0) (length 50).

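  • Event times: Weibull baseline hazard h_0(t)=\lambda\nu\, t^{\nu-1} with \lambda=1, \nu=2. Given linear predictor \eta = Z^\top \beta, the conditional survival function is S(t \mid Z)=\exp(-\lambda\, t^{\nu} e^{\eta}). Draw U\sim\mathrm{Unif}(0,1) and solve S(T \mid Z)=U to obtain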

    T = \left(\frac{-\log U}{\lambda\, e^{\eta}}\right)^{1/\nu}.

  • Censoring: C\sim \mathrm{Unif}(0,\text{ub}) with ub tuned iteratively to achieve the target censoring rate (internal: 0.70; external: 0.50). Observed time is \min(T,C), status is \mathbf{1}\{T \le C\}.

  • External coefficients: Fit a Cox model Surv(time, status) ~ Z1 + ... + Z6 on the external data (Breslow ties), then place the estimated coefficients into a length-50 vector (zeros elsewhere).
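
The mechanism above can be sketched in R roughly as follows. This is an illustration under stated assumptions, not the code used to build the dataset: the helper names, the sample size of 500, the seed, and the bisection search for ub are all assumptions (the text only says ub is tuned iteratively), and the Cox fit runs only if the survival package is available.

```r
# Illustrative sketch only: helper names, n, the seed, and the bisection tuner
# are assumptions, not the package's own code.

ar1_normal <- function(n, p, rho) {
  # Rows are N(0, Sigma) draws with Sigma[i, j] = rho^|i - j|
  sigma <- rho^abs(outer(seq_len(p), seq_len(p), "-"))
  matrix(rnorm(n * p), n, p) %*% chol(sigma)
}

simulate_covariates <- function(n, rho = 0.5) {
  z <- data.frame(
    ar1_normal(n, 2, rho),                  # Z1, Z2
    matrix(rbinom(2 * n, 1, 0.5), n, 2),    # Z3, Z4
    rnorm(n, mean = 2),                     # Z5
    rnorm(n, mean = -2),                    # Z6
    ar1_normal(n, 44, rho)                  # Z7, ..., Z50
  )
  names(z) <- paste0("Z", 1:50)
  z
}

simulate_event_times <- function(z, beta, lambda = 1, nu = 2) {
  # Inverse-transform draw from the Weibull proportional hazards model
  eta <- as.vector(as.matrix(z) %*% beta)
  (-log(runif(nrow(z))) / (lambda * exp(eta)))^(1 / nu)
}

tune_censoring <- function(t_event, target, n_iter = 60) {
  # Fix the uniform draws so the censoring rate is monotone in ub, then bisect
  v <- runif(length(t_event))
  rate <- function(ub) mean(t_event > ub * v)
  lo <- 0
  hi <- 1
  while (rate(hi) > target) hi <- 2 * hi
  for (i in seq_len(n_iter)) {
    mid <- (lo + hi) / 2
    if (rate(mid) > target) lo <- mid else hi <- mid
  }
  hi * v  # censoring times C ~ Unif(0, ub) with ub = hi
}

simulate_dataset <- function(n, beta, target_censoring) {
  z <- simulate_covariates(n)
  t_event <- simulate_event_times(z, beta)
  c_time <- tune_censoring(t_event, target_censoring)
  list(z = z,
       status = as.integer(t_event <= c_time),  # 1 = event, 0 = censored
       time = pmin(t_event, c_time),
       stratum = rep(1, n))
}

set.seed(1)
beta <- c(0.3, -0.3, 0.3, -0.3, 0.3, -0.3, rep(0, 44))
internal <- simulate_dataset(500, beta, target_censoring = 0.70)
external <- simulate_dataset(500, beta, target_censoring = 0.50)

# External coefficients: Cox fit on Z1, ..., Z6, padded with zeros to length 50
beta_external <- setNames(numeric(50), paste0("Z", 1:50))
if (requireNamespace("survival", quietly = TRUE)) {
  ext <- data.frame(time = external$time, status = external$status,
                    external$z[, 1:6])
  fit <- survival::coxph(
    survival::Surv(time, status) ~ Z1 + Z2 + Z3 + Z4 + Z5 + Z6,
    data = ext, ties = "breslow"
  )
  beta_external[names(coef(fit))] <- coef(fit)
}
```

Reusing one set of uniform draws inside tune_censoring makes the empirical censoring rate a monotone step function of ub, which is what lets a simple bisection converge. Redrawing C at every iteration would also work but would make the search noisy.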

Examples

data(ExampleData_highdim)

head(ExampleData_highdim$train$z)
table(ExampleData_highdim$train$status)
summary(ExampleData_highdim$train$time)

head(ExampleData_highdim$test$z)
table(ExampleData_highdim$test$status)
summary(ExampleData_highdim$test$time)


survkl documentation built on April 22, 2026, 1:08 a.m.