Accurate prognosis prediction plays a central role in survival analysis. The recent growth of large-scale biobank studies—characterized by extensive genetic, transcriptomic, imaging, and electronic health record data—has created new opportunities to enhance prediction accuracy. However, despite their potential, these datasets often suffer from limited effective sample sizes, high dimensionality, low signal-to-noise ratios, and practical constraints such as patient privacy.
Integrating external information is a promising strategy for improving model performance, yet classical data integration methods typically assume homogeneous data-generating mechanisms across sources. This assumption rarely holds in modern biomedical applications. Ignoring heterogeneity can introduce substantial bias, underscoring the need for transfer-learning approaches that explicitly account for discrepancies across populations.
While KL divergence has been used previously for integrating binary-outcome models, extending such ideas to censored time-to-event outcomes is nontrivial due to censoring and the limited nature of available external survival information (e.g., predicted risk scores or sets of coefficients without a baseline hazard).
The survkl package addresses these challenges by offering a transfer-learning framework for Cox models that integrates external information with newly collected survival data.
It is designed to be flexible, privacy-preserving, and effective across both low- and high-dimensional settings.
Key features include:
Transfer Learning with KL Integration:
Incorporates various types of external information via KL-based penalization to adaptively borrow strength.
Privacy-Preserving External Use:
Requires only summary-level external inputs—such as risk scores or coefficient estimates—without access to individual-level external data.
Robustness to Heterogeneity:
Explicitly accounts for population differences through adaptive weighting controlled by tuning parameters (eta), enabling selective borrowing from compatible sources.
High-Dimensional Modeling Support:
Implements regularized KL-integrated Cox models, including Ridge (coxkl_ridge) and Elastic Net (coxkl_enet, including LASSO as a special case).
Flexible External Information Formats:
Supports external coefficients, risk scores, and partially overlapping predictor sets.
Comprehensive Cross-Validation Framework:
Provides multiple tuning criteria, including V&VH loss, predicted partial deviance, and pooled and fold-averaged concordance indices.
This vignette introduces the main functionalities of survkl and provides examples for both low- and high-dimensional modeling workflows.
You can install from CRAN:
```r
install.packages("survkl")
```
Or install the development version of survkl from GitHub:
```r
require(remotes)
remotes::install_github("UM-KevinHe/survkl", ref = "main")
```
This section provides a brief overview of the main functions using example datasets included in the package.
First, load the package:
library(survkl)
The low-dimensional KL-integrated Cox model, implemented in coxkl, is designed for settings where the number of predictors is modest.
The method incorporates external information—either in the form of external Cox coefficients (beta) or pre-computed external risk scores (RS)—using a KL-based penalization.
The tuning parameter eta governs the degree of borrowing:
- eta = 0 reproduces the standard Cox model;
- larger eta values increasingly pull the fitted coefficients toward the external information.
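Conceptually, the KL-integrated estimator balances the Cox partial likelihood against a divergence from the external information. The base R sketch below is not the package's actual objective (a quadratic distance stands in for the KL term, and all data are simulated here), but it illustrates the role of eta: at eta = 0 the fit is the ordinary partial-likelihood estimate, while a large eta pulls the solution toward the external coefficients.

```r
# Conceptual sketch only (base R): pair a Cox partial likelihood with a
# divergence from external coefficients, weighted by eta. A quadratic
# distance stands in for the KL term used by the package.
neg_pll <- function(beta, z, time, delta) {
  lp <- as.vector(z %*% beta)
  # Breslow-style negative partial log-likelihood:
  # for each event i, lp_i - log(sum of exp(lp) over the risk set at time_i)
  -sum(vapply(which(delta == 1), function(i) {
    at_risk <- time >= time[i]
    lp[i] - log(sum(exp(lp[at_risk])))
  }, numeric(1)))
}

kl_objective <- function(beta, beta_ext, z, time, delta, eta) {
  neg_pll(beta, z, time, delta) + eta * sum((beta - beta_ext)^2)
}

set.seed(1)
n <- 50; p <- 3
z <- matrix(rnorm(n * p), n, p)
beta_true <- c(0.5, -0.5, 0)
time  <- rexp(n, rate = exp(z %*% beta_true))
delta <- rbinom(n, 1, 0.7)
beta_ext <- c(0.4, -0.4, 0)   # hypothetical external coefficients

# eta = 0: ordinary partial-likelihood fit
fit0 <- optim(rep(0, p), kl_objective, beta_ext = beta_ext,
              z = z, time = time, delta = delta, eta = 0)
# large eta: solution pulled toward beta_ext
fit1 <- optim(rep(0, p), kl_objective, beta_ext = beta_ext,
              z = z, time = time, delta = delta, eta = 100)
```

With the quadratic surrogate, the eta = 100 solution sits very close to beta_ext, mirroring the borrowing behavior described above.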
In this section we demonstrate how to fit a KL-integrated Cox model and how to select the optimal eta using cross-validation via cv.coxkl.
We illustrate the workflow using the built-in low-dimensional simulated dataset:
```r
data(ExampleData_lowdim)
train <- ExampleData_lowdim$train
test  <- ExampleData_lowdim$test
z     <- train$z
delta <- train$status
time  <- train$time
strat <- train$stratum
```
and externally derived coefficients beta_external:
beta_ext <- ExampleData_lowdim$beta_external_good
We generate a sequence of eta values through the internal utility generate_eta(), and fit the KL-integrated model across this grid:
```r
eta_grid <- generate_eta(method = "exponential", n = 100, max_eta = 30)
fit_lowdim <- coxkl(
  z = z, delta = delta, time = time, stratum = strat,
  beta = beta_ext, etas = eta_grid
)
```
The S3 method coef() provides a clean interface to extract estimated coefficients. If the requested eta is between fitted values, linear interpolation is performed:
coef(fit_lowdim, eta = 1)
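The interpolation behavior can be sketched in base R. The coefficient matrix below is made up for illustration (it is not output from survkl), and stats::approx() performs the linear interpolation between neighboring grid points:

```r
# Illustrative sketch of interpolating coefficients between fitted tuning
# values (the beta_path matrix here is invented for demonstration).
etas <- c(0, 1, 2, 4)
beta_path <- rbind(x1 = c( 0.50,  0.45,  0.42,  0.40),  # one row per coefficient
                   x2 = c(-0.30, -0.28, -0.27, -0.25))

interp_coef <- function(beta_path, etas, eta) {
  # linearly interpolate each coefficient's path at the requested eta
  apply(beta_path, 1, function(b) approx(etas, b, xout = eta)$y)
}

interp_coef(beta_path, etas, eta = 1.5)  # halfway between eta = 1 and eta = 2
```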
Users may instead supply an external risk score vector:
```r
RS_ext <- as.matrix(z) %*% as.matrix(beta_ext)
fit_lowdim_RS <- coxkl(
  z = z, delta = delta, time = time, stratum = strat,
  RS = RS_ext, etas = eta_grid
)
coef(fit_lowdim_RS)[1:5]
```
Objects of class coxkl can be visualized using the S3 plotting method plot().
This function displays how model performance changes across the eta sequence used during fitting.
Two types of performance criteria are supported:
- "loss" (default): −2 × partial log-likelihood, normalized by sample size
- "CIndex": stratified concordance index
If no test data are supplied, performance is computed using the training data
stored in object$data:
```r
plot(
  fit_lowdim,
  test_z = test$z, test_time = test$time,
  test_delta = test$status, test_stratum = test$stratum,
  criteria = "loss"
)
```
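As a rough illustration of the stratified concordance index (the package's exact pair and tie handling may differ), a minimal base R version counts, within each stratum, comparable pairs in which the earlier observed time is an event, and scores a pair concordant when that subject also has the higher risk score:

```r
# Minimal stratified C-index sketch (illustrative, not survkl's implementation).
cindex_strat <- function(lp, time, delta, stratum) {
  conc <- comp <- 0
  for (s in unique(stratum)) {
    idx <- which(stratum == s)
    for (i in idx) for (j in idx) {
      if (delta[i] == 1 && time[i] < time[j]) {   # comparable pair
        comp <- comp + 1
        if (lp[i] >  lp[j]) conc <- conc + 1      # concordant
        if (lp[i] == lp[j]) conc <- conc + 0.5    # tied score
      }
    }
  }
  conc / comp
}

# Toy check: a risk score perfectly anti-ranked with event time
time    <- c(1, 2, 3, 4)
delta   <- c(1, 1, 0, 1)
lp      <- c(4, 3, 2, 1)   # higher score, earlier event
stratum <- c(1, 1, 1, 1)
cindex_strat(lp, time, delta, stratum)   # perfect concordance: 1
```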
The function cv.coxkl performs K-fold (default 5) cross-validation to choose the integration parameter.
It supports four criteria:
- "V&VH" — V&VH loss
- "LinPred" — predicted partial deviance
- "CIndex_pooled" — pooled comparable pairs
- "CIndex_foldaverage" — per-fold stratified C-index

Below is an example using the default "V&VH" criterion:
```r
cv_lowdim <- cv.coxkl(
  z = z, delta = delta, time = time, stratum = strat,
  beta = beta_ext, etas = eta_grid,
  nfolds = 5, criteria = "V&VH", seed = 1
)
```
The cross-validated performance curve can be visualized using cv.plot():
cv.plot(cv_lowdim)
The solid purple curve displays the cross-validated loss across different values of eta.
The green dotted horizontal line marks the internal baseline at eta = 0, representing the model that does not incorporate external information.
The vertical dashed orange line indicates the optimal eta value, where the cross-validated loss is minimized.
A comparison between the purple curve and the green baseline shows whether borrowing external information improves prediction performance.
Whenever the purple curve falls below the green line, using external information (eta > 0) yields better predictive accuracy than relying solely on the internal model.
In high-dimensional settings (for example, when the number of predictors is comparable to or exceeds the sample size),
the survkl package extends KL-integrated Cox modeling with regularization. Two families of penalties are supported:
- coxkl_ridge, which shrinks all coefficients toward zero while retaining dense solutions;
- coxkl_enet, which includes the LASSO (pure L1) as a special case when the mixing parameter is set to 1.

Both models combine:
- an integration weight eta, to borrow information from external sources (risk scores RS or coefficients beta), and
- a penalty parameter lambda, which controls the amount of regularization.

In this section we first introduce the shared high-dimensional example dataset, and then illustrate the usage of
the ridge-penalized model coxkl_ridge. (The elastic net workflows are analogous and will be discussed in a separate subsection.)
We use the built-in high-dimensional simulated dataset:
```r
data(ExampleData_highdim)
train_hd <- ExampleData_highdim$train
test_hd  <- ExampleData_highdim$test
z_hd     <- train_hd$z
delta_hd <- train_hd$status
time_hd  <- train_hd$time
strat_hd <- train_hd$stratum
```
This dataset contains 50 predictors (Z1–Z50) with 6 signal variables and 44 AR(1) noise variables.
Externally derived coefficients are provided in beta_external:
beta_external_hd <- ExampleData_highdim$beta_external
These external coefficients are estimated from a separate dataset using only Z1–Z6 and then expanded to a length-50 vector,
with zeros for Z7–Z50.
Ridge (coxkl_ridge)

The function coxkl_ridge fits a KL-integrated Cox model with a ridge (L2) penalty on all predictors.
External information is incorporated through a KL term weighted by eta (a user-specified scalar), while the ridge penalty is controlled by a
sequence of tuning parameters lambda. If lambda is not provided, a decreasing lambda path is generated automatically.
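As an illustration of how such a path might be constructed (survkl's internal rule may differ), a glmnet-style log-spaced decreasing sequence runs from a maximal lambda down to a small fraction of it:

```r
# Illustrative lambda path: log-spaced, decreasing from lambda_max to
# lambda_max * lambda_min_ratio (not necessarily survkl's internal rule).
lambda_path <- function(lambda_max, nlambda = 100, lambda_min_ratio = 0.01) {
  exp(seq(log(lambda_max), log(lambda_max * lambda_min_ratio),
          length.out = nlambda))
}

path <- lambda_path(lambda_max = 5, nlambda = 10)
round(path, 3)   # 10 values, strictly decreasing from 5 to 0.05
```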
We first fit a KL–ridge model for a fixed integration weight eta and an automatically generated lambda path:
```r
model_ridge <- coxkl_ridge(
  z = z_hd, delta = delta_hd, time = time_hd, stratum = strat_hd,
  beta = beta_external_hd,  # external coefficients (length 50)
  eta = 1                   # KL integration weight
)
```
The fitted object stores, for each lambda value:
- model_ridge$lambda — the lambda sequence (in decreasing order),
- model_ridge$beta — estimated coefficients (one column per lambda),
- model_ridge$linear.predictors — linear predictors for all observations and all lambda values,
- model_ridge$likelihood — partial log-likelihood along the lambda path,
- model_ridge$data — the data used for fitting.

The S3 method coef() extracts the estimated coefficients:
```r
# All lambdas (columns ordered in decreasing lambda)
coef(model_ridge)[1:5, 1:5]  # first 5 lambdas
```
To focus on a specific value of lambda:
```r
lambda_target <- model_ridge$lambda[5]
coef(model_ridge, lambda = lambda_target)[1:5]
```
If the requested lambda is not exactly one of the fitted values, coef() performs linear interpolation along the lambda path.
Objects of class coxkl_ridge can be visualized using the S3 plotting method plot():
By default, this plots (at the given eta) the loss (−2 × partial log-likelihood) versus the penalty parameter lambda; the stratified C-index can be requested instead via criteria = "CIndex":

```r
plot(
  model_ridge,
  test_z = test_hd$z, test_time = test_hd$time,
  test_delta = test_hd$status, test_stratum = test_hd$stratum,
  criteria = "CIndex"
)
```
The function cv.coxkl_ridge performs K-fold cross-validation to tune the integration parameter eta
while internally scanning over a lambda path for each candidate eta. For each eta, it:
- fits the model along a sequence of lambda,
- evaluates the cross-validated criterion at each lambda,
- records the best lambda for that eta.

Supported criteria are:
- "V&VH" — V&VH loss (reported as Loss = -2 * score),
- "LinPred" — predicted partial deviance,
- "CIndex_pooled" — pooled C-index over all folds,
- "CIndex_foldaverage" — average C-index across folds.

Example: tuning eta using 5-fold cross-validation and the "V&VH" criterion:
```r
eta_grid_hd <- generate_eta(method = "exponential", n = 50, max_eta = 100)
cv_ridge_hd <- cv.coxkl_ridge(
  z = z_hd, delta = delta_hd, time = time_hd, stratum = strat_hd,
  beta = beta_external_hd, etas = eta_grid_hd,
  nfolds = 5, cv.criteria = "V&VH", seed = 1
)
```
The best lambda for each eta (according to the chosen criterion) is provided by:
cv_ridge_hd$integrated_stat.best_per_eta
As with low-dimensional models, the helper function cv.plot() can be used to visualize performance versus eta:
cv.plot(cv_ridge_hd)
The plot shows:
- the cross-validated performance across the eta sequence (loss or C-index),
- the internal baseline at eta = 0, and
- the optimal eta value.

Elastic Net (coxkl_enet)

The function coxkl_enet fits a KL-integrated Cox model with an elastic-net penalty, controlled by the mixing parameter alpha. When alpha = 1, the penalty reduces to LASSO, enabling coefficient sparsity in addition to KL-based integration of external information.
External knowledge may be incorporated either through external coefficients (beta) or an
externally computed risk score (RS). The integration weight eta determines how strongly the model
borrows from this external signal, while the penalty parameter lambda controls the sparsity level.
If lambda is not supplied, the function automatically generates a decreasing lambda sequence.
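To see why alpha = 1 yields sparsity, consider the soft-thresholding operator that underlies coordinate-descent elastic-net updates. This is a generic sketch, not survkl's solver; u denotes a hypothetical unpenalized univariate estimate:

```r
# Generic elastic-net coordinate update (illustrative, not survkl's solver).
# Soft-thresholding sets small values exactly to zero.
soft_threshold <- function(x, gamma) sign(x) * pmax(abs(x) - gamma, 0)

# L1 part thresholds; L2 part shrinks the remainder.
enet_update <- function(u, lambda, alpha) {
  soft_threshold(u, lambda * alpha) / (1 + lambda * (1 - alpha))
}

enet_update(u = 0.75, lambda = 0.5, alpha = 1)  # pure LASSO: 0.25
enet_update(u = 0.20, lambda = 0.5, alpha = 1)  # shrunk to exactly zero
```

With alpha = 1 the L2 term vanishes and coefficients below the threshold are set exactly to zero, producing the sparse fits described above.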
We illustrate the workflow using LASSO (alpha = 1) with an automatically generated lambda path:
```r
model_enet <- coxkl_enet(
  z = z_hd, delta = delta_hd, time = time_hd, stratum = strat_hd,
  beta = beta_external_hd,
  eta = 1,
  alpha = 1  # LASSO penalty
)
```
The fitted object stores, for each lambda value:
- model_enet$lambda — the lambda sequence (in decreasing order),
- model_enet$beta — estimated coefficients (one column per lambda),
- model_enet$likelihood — partial log-likelihood along the lambda path,
- model_enet$data — the data used for fitting.

The S3 method coef() extracts the estimated coefficients:
coef(model_enet)[1:5, 1:5]
To extract coefficients corresponding to a specific lambda:
```r
lambda_target <- model_enet$lambda[5]
coef(model_enet, lambda = lambda_target)[1:5]
```
Objects of class coxkl_enet can be visualized using the S3 method plot(); the default is to plot loss versus lambda:
```r
plot(
  model_enet,
  test_z = test_hd$z, test_time = test_hd$time,
  test_delta = test_hd$status, test_stratum = test_hd$stratum,
  criteria = "loss"
)
```
Similarly, the function cv.coxkl_enet extends the above fitting procedure by performing K-fold cross-validation over a supplied grid of eta values:
```r
eta_grid_hd <- generate_eta(method = "exponential", n = 50, max_eta = 100)
cv_enet_hd <- cv.coxkl_enet(
  z = z_hd, delta = delta_hd, time = time_hd, stratum = strat_hd,
  beta = beta_external_hd, etas = eta_grid_hd,
  alpha = 1,  # LASSO
  nfolds = 5, cv.criteria = "V&VH", seed = 1
)
```
CV results can be visualized using cv.plot():
cv.plot(cv_enet_hd)