```r
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "figures/benchmarking-",
  fig.width = 7,
  fig.height = 5,
  dpi = 150,
  message = FALSE,
  warning = FALSE
)
LOCAL <- identical(Sys.getenv("LOCAL"), "TRUE")
```
The bigPLScox package now provides three distinct engines for extracting PLS
components in Cox regression: the original matrix-based solver (big_pls_cox),
the new Armadillo backend (big_pls_cox_fast), and the stochastic gradient
descent routine (big_pls_cox_gd). This vignette shows how to benchmark those
implementations against coxgpls() and the standard survival::coxph() fit.
We rely on the bench package to collect timing and
memory statistics. All chunks are wrapped in the LOCAL flag so that CRAN builds
can skip the heavier computations.
```r
needed <- c("bigmemory", "bench", "survival", "plsRcox", "bigPLScox")
has <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
if (!all(has)) {
  missing <- paste(needed[!has], collapse = ", ")
  knitr::opts_chunk$set(eval = FALSE)
  message("Note: skipping code execution because these packages are missing: ", missing)
}
```
```r
library(bigPLScox)
library(survival)
library(bench)
library(bigmemory)
library(plsRcox)
```
We reuse the helper dataCox() to generate survival outcomes with right
censoring. Adjust n and p to stress-test the solvers on larger problems.
```r
set.seed(2024)
sim_design <- dataCox(
  n = 2000,
  lambda = 2,
  rho = 1.5,
  x = matrix(rnorm(2000 * 50), ncol = 50),
  beta = c(1, 3, rep(0, 48)),
  cens.rate = 5
)
cox_data <- list(
  x = as.matrix(sim_design[, -(1:3)]),
  time = sim_design$time,
  status = sim_design$status
)
X_big <- bigmemory::as.big.matrix(cox_data$x)
```
The first benchmark compares the coxgpls() workflow and the dense variants of
the big-memory solvers against survival::coxph(). A second benchmark then calls
big_pls_cox_fast() and big_pls_cox_gd() directly on a big.matrix, illustrating
the benefit of the shared-memory interface. The number of iterations can be
increased for more stable measurements when running locally.
```r
bench_dense <- bench::mark(
  coxgpls = coxgpls(cox_data$x, cox_data$time, cox_data$status, ncomp = 5),
  big_pls = big_pls_cox(cox_data$x, cox_data$time, cox_data$status, ncomp = 5),
  big_pls_fast = big_pls_cox_fast(cox_data$x, cox_data$time, cox_data$status, ncomp = 5),
  big_pls_bb = big_pls_cox_gd(X_big, cox_data$time, cox_data$status, ncomp = 5),
  big_pls_gd = big_pls_cox_gd(X_big, cox_data$time, cox_data$status, ncomp = 5, method = "gd"),
  coxph = coxph(Surv(cox_data$time, cox_data$status) ~ cox_data$x, ties = "breslow"),
  iterations = 75,
  check = FALSE
)
bench_dense$expression <- names(bench_dense$expression)
bench_dense[, c("expression", "median", "itr/sec", "mem_alloc")]
```
A higher itr/sec value indicates faster throughput. The fast backend should
provide the best compromise between speed and accuracy in the dense setting.
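For reporting, the results can be reordered by median runtime with base R; a minimal sketch, assuming the bench_dense object produced above:

```r
# Rank the dense benchmark results from fastest to slowest median runtime
# and keep the columns shown in the table above.
ranked <- bench_dense[order(bench_dense$median), ]
ranked[, c("expression", "median", "itr/sec", "mem_alloc")]
```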
The next experiment reuses the big.matrix version of the predictors and adds
the SGD solver. Because coxgpls() requires an in-memory matrix, it is not
included in this round.
```r
bench_big <- bench::mark(
  big_pls_fast = big_pls_cox_fast(X_big, cox_data$time, cox_data$status, ncomp = 5),
  big_pls = big_pls_cox(cox_data$x, cox_data$time, cox_data$status, ncomp = 5),
  big_pls_gd = big_pls_cox_gd(X_big, cox_data$time, cox_data$status, ncomp = 5, max_iter = 300),
  iterations = 75,
  check = FALSE
)
bench_big$expression <- names(bench_big$expression)
bench_big[, c("expression", "median", "itr/sec", "mem_alloc")]
```
The classical solver is still run on the dense matrix for reference, but only the
fast and SGD routines operate on the big.matrix without copying. Expect the
fast backend to dominate runtime while SGD maintains scalability for
out-of-memory workflows.
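For genuinely out-of-memory data, the big.matrix can be file-backed so the full predictor matrix never has to reside in RAM. A sketch using bigmemory's backing-file arguments (the file names here are illustrative):

```r
# Create a file-backed copy of the predictors in a temporary directory;
# the solver then accesses the data through a memory-mapped file rather
# than an in-memory copy.
X_disk <- bigmemory::as.big.matrix(
  cox_data$x,
  backingfile    = "predictors.bin",
  descriptorfile = "predictors.desc",
  backingpath    = tempdir()
)
fit_gd <- big_pls_cox_gd(X_disk, cox_data$time, cox_data$status, ncomp = 5)
```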
bench includes autoplot() and plot() helpers for visual inspection of the
timing distribution.
```r
plot(bench_dense, type = "jitter")
```

```r
plot(bench_big, type = "jitter")
```
Additional geometries, such as ridge plots, are available via
autoplot(bench_dense, type = "ridge").
To integrate the benchmarks with automated pipelines, store the results under the
inst/benchmarks/ directory. Each run can record metadata such as the dataset
parameters, the number of iterations, or the git commit identifier.
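One way to record such metadata is to bind it to the timing columns before writing the results to disk; a sketch, where the column names and the git call are illustrative rather than part of the package:

```r
# Attach run metadata (problem size, date, git commit) to the benchmark table.
meta <- data.frame(
  n      = nrow(cox_data$x),
  p      = ncol(cox_data$x),
  date   = format(Sys.time(), "%Y-%m-%d %H:%M:%S"),
  commit = tryCatch(
    system("git rev-parse --short HEAD", intern = TRUE),
    warning = function(w) NA_character_,
    error   = function(e) NA_character_
  )
)
annotated <- cbind(bench_dense[, c("expression", "median", "mem_alloc")], meta)
```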
```r
if (!dir.exists("inst/benchmarks/results")) {
  dir.create("inst/benchmarks/results", recursive = TRUE)
}
write.csv(
  bench_dense[, 1:9],
  file = "inst/benchmarks/results/benchmark-dense.csv",
  row.names = FALSE
)
write.csv(
  bench_big[, 1:9],
  file = "inst/benchmarks/results/benchmark-big.csv",
  row.names = FALSE
)
```
The package also ships with standalone scripts under inst/benchmarks/ that
mirror this vignette while exposing additional configuration points. Run them
from the repository root as:
```sh
Rscript inst/benchmarks/cox-benchmark.R
Rscript inst/benchmarks/benchmark_bigPLScox.R
Rscript inst/benchmarks/cox_pls_benchmark.R
```
Each script accepts environment variables to adjust the problem size and stores
results in inst/benchmarks/results/ with time-stamped filenames.
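The variable names below are hypothetical, but they illustrate the pattern such scripts typically follow: read problem dimensions from the environment with defaults, then build a time-stamped output path:

```r
# Read problem dimensions from the environment, with sensible defaults
# when the variables are unset. BENCH_N and BENCH_P are illustrative names.
n <- as.integer(Sys.getenv("BENCH_N", unset = "2000"))
p <- as.integer(Sys.getenv("BENCH_P", unset = "50"))

# Time-stamped file name so repeated runs never overwrite each other.
stamp   <- format(Sys.time(), "%Y%m%d-%H%M%S")
outfile <- file.path(
  "inst/benchmarks/results",
  sprintf("benchmark-n%d-p%d-%s.csv", n, p, stamp)
)
```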