Home

/

CRAN

/

coxstream

/

README.md
In coxstream: Memory-Efficient Cox Proportional Hazards via Streaming Newton-Raphson

coxstream

Memory-efficient Cox proportional hazards regression via streaming Newton-Raphson. Peak RAM is O(p^2) in the number of covariates and flat in the number of rows n, so models fit on datasets that do not fit in memory. Coefficients are identical to survival::coxph() with Efron tie correction.

coxstream (Python and R) holds peak RAM flat as the cohort grows, while in-memory solvers (lifelines, survival::coxph) scale with n; coefficients agree to machine precision.

Development version from GitHub:

# install.packages("remotes")
remotes::install_github("tommycarstensen/coxstream-r")

In-memory fit, with the same formula interface as coxph():

library(survival)
library(coxstream)

fit <- coxstream(Surv(time, status) ~ age + sex, data = lung)
coef(fit)
fit

Out-of-core fit, streaming a time-DESCENDING-sorted parquet file one row group at a time (requires the optional arrow package):

fit <- coxstream_arrow(
    "events_sorted.parquet",
    x_cols    = c("age", "sex"),
    time_col  = "duration",
    event_col = "event"
)
coef(fit)

The reader loads one row-group chunk at a time and frees it before the next, so peak RAM stays at O(batch_size * p), flat in n. Efron tie groups that span chunk boundaries are carried in running state, giving coefficients bit-identical to the in-memory fit.

Each Newton-Raphson iteration makes a single descending-time pass to accumulate the Cox partial-likelihood score and Hessian. Only running sums of size O(p) and O(p^2) are held, never the full risk set, so memory does not grow with n. The accumulation kernel is implemented in C++ via Rcpp.

Imports: Rcpp (compiled kernel), survival (Surv response; ships with R)
Suggests: arrow (only for coxstream_arrow()), testthat (tests)

MIT, except src/arrow_c_abi.h, which is vendored from Apache Arrow under Apache-2.0; see inst/COPYRIGHTS.

Any scripts or data that you put into this service are public.

coxstream documentation built on June 20, 2026, 5:07 p.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

coxstream
Memory-Efficient Cox Proportional Hazards via Streaming Newton-Raphson

README.md
In coxstream: Memory-Efficient Cox Proportional Hazards via Streaming Newton-Raphson

coxstream

Installation

Usage

How it works

Dependencies

License

Try the coxstream package in your browser

R Package Documentation

Browse R Packages

We want your feedback!

coxstream Memory-Efficient Cox Proportional Hazards via Streaming Newton-Raphson

README.md In coxstream: Memory-Efficient Cox Proportional Hazards via Streaming Newton-Raphson

coxstream

Installation

Usage

How it works

Dependencies

License

Try the coxstream package in your browser

R Package Documentation

Browse R Packages

We want your feedback!

coxstream
Memory-Efficient Cox Proportional Hazards via Streaming Newton-Raphson

README.md
In coxstream: Memory-Efficient Cox Proportional Hazards via Streaming Newton-Raphson