knitr::opts_chunk$set( collapse = TRUE, comment = "#>", message = FALSE, warning = FALSE, eval = TRUE )
Execution speed is rarely the primary consideration when selecting statistical software; correctness, interpretability, and ease of use usually take precedence. However, computational efficiency becomes relevant when working with large datasets, conducting simulation studies, or iterating through model specifications during exploratory analysis.
This article documents the computational performance of summata relative to established alternatives. The benchmarks presented here are intended as a reference for users whose workflows involve performance-sensitive operations, and as a record of the design tradeoffs inherent in different implementation approaches.
All benchmarks were conducted using the microbenchmark package under the following conditions:
Datasets were generated using a fixed random seed to ensure reproducibility. Timing measurements exclude package loading and data generation. All packages were tested using default parameters unless otherwise noted.
Two summata configurations are benchmarked throughout: the default configuration, which uses profile likelihood confidence intervals for GLM models and includes full formatting (QC statistics, sample sizes, reference rows); and a minimal configuration (summata_minimal), which uses Wald CIs and disables optional output features. This distinction is important because profile likelihood CIs dominate GLM execution time, and the minimal configuration provides a way to measure summata's formatting overhead in isolation. Note that finalfit and broom also use profile likelihood CIs by default for GLM models.
library(summata)
library(microbenchmark)
library(ggplot2)
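The setup described above can be sketched as follows. This is an illustrative harness, not the package's benchmark script: it does not call summata, and instead times the two CI methods directly on a base R `glm` fit. The seed value, the simulated data, and `times = 10L` are assumptions chosen for the example.

```r
library(microbenchmark)

# Simulated data with a fixed seed (seed and data shape are illustrative)
set.seed(42)
n <- 1000
dat <- data.frame(
  y  = rbinom(n, 1, 0.4),
  x1 = rnorm(n),
  x2 = factor(sample(c("a", "b", "c"), n, replace = TRUE))
)

# Fit once, outside the timed expressions, so only CI computation is measured
fit <- glm(y ~ x1 + x2, data = dat, family = binomial)

mb <- microbenchmark(
  wald    = confint.default(fit),            # Wald: estimate +/- z * SE
  profile = suppressMessages(confint(fit)),  # profile likelihood for glm
  times   = 10L
)

# Median time per expression, in milliseconds
res <- summary(mb, unit = "ms")[, c("expr", "median")]
print(res)
```

Keeping data generation and model fitting outside the `microbenchmark()` call mirrors the stated convention that timings exclude setup work.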
Descriptive summary tables represent a common first step in data analysis. The following packages provide comparable functionality with differing implementation strategies.
| Package | Function | Implementation Notes |
|:--------|:---------|:---------------------|
| summata | desctable() | data.table operations |
| arsenal | tableby() | Formula-based interface |
| tableone | CreateTableOne() | Matrix-based computation |
| finalfit | summary_factorlist() | tidyverse ecosystem |
| gtsummary | tbl_summary() | gt table framework |
knitr::include_graphics("figures/benchmark_desctable.png")
| Dataset Size | summata | arsenal | tableone | finalfit | gtsummary |
|:-------------|--------:|--------:|---------:|---------:|----------:|
| n = 1,000 | 42 ms | 64 ms | 46 ms | 429 ms | 2,901 ms |
| n = 5,000 | 57 ms | 77 ms | 79 ms | 442 ms | 2,929 ms |
| n = 10,000 | 73 ms | 98 ms | 126 ms | 464 ms | 3,001 ms |
The observed timing differences reflect underlying implementation choices. Packages built on data.table or base R matrix operations (summata, tableone, arsenal) exhibit lower overhead than those employing more extensive formatting pipelines (gtsummary). The gtsummary package prioritizes output flexibility and gt integration, which introduces additional computational cost.
Survival probability tables summarize Kaplan-Meier estimates at specified time points.
| Package | Function | Notes |
|:--------|:---------|:------|
| summata | survtable() | Formatted output |
| manual | survival::survfit() | Raw computation |
| gtsummary | tbl_survfit() | gt integration |
knitr::include_graphics("figures/benchmark_survtable.png")
| Dataset Size | summata | gtsummary | manual |
|:-------------|--------:|----------:|-------:|
| n = 1,000 | 21 ms | 266 ms | 6 ms |
| n = 5,000 | 35 ms | 271 ms | 11 ms |
| n = 10,000 | 52 ms | 274 ms | 14 ms |
Direct survfit() computation provides a baseline for the minimum time required. The difference between raw computation and formatted output reflects the cost of table construction and presentation logic.
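A minimal sketch of what the "manual" baseline might look like, assuming the `lung` dataset shipped with the survival package and two arbitrarily chosen time points; the actual benchmark data and time points are not shown in this article.

```r
library(survival)

# Kaplan-Meier fit by sex on the lung dataset shipped with survival
km <- survfit(Surv(time, status) ~ sex, data = lung)

# Survival estimates at 180 and 365 days (time points chosen for illustration)
s <- summary(km, times = c(180, 365))

# Raw computation ends here; everything below is "table construction"
km_table <- data.frame(
  stratum  = as.character(s$strata),
  time     = s$time,
  n_risk   = s$n.risk,
  survival = round(s$surv, 3),
  lower    = round(s$lower, 3),
  upper    = round(s$upper, 3)
)
print(km_table)
```

Formatted-table functions add labeling, rounding rules, and layout on top of this step, which is where the extra time in the table above is spent.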
The following benchmarks compare functions that extract and format regression coefficients. Each package produces tables suitable for publication, though with varying levels of default formatting. Compared functions are as follows:
| Package | Function | Notes |
|:--------|:---------|:------|
| summata | fit() | Profile likelihood CIs, QC stats, counts, and reference rows |
| summata_minimal | fit(..., conf_method = "wald", show_n = FALSE, show_events = FALSE, reference_rows = FALSE, keep_qc_stats = FALSE) | Wald CIs, reduced output |
| finalfit | glmuni() + fit2df() | Profile likelihood CIs (default) |
| broom | tidy() | Profile likelihood CIs via confint() dispatch |
| gtsummary | tbl_regression() | gt formatting |
knitr::include_graphics("figures/benchmark_logistic.png")
| Dataset Size | summata_minimal | summata | finalfit | broom | gtsummary |
|:-------------|----------------:|--------:|---------:|------:|----------:|
| n = 500 | 18 ms | 174 ms | 147 ms | 150 ms | 1,344 ms |
| n = 1,000 | 22 ms | 234 ms | 212 ms | 214 ms | 1,399 ms |
| n = 5,000 | 37 ms | 840 ms | 749 ms | 936 ms | 2,153 ms |
| n = 10,000 | 45 ms | 1,532 ms | 1,562 ms | 1,564 ms | 2,756 ms |
The default summata configuration uses profile likelihood confidence intervals for GLM models, as do finalfit and broom::tidy(). The three packages show comparable performance for logistic regression because likelihood profiling dominates execution time for all of them. The summata_minimal configuration uses Wald CIs instead, skipping the profiling step entirely, and achieves the fastest extraction times at all sample sizes. At large n, the repeated model refits that profiling requires become more expensive (each IRLS iteration processes every observation), so all profile-based packages converge toward similar timings.
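The difference between the two interval types can be illustrated in base R. The sketch below uses simulated data (an assumption for the example) and shows that a Wald interval is a closed-form calculation from the fitted coefficients, whereas `confint()` on a `glm` object has to profile the likelihood.

```r
set.seed(1)
n <- 500
x <- rnorm(n)
y <- rbinom(n, 1, plogis(-0.5 + 0.8 * x))
fit <- glm(y ~ x, family = binomial)

# Wald interval by hand: estimate +/- z * standard error
est <- coef(fit)
se  <- sqrt(diag(vcov(fit)))
z   <- qnorm(0.975)
wald_manual <- cbind(est - z * se, est + z * se)

# Same interval from stats::confint.default()
wald <- confint.default(fit)

# Profile likelihood interval: refits the model along a grid for each term
prof <- suppressMessages(confint(fit))

# Odds ratios with each interval type
print(exp(cbind(OR = est, wald)))
print(exp(cbind(OR = est, prof)))
```

The Wald calculation needs no refitting, which is why its cost is essentially independent of the profiling step discussed above.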
knitr::include_graphics("figures/benchmark_linear.png")
| Dataset Size | summata_minimal | summata | finalfit | broom | gtsummary |
|:-------------|----------------:|--------:|---------:|------:|----------:|
| n = 500 | 20 ms | 35 ms | 6 ms | 6 ms | 1,179 ms |
| n = 1,000 | 21 ms | 36 ms | 7 ms | 6 ms | 1,193 ms |
| n = 5,000 | 27 ms | 43 ms | 9 ms | 9 ms | 1,192 ms |
| n = 10,000 | 38 ms | 49 ms | 13 ms | 12 ms | 1,230 ms |
For linear models, broom::tidy() and finalfit achieve faster coefficient extraction due to lower formatting overhead. All three packages use exact t-distribution CIs for lm objects (via confint.lm()), so the timing difference reflects formatting features (reference rows, QC statistics) rather than CI computation.
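As a small check of why CI computation is cheap for linear models, the following sketch (simulated data, chosen for illustration) reproduces `confint()` for an `lm` object from the coefficient table and a t quantile, with no iterative refitting involved.

```r
set.seed(2)
n <- 200
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n)
fit <- lm(y ~ x)

# Exact t-based interval: estimate +/- t quantile * standard error
est <- coef(fit)
se  <- sqrt(diag(vcov(fit)))
tq  <- qt(0.975, df = df.residual(fit))
ci_manual <- cbind(est - tq * se, est + tq * se)

# Interval returned by confint() for lm objects
ci <- confint(fit)
print(ci)
```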
knitr::include_graphics("figures/benchmark_poisson.png")
| Dataset Size | summata_minimal | summata | finalfit | broom | gtsummary |
|:-------------|----------------:|--------:|---------:|------:|----------:|
| n = 500 | 20 ms | 155 ms | 135 ms | 144 ms | 1,293 ms |
| n = 1,000 | 24 ms | 201 ms | 181 ms | 184 ms | 1,325 ms |
| n = 5,000 | 34 ms | 595 ms | 613 ms | 612 ms | 1,868 ms |
| n = 10,000 | 47 ms | 1,351 ms | 1,409 ms | 1,402 ms | 2,577 ms |
Poisson regression shows the same profile likelihood pattern as logistic regression: the default summata, finalfit, and broom all use profile CIs and show comparable performance. The summata_minimal configuration with Wald CIs is consistently the fastest option.
knitr::include_graphics("figures/benchmark_cox.png")
| Dataset Size | summata_minimal | summata | finalfit | broom | gtsummary |
|:-------------|----------------:|--------:|---------:|------:|----------:|
| n = 500 | 17 ms | 34 ms | 7 ms | 12 ms | 1,149 ms |
| n = 1,000 | 21 ms | 38 ms | 9 ms | 14 ms | 1,161 ms |
| n = 5,000 | 40 ms | 61 ms | 25 ms | 30 ms | 1,227 ms |
Cox models use Wald CIs regardless of the conf_method setting (the standard approach in survival analysis), so the timing difference between summata and summata_minimal reflects formatting overhead only. finalfit and broom achieve faster extraction with less formatting.
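The Wald intervals reported for Cox models can be reproduced directly from the fitted coefficients. The sketch below assumes the `lung` dataset from the survival package and an arbitrary pair of covariates; it compares a hand calculation with the interval reported by `summary()` for a `coxph` fit.

```r
library(survival)

fit <- coxph(Surv(time, status) ~ age + sex, data = lung)

# Wald interval on the log-hazard scale, then exponentiated
est <- coef(fit)
se  <- sqrt(diag(vcov(fit)))
z   <- qnorm(0.975)
hr_manual <- cbind(
  HR    = exp(est),
  lower = exp(est - z * se),
  upper = exp(est + z * se)
)

# Interval reported by summary.coxph()
hr_summary <- summary(fit)$conf.int[, c("exp(coef)", "lower .95", "upper .95")]
print(round(hr_manual, 3))
```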
Mixed-effects models present a useful comparison case because the underlying model fitting (via lme4) dominates execution time regardless of the wrapper package.
| Package | Function | Notes |
|:--------|:---------|:------|
| summata | fit(..., model_type = "lmer") | Unified interface |
| summata_minimal | fit(..., model_type = "lmer", conf_method = "wald", show_n = FALSE, show_events = FALSE, reference_rows = FALSE, keep_qc_stats = FALSE) | Reduced output |
| finalfit | lmmixed() + fit2df() | Two-step process |
| broom.mixed | tidy() | Minimal extraction |
| gtsummary | tbl_regression() | gt formatting |
knitr::include_graphics("figures/benchmark_mixed.png")
| Dataset Size | summata_minimal | summata | finalfit | broom | gtsummary |
|:-------------|----------------:|--------:|---------:|------:|----------:|
| n = 500 | 35 ms | 58 ms | 26 ms | 31 ms | 1,141 ms |
| n = 1,000 | 37 ms | 60 ms | 28 ms | 34 ms | 1,133 ms |
| n = 5,000 | 52 ms | 76 ms | 43 ms | 47 ms | 1,185 ms |
The relatively narrow spread among summata, finalfit, and broom reflects the dominance of model fitting time. Differences in wrapper overhead become proportionally less significant as the underlying computation grows.
Univariable screening—fitting separate models for each predictor—provides a test case for operations involving many repeated model fits.
| Package | Function | Notes |
|:--------|:---------|:------|
| summata | uniscreen() | Parallel-capable |
| summata_minimal | uniscreen(..., conf_method = "wald", show_n = FALSE, show_events = FALSE, reference_rows = FALSE) | Wald CIs, reduced output |
| finalfit | glmuni() + fit2df() | Sequential |
| broom | Loop + tidy() | Manual implementation |
| arsenal | modelsum() | Formula interface |
| gtsummary | tbl_uvregression() | gt formatting |
knitr::include_graphics("figures/benchmark_uniscreen.png")
Screening 14 predictors:
| Dataset Size | summata_minimal | summata | finalfit | broom | arsenal | gtsummary |
|:-------------|----------------:|--------:|---------:|------:|--------:|----------:|
| n = 500 | 117 ms | 319 ms | 360 ms | 440 ms | 877 ms | 12,963 ms |
| n = 1,000 | 134 ms | 351 ms | 477 ms | 558 ms | 1,128 ms | 13,025 ms |
| n = 5,000 | 196 ms | 624 ms | 1,763 ms | 1,799 ms | 3,401 ms | 14,001 ms |
The performance gap between summata (default) and summata_minimal is amplified during univariable screening because likelihood profiling is repeated for each of the 14 predictor models. The profile-based configurations (summata default, finalfit, broom) perform similarly at n = 500, but default summata pulls ahead as n grows (624 ms vs. 1,763 ms for finalfit at n = 5,000). With Wald CIs, summata_minimal is the fastest option at all sample sizes: 2.6–3.2× faster than default summata and 3.1–9.0× faster than finalfit, reflecting the absence of profiling combined with data.table vectorization and parallel model fitting.
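For reference, a manual screening loop of the kind listed in the comparison table might look like the sketch below. It is a base R illustration with Wald CIs, not summata's implementation; the simulated data, the predictor names, and the helper `screen_one()` are hypothetical.

```r
set.seed(3)
n <- 500
dat <- data.frame(
  y   = rbinom(n, 1, 0.3),
  age = rnorm(n, 60, 10),
  bmi = rnorm(n, 27, 4),
  sbp = rnorm(n, 130, 15)
)
predictors <- c("age", "bmi", "sbp")

# Fit one logistic model per predictor and return a one-row summary
screen_one <- function(var) {
  fit <- glm(reformulate(var, response = "y"), data = dat, family = binomial)
  ci  <- confint.default(fit)[var, ]  # Wald CI, no profiling
  data.frame(
    predictor = var,
    OR        = exp(coef(fit)[[var]]),
    lower     = exp(ci[[1]]),
    upper     = exp(ci[[2]]),
    p         = summary(fit)$coefficients[var, "Pr(>|z|)"]
  )
}

# Collect the rows in a list and bind once at the end
screen <- do.call(rbind, lapply(predictors, screen_one))
print(screen)
```

Because each model is independent, the `lapply()` call is the natural place to parallelize, for example with `parallel::mclapply()` on Unix-like systems.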
The combined univariable screening and multivariable modeling workflow represents a common analytical pattern in statistical research.
| Package | Approach | Notes |
|:--------|:---------|:------|
| summata | fullfit() | Single function |
| summata_minimal | fullfit(..., conf_method = "wald", show_n = FALSE, show_events = FALSE, reference_rows = FALSE) | Wald CIs, reduced output |
| finalfit | finalfit() | Single function |
| manual | Loop + glm() + broom::tidy() + rbind() | Custom |
| gtsummary | tbl_uvregression() + tbl_regression() + tbl_merge() | Multi-step |
knitr::include_graphics("figures/benchmark_workflow.png")
| Dataset Size | summata_minimal | summata | finalfit | manual | gtsummary |
|:-------------|----------------:|--------:|---------:|-------:|----------:|
| n = 500 | 123 ms | 450 ms | 207 ms | 407 ms | 9,655 ms |
| n = 1,000 | 136 ms | 549 ms | 200 ms | 541 ms | 9,726 ms |
| n = 5,000 | 196 ms | 1,479 ms | 209 ms | 1,889 ms | 11,092 ms |
Both the default summata configuration and finalfit use profile likelihood CIs for GLM workflows, yet in these measurements finalfit is roughly 2–7× faster than default summata: its timing stays near 200 ms across sample sizes, while default summata grows from 450 ms to 1,479 ms. The feature sets also differ: summata adds QC statistics, reference rows, and complete-case sample sizes, whereas finalfit includes a descriptive statistics table. The summata_minimal configuration with Wald CIs is the fastest single-function option at every sample size tested, completing the combined analysis in roughly 60–70% of the time finalfit requires at n = 500–1,000 and about 94% at n = 5,000.
Forest plot generation combines data extraction with graphical rendering.
| Package | Function | Notes |
|:--------|:---------|:------|
| summata | coxforest() | Integrated table and plot |
| survminer | ggforest() | Survival-focused |
| manual | Custom ggplot2 | Maximum flexibility |
knitr::include_graphics("figures/benchmark_forest.png")
| Dataset Size | summata | survminer | manual |
|:-------------|--------:|----------:|-------:|
| n = 500 | 203 ms | 345 ms | 57 ms |
| n = 1,000 | 198 ms | 338 ms | 54 ms |
| n = 5,000 | 203 ms | 329 ms | 53 ms |
The manual approach produces only the graphical element, while summata and survminer generate integrated displays with coefficient tables. The relatively constant timing across dataset sizes indicates that plot rendering, rather than data processing, dominates execution time. The three graphical outputs also differ substantially in appearance, which will often matter more than timing when selecting a plotting function.
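A minimal sketch of a "manual" forest plot, assuming the `lung` dataset from survival and an arbitrary set of covariates. The benchmark's own manual implementation is not reproduced in this article, so the layout choices here are illustrative only.

```r
library(survival)
library(ggplot2)

fit <- coxph(Surv(time, status) ~ age + sex + ph.ecog, data = lung)

# Hazard ratios and Wald CIs from the model summary
ci <- summary(fit)$conf.int
forest_df <- data.frame(
  term  = rownames(ci),
  hr    = ci[, "exp(coef)"],
  lower = ci[, "lower .95"],
  upper = ci[, "upper .95"]
)

# Point estimate with horizontal interval, on a log-scaled axis
p <- ggplot(forest_df, aes(x = hr, y = term, xmin = lower, xmax = upper)) +
  geom_vline(xintercept = 1, linetype = "dashed") +
  geom_pointrange() +
  scale_x_log10() +
  labs(x = "Hazard ratio (95% CI, log scale)", y = NULL)

print(p)
```

This produces the graphical element only; an integrated display would additionally need a text table aligned with the plot rows.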
The following figures summarize timing ratios across benchmarks. Values greater than 1 indicate the comparison package requires more time than the baseline.
summata (default, profile likelihood CIs)
knitr::include_graphics("figures/benchmark_speedup.png")
summata_minimal (Wald CIs, no QC stats)
knitr::include_graphics("figures/benchmark_speedup_minimal.png")
Ratios relative to summata (default):
| Benchmark | gtsummary | finalfit | arsenal |
|:----------|----------:|---------:|--------:|
| Descriptive Tables | 41–70× | 6–10× | 1.4–1.5× |
| Survival Tables | 5–12× | — | — |
| Logistic Regression | 2–8× | 0.8–1.0× | — |
| Poisson Regression | 2–8× | 0.9–1.0× | — |
| Linear Regression | 25–34× | 0.2–0.3× | — |
| Cox Regression | 20–34× | 0.2–0.4× | — |
| Mixed-Effects | 16–20× | 0.5–0.6× | — |
| Univariable Screening | 22–41× | 1.1–2.8× | 2.8–5.5× |
| Complete Workflow | 8–21× | 0.1–0.5× | — |
Ratios relative to summata_minimal (Wald CIs):
| Benchmark | gtsummary | finalfit | summata (default) |
|:----------|----------:|---------:|--------:|
| Logistic Regression | 58–74× | 8–35× | 10–34× |
| Poisson Regression | 55–63× | 7–30× | 8–29× |
| Linear Regression | 32–59× | 0.3× | 1.3–1.7× |
| Cox Regression | 31–68× | 0.4–0.6× | 1.5–2.0× |
| Mixed-Effects | 23–33× | 0.8–0.9× | 1.5–1.7× |
| Univariable Screening | 71–111× | 3.1–9.0× | 2.6–3.2× |
| Complete Workflow | 57–78× | 1.1–1.7× | 3.7–7.5× |
For GLM models (logistic and Poisson), summata_minimal outperforms all alternatives by a wide margin: roughly 8–35× faster than finalfit and broom for logistic regression, and 7–30× faster for Poisson regression. This is because summata_minimal is the only configuration that uses Wald CIs; finalfit, broom, and the default summata all use profile likelihood CIs, which accounts for their comparable timings.
For linear, Cox, and mixed-effects models, where all packages use the same CI method (exact t-distribution for lm, Wald for Cox and mixed-effects), the timing gap between summata and summata_minimal is narrow (1.3–2.0×) and reflects formatting overhead only.
The relationship between dataset size and execution time provides insight into algorithmic complexity. Near-linear scaling (execution time proportional to n) indicates efficient implementation, while superlinear scaling may suggest operations with O(n²) complexity, such as repeated rbind() calls or element-wise data frame construction.
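The repeated `rbind()` pattern mentioned above can be illustrated with a short base R sketch. The function names and the row count are hypothetical; the point is that both approaches return the same data frame, but the first copies all accumulated rows on every iteration.

```r
n_rows <- 2000

# Quadratic pattern: every rbind() copies all rows accumulated so far
grow_rbind <- function(n) {
  out <- data.frame(id = integer(0), value = numeric(0))
  for (i in seq_len(n)) {
    out <- rbind(out, data.frame(id = i, value = sqrt(i)))
  }
  out
}

# Collect pieces in a list and bind them in a single call at the end
bind_once <- function(n) {
  pieces <- lapply(seq_len(n), function(i) data.frame(id = i, value = sqrt(i)))
  do.call(rbind, pieces)
}

t_grow <- system.time(a <- grow_rbind(n_rows))[["elapsed"]]
t_once <- system.time(b <- bind_once(n_rows))[["elapsed"]]

cat("grow with rbind():", t_grow, "s; bind once:", t_once, "s\n")
```

Timings depend on the machine, so none are quoted here; increasing `n_rows` makes the gap between the two patterns easier to see.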
Observed scaling factors for summata (ratio of time at n = 10,000 to time at n = 1,000):
| Operation | Scaling Factor | Expected for O(n) |
|:----------|---------------:|------------------:|
| Descriptive tables | 1.7× | 10× |
| Logistic regression | 6.5× | 10× |
| Univariable screening | 1.8× | 10× |
The sublinear scaling reflects that fixed per-call overhead (such as object construction and table formatting) constitutes a significant fraction of total time at smaller dataset sizes. Logistic regression shows nearer-to-linear scaling because likelihood profiling refits the model repeatedly, and the cost of each IRLS iteration grows with n.
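The scaling factors for descriptive tables and logistic regression follow directly from the default-summata timings reported in the earlier tables. The sketch below recomputes them; the implied exponent assumes a simple power-law model, time proportional to n^k, which is an interpretive aid added here rather than part of the benchmark.

```r
# Default-summata timings (ms) copied from the benchmark tables
desc_ms     <- c(n1000 = 42,  n10000 = 73)
logistic_ms <- c(n1000 = 234, n10000 = 1532)

# Scaling factor: time at n = 10,000 divided by time at n = 1,000
scaling <- c(
  descriptive = desc_ms[["n10000"]] / desc_ms[["n1000"]],
  logistic    = logistic_ms[["n10000"]] / logistic_ms[["n1000"]]
)
print(round(scaling, 1))  # descriptive 1.7, logistic 6.5

# Implied exponent k for a 10x increase in n (k = 1 would be linear)
exponent <- log(scaling) / log(10)
print(round(exponent, 2))
```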
The performance characteristics documented here reflect specific implementation choices:
summata: Built on data.table for data manipulation, with coefficient extraction optimized for common model classes. Default configuration uses profile likelihood CIs for GLM/negbin models (matching finalfit and broom). The conf_method = "wald" option skips profiling entirely, producing a configuration faster than any alternative tested for GLM models; for linear, Cox, and mixed-effects models, finalfit and broom remain faster.
gtsummary: Prioritizes output flexibility through the gt table framework. The additional abstraction layers enable extensive customization but increase computational overhead.
finalfit: Balances functionality and performance with a tidyverse-compatible interface. Uses profile likelihood CIs by default for GLM models (confint_type = "profile"). The finalfit() function is particularly optimized for the combined workflow.
arsenal: Uses formula-based syntax familiar to SAS users. Performance varies by operation type.
broom: Provides minimal coefficient extraction with limited formatting. Uses profile likelihood CIs for GLM models via stats::confint() dispatch. Suitable as a building block for custom pipelines.
By default, summata regression functions compute profile likelihood confidence intervals (for GLM and negative binomial models), sample sizes, event counts, QC statistics, and reference rows for categorical variables. These features produce more complete and accurate output for publication but add computational overhead. For performance-sensitive applications, these options can be disabled.
The summata_minimal configuration shown in the benchmarks represents:
fit(data, outcome, predictors, conf_method = "wald", show_n = FALSE, show_events = FALSE, reference_rows = FALSE, keep_qc_stats = FALSE)
The conf_method parameter can also be set globally for an entire session:
options(summata.conf_method = "wald")
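When the global setting should apply only to part of a script, the option can be set and restored around a block of code. The helper `with_wald()` below is hypothetical and uses only base R; it assumes that summata reads `summata.conf_method` at call time, which this article implies but does not spell out.

```r
# Hypothetical helper: evaluate code with the option set, then restore it
with_wald <- function(code) {
  old <- options(summata.conf_method = "wald")
  on.exit(options(old), add = TRUE)  # restore the previous value on exit
  force(code)                        # evaluate the argument while the option is set
}

# Inside the helper the option is "wald"
inside <- with_wald(getOption("summata.conf_method"))

# Outside, the previous value (here: unset) is back in place
outside <- getOption("summata.conf_method")
```

This keeps a screening loop on Wald CIs without changing the default for the final, publication-ready model in the same session.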
The impact of each option varies by model type:
| Option | GLM/negbin models | Linear/Cox/mixed models |
|:-------|:-----------------|:-----------------------|
| conf_method = "wald" | Large effect (eliminates profile likelihood profiling) | Minimal effect (Wald already used for Cox/mixed; exact t is fast for lm) |
| keep_qc_stats = FALSE | Moderate effect (skips C-statistic, Hosmer-Lemeshow) | Small effect |
| show_n/show_events = FALSE | Small effect | Small effect |
| reference_rows = FALSE | Small effect | Small effect |
For logistic and Poisson models at n = 1,000, the minimal configuration is roughly 8–11× faster than the default (22 ms vs. 234 ms for logistic; 24 ms vs. 201 ms for Poisson), with the majority of the difference attributable to conf_method. For linear and Cox models, the difference is roughly 1.5–2×, reflecting formatting overhead only.
The choice between configurations depends on the use case:
- conf_method = "wald" reduces per-iteration overhead substantially for GLM models
- conf_method = "wald" is recommended when iterating through many model specifications

The timing differences documented here range from negligible (tens of milliseconds) to substantial (several seconds). The practical significance depends on context:
Package selection should primarily reflect functional requirements, syntax preferences, and ecosystem compatibility. Performance considerations become relevant only when computational constraints are binding.
The benchmark script is available in the package repository at inst/benchmarks/benchmarks.R. Execution produces:
benchmark_speedup.png, benchmark_speedup_minimal.png)

Results will vary across systems due to differences in hardware, R version, and package versions.
This benchmark was run under the following conditions:
R version 4.5.2 (2025-10-31)
Platform: x86_64-unknown-linux-gnu
Matrix products: default
BLAS/LAPACK: /usr/lib/libopenblasp-r0.3.30.so; LAPACK version 3.12.0
Void Linux x86_64
Linux 6.12.63_1
Intel(R) Core(TM) i5-4670K (4) @ 3.80 GHz
NVIDIA GeForce GTX 970 [Discrete]