survtable: Create Publication-Ready Survival Summary Tables
In summata: Publication-Ready Summary Tables and Forest Plots

survtable

R Documentation

Description

Generates comprehensive survival summary tables with survival probabilities at specified time points, median survival times, and optional group comparisons with statistical testing. Designed for creating survival summaries commonly used in clinical and epidemiological research publications.

Usage

survtable(
  data,
  outcome,
  by = NULL,
  times = NULL,
  probs = 0.5,
  stats = c("survival", "ci"),
  type = "survival",
  conf_level = 0.95,
  conf_type = "log",
  digits = 0,
  time_digits = 1,
  p_digits = 3,
  percent = TRUE,
  test = TRUE,
  test_type = "logrank",
  total = TRUE,
  total_label = "Total",
  time_unit = NULL,
  time_label = NULL,
  median_label = NULL,
  labels = NULL,
  by_label = NULL,
  na_rm = TRUE,
  number_format = NULL,
  ...
)

Arguments

`data`	Data frame or data.table containing the survival dataset. Automatically converted to a data.table for efficient processing.
`outcome`	Character string or character vector specifying one or more survival outcomes using `Surv()` syntax (e.g., `"Surv(os_months, os_status)"`). When multiple outcomes are provided, results are stacked into a single table with outcome labels as row headers.
`by`	Character string specifying the column name of the stratifying variable for group comparisons (e.g., treatment arm, risk group). When `NULL` (default), produces overall survival summaries only.
`times`	Numeric vector of time points at which to estimate survival probabilities. For example, `c(12, 24, 36)` for 1-, 2-, and 3-year survival when time is measured in months. Default is `NULL`.
`probs`	Numeric vector of survival probabilities for which to estimate corresponding survival times (quantiles). Values must be between 0 and 1. For example, `c(0.5)` returns median survival time, `c(0.25, 0.5, 0.75)` returns quartiles. Default is `0.5` (median only).
`stats`	Character vector specifying which statistics to display:
    `"survival"` - Survival probability at specified times
    `"ci"` - Confidence interval for survival probability
    `"n_risk"` - Number at risk at each time point
    `"n_event"` - Cumulative number of events by each time point
    Default is `c("survival", "ci")`.
`type`	Character string specifying the type of probability to report:
    `"survival"` - Survival probability S(t) [default]
    `"risk"` - Cumulative incidence/risk 1 - S(t)
    `"cumhaz"` - Cumulative hazard -log(S(t))
`conf_level`	Numeric confidence level for confidence intervals. Must be between 0 and 1. Default is 0.95 (95% confidence intervals).
`conf_type`	Character string specifying the confidence interval type for survival estimates:
    `"log"` - Log transformation (default, recommended)
    `"log-log"` - Log-log transformation
    `"plain"` - Linear/identity (can produce CIs outside [0, 1])
    `"logit"` - Logit transformation
    `"arcsin"` - Arcsin square root transformation
`digits`	Integer specifying the number of decimal places for survival probabilities (as percentages). Default is 0 (whole percentages).
`time_digits`	Integer specifying the number of decimal places for survival time estimates (median, quantiles). Default is 1.
`p_digits`	Integer specifying the number of decimal places for p-values. Values smaller than `10^(-p_digits)` are displayed as `"< 0.001"` (for `p_digits = 3`), `"< 0.0001"` (for `p_digits = 4`), etc. Default is 3.
`percent`	Logical. If `TRUE` (default), displays survival probabilities as percentages (e.g., `"85%"`). If `FALSE`, displays as proportions (e.g., `"0.85"`).
`test`	Logical. If `TRUE` (default), performs a survival curve comparison test and adds a p-value column. Requires `by` to be specified.
`test_type`	Character string specifying the statistical test for comparing survival curves:
    `"logrank"` - Log-rank test (default)
    `"wilcoxon"` - Wilcoxon (Breslow) test
    `"tarone"` - Tarone-Ware test
    `"petopeto"` - Peto-Peto test
`total`	Logical or character string controlling the total/overall column:
    `TRUE` or `"first"` - Include total column first [default]
    `"last"` - Include total column last (before p-value)
    `FALSE` - Exclude total column
`total_label`	Character string for the total/overall row label. Default is `"Total"`.
`time_unit`	Character string specifying the time unit for display in column headers and labels (e.g., `"months"`, `"days"`, `"years"`). When specified, time column headers become "{time} {time_unit}" (e.g., "12 months"). Default is `NULL` (no unit shown).
`time_label`	Character string template for time column headers when `times` is specified. Use `"{time}"` as placeholder for the time value and `"{unit}"` for the time unit. Default is `"{time} {unit}"` when `time_unit` is specified, otherwise just `"{time}"`.
`median_label`	Character string for the median survival row label. Default is `NULL`, which auto-constructs from `conf_level` (e.g., `"Median (95% CI)"` for `conf_level = 0.95`).
`labels`	Named character vector or list providing custom display labels. For stratified analyses, names should match levels of the `by` variable. For multiple outcomes, names should match the `Surv()` expressions. Default is `NULL`.
`by_label`	Character string providing a custom label for the stratifying variable (used in output attributes and headers). Default is `NULL` (uses variable name).
`na_rm`	Logical. If `TRUE` (default), observations with missing values in time, status, or the stratifying variable are excluded.
`number_format`	Character string or two-element character vector controlling thousand and decimal separators in formatted output. Named presets:
    `"us"` - Comma thousands, period decimal: `1,234.56` [default]
    `"eu"` - Period thousands, comma decimal: `1.234,56`
    `"space"` - Thin-space thousands, period decimal: `1 234.56` (SI/ISO 31-0)
    `"none"` - No thousands separator: `1234.56`
    Or provide a custom two-element vector `c(big.mark, decimal.mark)`, e.g., `c("'", ".")` for Swiss-style: `1'234.56`. When `NULL` (default), uses `getOption("summata.number_format", "us")`. Set the global option once per session to avoid passing this argument repeatedly: `options(summata.number_format = "eu")`
`...`	Additional arguments passed to `survfit`.
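
The three `type` options are transformations of the same survival estimate. A minimal base R sketch of the definitions listed under `type`; the helper name `convert_type` and the value 0.85 are illustrative assumptions, not part of summata:

```r
# Hypothetical helper: convert a survival probability S(t) into the
# quantity named by `type`, using the definitions from the argument list.
convert_type <- function(surv, type = c("survival", "risk", "cumhaz")) {
  type <- match.arg(type)
  switch(type,
    survival = surv,        # S(t)
    risk     = 1 - surv,    # cumulative incidence 1 - S(t)
    cumhaz   = -log(surv)   # cumulative hazard -log(S(t))
  )
}

s <- 0.85  # illustrative value only
convert_type(s, "survival")
convert_type(s, "risk")
convert_type(s, "cumhaz")
```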

Details

Survival Probability Estimation:

Survival probabilities are estimated using the Kaplan-Meier method via survfit. At each specified time point, the function reports the estimated probability of surviving beyond that time.
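
As a rough illustration of the underlying estimate, the sketch below calls `survfit` from the survival package directly on that package's `lung` data set. The data set and the time points (180 and 365 days) are assumptions chosen for the example; this is not necessarily how `survtable` assembles its output.

```r
library(survival)

# Kaplan-Meier fit for the whole `lung` cohort (status == 2 marks death).
fit <- survfit(Surv(time, status == 2) ~ 1, data = lung)

# Survival estimate, confidence limits, and number at risk at each time point.
est <- summary(fit, times = c(180, 365))
data.frame(
  time   = est$time,
  n_risk = est$n.risk,
  surv   = est$surv,
  lower  = est$lower,
  upper  = est$upper
)
```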

Confidence Intervals:

The default "log" transformation for confidence intervals is recommended: it keeps the lower limit from falling below 0 and has good statistical properties. The "log-log" transformation, which keeps both limits within [0, 1], is also commonly used and may perform better in the tails.
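
To see how the choice of transformation affects the limits, the sketch below fits the same Kaplan-Meier curve twice with `survfit`, changing only its `conf.type` argument. It assumes that `conf_type` corresponds to `conf.type` in `survfit`, and it uses the `lung` data from the survival package purely for illustration.

```r
library(survival)

# Same Kaplan-Meier estimate, two confidence interval transformations.
fit_log    <- survfit(Surv(time, status == 2) ~ 1, data = lung,
                      conf.type = "log")
fit_loglog <- survfit(Surv(time, status == 2) ~ 1, data = lung,
                      conf.type = "log-log")

s_log    <- summary(fit_log, times = 365)
s_loglog <- summary(fit_loglog, times = 365)

# The point estimate is identical; only the interval limits differ.
rbind(
  log       = c(surv = s_log$surv, lower = s_log$lower, upper = s_log$upper),
  `log-log` = c(surv = s_loglog$surv, lower = s_loglog$lower,
                upper = s_loglog$upper)
)
```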

Statistical Testing:

The log-rank test (default) tests the null hypothesis that survival curves are identical across groups. Alternative tests weight different parts of the survival curve:

Log-rank: Equal weights (best for proportional hazards)
Wilcoxon: Weights by number at risk (sensitive to early differences)
Tarone-Ware: Weights by square root of number at risk
Peto-Peto: Modified Wilcoxon weights
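
The weighting schemes can be sketched with a textbook two-group weighted log-rank statistic. The function `weighted_logrank` below is a hypothetical helper written for this illustration, not summata's implementation. It covers only the log-rank, Wilcoxon, and Tarone-Ware weights (Peto-Peto weights depend on a survival estimate and are omitted), and it uses the `lung` data from the survival package as an assumed example.

```r
library(survival)

# Two-group weighted log-rank statistic (textbook form, 1 degree of freedom).
# `event` is TRUE/1 for an observed event and FALSE/0 for censoring.
weighted_logrank <- function(time, event, group,
                             weight = c("logrank", "wilcoxon", "tarone")) {
  weight <- match.arg(weight)
  group <- factor(group)
  stopifnot(nlevels(group) == 2)
  in_g1 <- group == levels(group)[1]
  event <- as.logical(event)

  u <- 0  # weighted sum of observed minus expected events in group 1
  v <- 0  # variance of u under the null hypothesis
  for (tj in sort(unique(time[event]))) {
    n  <- sum(time >= tj)                  # at risk, both groups
    n1 <- sum(time >= tj & in_g1)          # at risk, group 1
    d  <- sum(time == tj & event)          # events at tj, both groups
    d1 <- sum(time == tj & event & in_g1)  # events at tj, group 1
    w  <- switch(weight, logrank = 1, wilcoxon = n, tarone = sqrt(n))
    u  <- u + w * (d1 - n1 * d / n)
    if (n > 1) {
      v <- v + w^2 * n1 * (n - n1) * d * (n - d) / (n^2 * (n - 1))
    }
  }
  chisq <- u^2 / v
  c(chisq = chisq, p = pchisq(chisq, df = 1, lower.tail = FALSE))
}

with(lung, weighted_logrank(time, status == 2, sex, "logrank"))
with(lung, weighted_logrank(time, status == 2, sex, "wilcoxon"))
with(lung, weighted_logrank(time, status == 2, sex, "tarone"))

# Cross-check: the unweighted version should agree with survival::survdiff().
survdiff(Surv(time, status == 2) ~ sex, data = lung)$chisq
```

Because the Wilcoxon and Tarone-Ware weights shrink as the risk set shrinks, early differences between the curves contribute more to those statistics than late ones.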

Formatting:

All numeric output respects the number_format parameter. Separators within confidence intervals adapt automatically to avoid ambiguity:

Survival probabilities: "85% (80%-89%)" (US) or "85% (80%–89%)" (EU, en-dash separator)
Median survival: "24.5 (21.2-28.9)" (US) or "24,5 (21,2-28,9)" (EU)
Counts ≥ 1000: "1,234" (US) or "1.234" (EU)
p-values: "< 0.001" (US) or "< 0,001" (EU)
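
The separator presets and the p-value threshold rule can be approximated in base R with `formatC`. The helpers `fmt_number` and `fmt_p` and the `separators` list below are hypothetical names for this sketch, not summata functions. The `"space"` preset and custom separator vectors are left out for brevity.

```r
# Hypothetical lookup of thousands and decimal separators per preset.
separators <- list(
  us   = c(big = ",", dec = "."),
  eu   = c(big = ".", dec = ","),
  none = c(big = "",  dec = ".")
)

# Format a number with a fixed number of decimals and preset separators.
fmt_number <- function(x, digits = 0, preset = "us") {
  sep <- separators[[preset]]
  formatC(x, format = "f", digits = digits,
          big.mark = sep[["big"]], decimal.mark = sep[["dec"]])
}

# Format a single p-value; values below 10^(-p_digits) are shown as
# "< threshold".
fmt_p <- function(p, p_digits = 3, preset = "us") {
  threshold <- 10^(-p_digits)
  if (p < threshold) {
    paste("<", fmt_number(threshold, p_digits, preset))
  } else {
    fmt_number(p, p_digits, preset)
  }
}

fmt_number(1234, 0, "us")
fmt_number(1234, 0, "eu")
fmt_number(24.5, 1, "eu")
fmt_p(0.00004, 3, "eu")
```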

Value

A data.table with S3 class "survtable" containing formatted survival statistics. The table structure depends on parameters:

When times is specified (survival at time points):

Variable/Group: Row identifier – stratifying variable levels
Time columns: Survival statistics at each requested time point
p-value: Test p-value (if test = TRUE and by specified)

When only probs is specified (survival quantiles):

Variable/Group: Row identifier – stratifying variable levels
Quantile columns: Time to reach each survival probability
p-value: Test p-value (if test = TRUE and by specified)
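
For orientation, median survival times of this kind can be obtained from a `survfit` object with `quantile()`. The sketch below uses the `lung` data from the survival package as an assumed example and shows only the underlying computation, not the formatted table that `survtable` returns.

```r
library(survival)

fit <- survfit(Surv(time, status == 2) ~ sex, data = lung)

# Median survival time per stratum with confidence limits.
# quantile() on a survfit object indexes `probs` on the cumulative event
# scale, so probs = 0.5 is the time at which estimated survival drops to 0.5.
q <- quantile(fit, probs = 0.5)
data.frame(
  stratum = names(fit$strata),
  median  = as.vector(q$quantile),
  lower   = as.vector(q$lower),
  upper   = as.vector(q$upper)
)
```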

All numeric output (probabilities, times, counts, p-values) respects the number_format setting for locale-appropriate formatting.

The returned object includes the following attributes:

raw_data: data.table with unformatted numeric values
survfit_objects: List of survfit objects for each stratum
by_variable: The stratifying variable name
times: The time points requested
probs: The probability quantiles requested
test_result: Full test result object (if test performed)

Examples

# Load example data
data(clintrial)

# Example 1: Survival at specific time points by treatment
survtable(
    data = clintrial,
    outcome = "Surv(os_months, os_status)",
    by = "treatment",
    times = c(12, 24, 36),
    time_unit = "months"
)



# Example 2: Median survival only
survtable(
    data = clintrial,
    outcome = "Surv(os_months, os_status)",
    by = "treatment",
    times = NULL,
    probs = 0.5
)

# Example 3: Multiple quantiles (quartiles)
survtable(
    data = clintrial,
    outcome = "Surv(os_months, os_status)",
    by = "stage",
    times = NULL,
    probs = c(0.25, 0.5, 0.75)
)

# Example 4: Both time points and median
survtable(
    data = clintrial,
    outcome = "Surv(os_months, os_status)",
    by = "treatment",
    times = c(12, 24),
    probs = 0.5,
    time_unit = "months"
)

# Example 5: Cumulative incidence (1 - survival)
survtable(
    data = clintrial,
    outcome = "Surv(os_months, os_status)",
    by = "treatment",
    times = c(12, 24),
    type = "risk"
)

# Example 6: Include number at risk
survtable(
    data = clintrial,
    outcome = "Surv(os_months, os_status)",
    by = "treatment",
    times = c(12, 24),
    stats = c("survival", "ci", "n_risk")
)

# Example 7: Overall survival without stratification
survtable(
    data = clintrial,
    outcome = "Surv(os_months, os_status)",
    times = c(12, 24, 36, 48)
)

# Example 8: Without total row
survtable(
    data = clintrial,
    outcome = "Surv(os_months, os_status)",
    by = "treatment",
    times = c(12, 24),
    total = FALSE
)

# Example 9: Custom labels
survtable(
    data = clintrial,
    outcome = "Surv(os_months, os_status)",
    by = "treatment",
    times = c(12, 24),
    labels = c("Drug A" = "Treatment A", "Drug B" = "Treatment B"),
    time_unit = "months"
)

# Example 10: Different confidence interval type
survtable(
    data = clintrial,
    outcome = "Surv(os_months, os_status)",
    by = "treatment",
    times = c(12, 24),
    conf_type = "log-log"
)

# Example 11: Wilcoxon test instead of log-rank
survtable(
    data = clintrial,
    outcome = "Surv(os_months, os_status)",
    by = "treatment",
    times = c(12, 24),
    test_type = "wilcoxon"
)

# Example 12: Access raw data for custom analysis
result <- survtable(
    data = clintrial,
    outcome = "Surv(os_months, os_status)",
    by = "treatment",
    times = c(12, 24)
)
raw <- attr(result, "raw_data")
print(raw)

# Example 13: Access survfit objects for plotting
fits <- attr(result, "survfit_objects")
plot(fits$overall)  # Plot overall survival curve

# Example 14: Multiple survival outcomes stacked
survtable(
    data = clintrial,
    outcome = c("Surv(pfs_months, pfs_status)", "Surv(os_months, os_status)"),
    by = "treatment",
    times = c(12, 24),
    probs = 0.5,
    time_unit = "months",
    total = FALSE,
    labels = c(
        "Surv(pfs_months, pfs_status)" = "Progression-Free Survival",
        "Surv(os_months, os_status)" = "Overall Survival"
    )
)

# Example 15: European number formatting
survtable(
    data = clintrial,
    outcome = "Surv(os_months, os_status)",
    by = "treatment",
    times = c(12, 24),
    number_format = "eu"
)

summata documentation built on May 7, 2026, 5:07 p.m.