knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
Most subsequent examples build on this simple rifttable:
library(rifttable) library(dplyr) # for data management, e.g., mutate() library(tibble) # for constructing a tibble, e.g. via tribble() data(breastcancer, package = "risks") design <- tribble( ~label, ~type, ~stratum, "**Overall**", "", "", " Deaths/N", "outcomes/total", c("Low", "High"), " Risk", "risk", c("Low", "High"), " Risk ratio (95% CI)", "rr", c("Low", "High"), " Risk difference (95% CI)", "rd", c("Low", "High"), "", "", "", "**Low hormone receptor**", "", "", " Deaths/N (Risk)", "outcomes/total (risk)", "Low", " Risk difference (95% CI)", "rd", "Low", "**High hormone receptor**", "", "", " Deaths/N (Risk)", "outcomes/total (risk)", "High", " Risk difference (95% CI)", "rd", "High" ) |> mutate( exposure = "stage", outcome = "death", effect_modifier = "receptor" ) rifttable( design = design, data = breastcancer ) |> rt_gt() # obtain formatted output
R's error messages can be frustrating. When using rifttable, these are the typical sources of errors:
type
) and the data. For example, type = "mean"
will not work on a categorical (factor) variable.To identify where an error is coming from, start simple. Comment out all but one line of the design
, putting #
at the beginning of the line. Start with a line that gives basic descriptive data, such as type = "total"
, type = "outcomes"
or type = "events/time"
, and re-run rifttable()
. Then add more lines with descriptive estimators, one by one. At the end, add lines that fit models, such as type = "hr"
.
design
?The design
that the rifttable()
function takes as input is simply a dataset that defines how the table should look like when rifttable()
has processed the data
.
The design
can be constructed in many different ways. All lead to the same table:
A dataset (tibble) defined using tribble()
r
design1 <- tribble(
~label, ~exposure, ~outcome, ~type,
"N", "stage", "death", "total",
"Deaths", "stage", "death", "outcomes"
)
design1
rifttable(
design = design1,
data = breastcancer
) |>
rt_gt()
A dataset (tibble) defined using tibble()
r
design2 <- tibble(
label = c("N", "Deaths"),
exposure = "stage",
outcome = "death",
type = c("total", "outcomes")
)
design2
rifttable(
design = design2,
data = breastcancer
) |>
rt_gt()
Concatenating tibbles, then editing with mutate()
r
design3 <- bind_rows(
tibble( # row 1
label = "N",
type = "total"
),
tibble( # row 2
label = "Deaths",
type = "outcomes"
)
) |>
mutate( # elements that are the same for all rows
exposure = "stage",
outcome = "death"
)
design3
rifttable(
design = design3,
data = breastcancer
) |>
rt_gt()
For descriptive tables: Use table1_design()
r
design4 <- breastcancer |>
table1_design(
death, # the total count will automatically be included
by = stage
)
design4
rifttable(
design = design4,
data = breastcancer
) |>
rt_gt()
See a separate overview about a descriptive Table 1.
External datasets
The design
could even be written out in an external dataset that can be
loaded with readr::read_csv()
(for CSV files) or readxl::read_excel()
(for Excel sheets).
rifttable tries to make as few assumptions as possible about how the user wants to treat missing data.
NA
) in the exposure are displayed as a separate exposure category for descriptive statistics (e.g., type = "total"
or type = "mean"
). They are omitted by comparative estimators (e.g., regression models). To change this behavior, call rifttable()
with the argument exposure_levels = "nona"
.--
or NA
), and regression models will use the non-missing observations. To exclude observations with missing outcome values altogether, add na_rm = TRUE
to (specific rows of) the design
.stratum
of the effect_modifier
. To include observations with the missing effect modifier, add NA
to the requested stratum in the design
, e.g., effect_modifier = "bmi", stratum = c("<25", NA)
.Use the overall
argument to show descriptive data for the entire data
set.
Inferential estimators showing comparisons between exposure categories will be blank there.
rifttable( design = design, data = breastcancer, overall = TRUE ) |> rt_gt() # obtain formatted output
Instead of testing a null hypothesis about a trend, rifttable proposes estimating the difference in the outcome for a one-unit higher exposure. This is also called a linear slope. Here, we estimate the risk associated for stage that is one category higher.
rifttable( design = design |> mutate(trend = "stage_numeric"), data = breastcancer |> mutate(stage_numeric = as.numeric(stage)) ) |> rt_gt() # obtain formatted output
Our simple toy dataset just has one exposure variable. For demonstration,
we just create a second variable, with two categories, "Level 1" and "Level 2,"
which is a simplified combination of the stage
and receptor
variables.
We will flip the table layout
from "rows"
(the default) to "cols"
and
concatenate two rifttables. We also need to give our new exposure2
variable
the same label as stage
to make sure results appear in
the same column.
breastcancer_2exposures <- breastcancer |> mutate( exposure2 = case_when( stage == "Stage I" | (stage == "Stage II" & receptor == "High") ~ "Level 1", stage == "Stage III" | (stage == "Stage II" & receptor == "Low") ~ "Level 2" ) ) attr(breastcancer_2exposures$exposure2, which = "label") <- "Exposure" attr(breastcancer_2exposures$stage, which = "label") <- "Exposure" bind_rows( design |> mutate(exposure = "exposure2") |> slice(2:5) |> rifttable( data = breastcancer_2exposures, layout = "cols" ), design |> slice(2:5) |> rifttable( data = breastcancer_2exposures, layout = "cols" ) ) |> rt_gt() # obtain formatted output
By default, difference measures are being rounded to 2 decimal digits (0.01
), such as type = "diff"
, the mean difference, or type = "quantreg"
, the median difference. The same goes for risk measures, such as type = "risk"
, unless shown as percentage points. Ratio measures are also shown with 2 decimal digits, such as type = "hr"
, the hazard ratio, or type = "fold"
, a ratio of arithmetic means.
Rounding can be changed by setting the rifttable()
arguments diff_digits
, risk_digits
, and ratio_digits
globally for the entire table.
design <- tribble( ~label, ~type, "Deaths/N", "outcomes/total", "Risk", "risk", "Risk ratio (95% CI)", "rr", "Odds ratio (95% CI)", "or", "Risk difference (95% CI)", "rd" ) |> mutate( exposure = "stage", outcome = "death" ) rifttable( design = design, data = breastcancer, ratio_digits = 3, # Many digits for ratios risk_digits = 1 ) |> # Fewer digits for risks rt_gt() # obtain formatted output
As can be seen, ratios > 3 are still shown with 1 fewer decimal, and ratios > 10 are shown with 2 fewer decimals (Wilcox, Epidemiology 2004 motivates why). To disable such additional rounding of extremely high ratios:
rifttable( design = design, data = breastcancer, ratio_digits = 3, ratio_digits_decrease = NULL, # Do not round high ratios more risk_digits = 1 ) |> rt_gt() # obtain formatted output
Additionally, rounding can be changed for each row, adding a column digits
to the rifttable design
:
tribble( ~label, ~type, ~digits, "Deaths/N", "outcomes/total", NA, # Uses rifttable default "Risk", "risk", NA, # Uses risk_digits below "Risk ratio (95% CI)", "", NA, " Rounded to 1 digit", "rr", 1, " Rounded to 2 digits", "rr", 2, "Risk difference (95% CI)", "rd", 3 ) |> # Overrides risk_digits mutate( exposure = "stage", outcome = "death" ) |> rifttable( data = breastcancer, risk_digits = 1 ) |> # Fewer digits for risks, unless specified by "digits" rt_gt() # obtain formatted output
By default, regression models will be fit separately for each stratum of the effect_modifier
.
Append "_joint"
to "hr"
, "rr"
, "rd"
, "irr"
, "irrrob"
, "diff"
, "fold"
, "foldlog"
, "quantreg"
, or "or"
to obtain "joint" models for exposure and effect modifier that have a single reference category.
Note that the joint model will be fit across all non-missing (NA
) strata of the effect modifier, even if the design
table does not request all strata be shown.
Compare stratified models to joint models for risk differences (for simplicity of presentation, count data are omitted):
tribble( ~label, ~type, ~stratum, "**Overall**", "rd", c("Low", "High"), "", "", "", "**Stratified models**", "", "", " Low hormone receptor", "rd", "Low", " High hormone receptor", "rd", "High", "", "", "", "**Joint models**", "", "", " Low hormone receptor", "rd_joint", "Low", " High hormone receptor", "rd_joint", "High" ) |> mutate( exposure = "stage", outcome = "death", effect_modifier = "receptor" ) |> rifttable(data = breastcancer) |> rt_gt()
The reference categories for exposure and effect modifier are always their first factor levels. Compare the preceding example: "High"
is, alphabetically, before "Low"
. To change the reference category, use forcats::fct_relevel()
or the base R alternative relevel()
on variables in the data
provided to rifttable()
:
tribble( ~label, ~type, ~stratum, "**Joint models**", "", "", " Low hormone receptor", "rd_joint", "Low", " High hormone receptor", "rd_joint", "High" ) |> mutate( exposure = "stage", outcome = "death", effect_modifier = "receptor" ) |> rifttable( data = breastcancer |> mutate( receptor = relevel( factor(receptor), # Make "receptor" a factor in the first place ref = "Low" ) ) ) |> # Set new reference category rt_gt()
If a middle category of the exposure stage
is desired as the reference:
result_reordered <- tibble( label = "**RD (95% CI)**", type = "rd", exposure = "stage", outcome = "death" ) |> rifttable( data = breastcancer |> mutate( stage = relevel( stage, ref = "Stage II" ) ) ) result_reordered |> rt_gt()
Using forcats::fct_relevel()
may be preferable over relevel()
, as it preserves the variable label: here, the variable stage
lost its label, "Stage"
starting with upper case S.
That results for "Stage II" are now listed first is probably undesirable. Reorder the columns of the table that rifttable()
produced to print results for "Stage I" first:
result_reordered |> select(stage, "Stage I", everything()) |> rt_gt()
Add a ci
column to the design
:
tribble( ~label, ~type, ~ci, "Deaths/N (Risk)", "outcomes/total (risk)", NA, "Risk ratio", "", NA, " 80% CI", "rr", 0.8, " 95% CI", "rr", NA, # Defaults to 0.95 " 99% CI", "rr", 0.99 ) |> mutate( exposure = "stage", outcome = "death" ) |> rifttable( data = breastcancer, risk_percent = TRUE ) |> rt_gt() # obtain formatted output
While the package provides a number of estimators commonly used in epidemiology, it will never be able to include all possible estimators. However, any custom estimate can be integrated into a rifttable by a defining custom estimation function.
The subsequent example will reproduce the following basic rifttable, which shows the mean age by sex, stratified by ECOG performance status, in the cancer
data set:
data(cancer, package = "survival") cancer <- cancer |> tibble::as_tibble() |> mutate( sex = factor( sex, levels = 1:2, labels = c("Male", "Female") ) ) design <- tibble::tibble( type = "mean", exposure = "sex", outcome = "age", effect_modifier = "ph.ecog", stratum = 1:2, label = paste0("ECOG PS ", stratum, ": mean age") ) design |> rifttable( data = cancer, overall = TRUE ) |> rt_gt()
Instead of relying on rifttable's built-in estimator type = "mean"
, we will define a custom function that calculates the mean:
estimate_my_mean <- function(data, ...) { data |> group_by(.exposure) |> summarize( res = paste( round( mean(.outcome), digits = 3 ), "yrs" ) ) }
Use the custom function my_mean
instead of the built-in mean
:
design |> # Edit the previous design mutate( type = "my_mean", # Replace built-in "mean" by custom "my_mean" label = paste0(label, " (custom)") ) |> rifttable( data = cancer, overall = TRUE ) |> rt_gt()
Specifications for custom functions:
estimate_
; this prefix is to be omitted when later calling the custom function within rifttable()
.data
provided to rifttable()
will be available to the custom function as an argument named data
.data
are already subsetted to the stratum
of the effect_modifier
, if applicable..exposure
, .outcome
, .event
, .time
, and .time2
, as applicable.design
(e.g., confounders
, digits
, na_rm
, etc.) and of the rifttable()
function (e.g., reference
, risk_percent
) as arguments. Many may not be relevant and can be captured in the argument list, ...
(see example)..exposure
, with one row per exposure category, and one string column res
for the estimate.Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.