knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

Installation

The risks package can be installed from CRAN:

install.packages("risks")

Development versions can be installed from GitHub:

remotes::install_github("stopsack/risks")

An example cohort study

We use a cohort of women with breast cancer, as used by Spiegelman and Hertzmark (Am J Epidemiol 2005) and Greenland (Am J Epidemiol 2004). The the categorical exposure is stage, the binary outcome is death, and the binary confounder is receptor.

library(risks)  # provides riskratio(), riskdiff(), postestimation functions
library(dplyr)  # For data handling
library(broom)  # For tidy() model summaries

data(breastcancer)  # Load sample data

breastcancer %>%  # Display the sample data
  group_by(receptor, stage) %>% 
  summarize(
    deaths = sum(death), 
    total = n(), 
    risk = deaths / total
  )

Adjusted risk ratios

The risk of death is higher among women with higher-stage and hormone receptor-low cancers, which also tend to be of higher stage. Using risks models to obtain (possibly multivariable-adjusted) risk ratios or risk differences is similar to the standard code for logistic models in R. As customary in R, log(RR) is returned; see below for how to obtain exponentiated values. No options for model family or link need to be supplied:

fit_rr <- riskratio(
  formula = death ~ stage + receptor, 
  data = breastcancer
)
summary(fit_rr)

Adjusted risk differences

To obtain risk differences, use riskdiff, which has the same syntax:

fit_rd <- riskdiff(
  formula = death ~ stage + receptor, 
  data = breastcancer
)
summary(fit_rd)

For example, the risk of death was 57 percentage points higher in women with stage III breast cancer compared to stage I (95% confidence interval, 38 to 77 percentage points), adjusting for hormone receptor status.

The model summary in riskratio() and riskdiff() includes to two additions compared to a regular glm() model:

Accessing model coefficients

The risks package provides an interface to broom::tidy(), which returns a data frame of all coefficients (risk differences in this example), their standard errors, z statistics, and confidence intervals.

tidy(fit_rd)

\ In accordance with glm() standards, coefficients for relative risks are shown on the logarithmic scale. Exponentiated coefficients for risk ratios are easily obtained via tidy(..., exponentiate = TRUE):

tidy(
  fit_rr, 
  exponentiate = TRUE
)

For example, the risk of death was 6 times higher in women with stage III breast cancer compared to stage I (95% confidence interval, 3 to 13 times), adjusting for hormone receptor status.

More post-estimation commands

Typical R functions that build on regression models can further process fitted risks models. In addition to tidy(), examples are:



stopsack/risks documentation built on April 5, 2025, 6:02 p.m.