knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

Implemented model types

The risks package implements all major regression models that have been proposed for relative risks and risk differences. By default (approach = "auto"), riskratio() and riskdiff() estimate the most efficient valid model that converges; in more numerically challenging cases, they default to marginal standardization, which does not return parameters for covariates.

The following models are implemented in the risks package:

^1^ | approach =| RR | RD | Model | Reference

--|-------------|-----|-----|-----------------------------|---------------------------------- 1 | glm | riskratio | riskdiff | Binomial model with a log or identity link | Wacholder S. Binomial regression in GLIM: Estimating risk ratios and risk differences. Am J Epidemiol 1986;123:174-184. 2 | glm_startp | riskratio | riskdiff | Binomial model with a log or identity link, convergence-assisted by starting values from Poisson model | Spiegelman D, Hertzmark E. Easy SAS calculations for risk or prevalence ratios and differences. Am J Epidemiol 2005;162:199-200. 3 | margstd_delta | riskratio | riskdiff | Marginally standardized estimates using binomial models with a logit link (logistic model) with standard errors calculated via the delta method. | This package. 4 | margstd_boot | riskratio | riskdiff | Marginally standardized estimates using binomial models with a logit link (logistic model) with bias-corrected accelerated (BC~a~) confidence intervals from parametric bootstrapping (see Marginal standardization). | This package. \ For marginal standardization with nonparametric bootstrapping, see: Localio AR, Margolis DJ, Berlin JA. Relative risks and confidence intervals were easily computed indirectly from multivariable logistic regression. J Clin Epidemiol 2007;60(9):874-82. -- | glm_cem | riskratio | --- | Binomial model with log-link fitted via combinatorial expectation maximization instead of Fisher scoring | Donoghoe MW, Marschner IC. logbin: An R Package for Relative Risk Regression Using the Log-Binomial Model. J Stat Softw 2018;86(9). -- | glm_cem | --- | riskdiff | Additive binomial model (identity link) fitted via combinatorial expectation maximization instead of Fisher scoring | Donoghoe MW, Marschner IC. Stable computational methods for additive binomial models with application to adjusted risk differences. Comput Stat Data Anal 2014;80:184-96. --| robpoisson | riskratio | riskdiff | Log-linear (Poisson) model with robust/sandwich/empirical standard errors | Zou G. A modified Poisson regression approach to prospective studies with binary data. Am J Epidemiol 2004;159(7):702-6 --| duplicate | riskratio | -- | Case-duplication approach, fitting a logistic model with cluster-robust standard errors | Schouten EG, Dekker JM, Kok FJ, Le Cessie S, Van Houwelingen HC, Pool J, Vandenbroucke JP. Risk ratio and rate ratio estimation in case-cohort designs: hypertension and cardiovascular mortality. Stat Med 1993;12:1733–45. --| glm_startd | riskratio | --- | Binomial model with a log link, convergence-assisted by starting values from case-duplication logistic model | This package. --| logistic | riskratio, for comparison only | --- | Binomial model with logit link (i.e., the logistic model), returning odds ratios | Included for comparison purposes only. ^1^ Indicates the priority with which the legacy modelling strategy (approach = "legacy") attempts model fitting (glm_startp: only for RR).

Which model was fitted is always indicated in the first line of the output of summary() and in the model column of tidy(). In methods sections of manuscripts, the approach can be described in detail as follows: "Risk ratios (or risk differences) were obtained via (method listed in the first line of model summary.risks(...)) using the risks R package (reference to this package and/or the article listed in the column 'reference')."

For example: "Risk ratios were obtained from binomial models with a log link, convergence-assisted by Poisson models (ref. Spiegelman and Hertzmark, AJE 2005), using the risks R package (https://stopsack.github.io/risks/)."

Model choice

By default, automatic model fitting (approach = "auto") reports results from marginal standardization using a logistic model with delta method standard errors (equivalent to approach = "margstd_delta"). An exception is made if interaction terms between exposure and confounders are included. This case, confidence intervals are calculated using bootstrapping (equivalent to requesting approach = "margstd_boot"). Alternatively, any of the options listed under approach = in the table can be requested directly. However, unlike with approach = "auto" (the default) or approach = "legacy", the selected model may not converge.

We load the same example data as in the Get Started vignette.

library(risks)  # provides riskratio(), riskdiff(), postestimation functions
library(dplyr)  # For data handling
library(broom)  # For tidy() model summaries
data(breastcancer)

We then select a binomial model with starting values from the Poisson model:

riskratio(formula = death ~ stage + receptor, 
          data = breastcancer, 
          approach = "glm_startp")

\ However, the binomial model without starting values (approach = "glm") does not converge, as expected.

Model comparisons

With approach = "all", all model types listed in the tables are fitted. The fitted object, e.g., fit, is one of the converged models. A summary of the convergence status of all models is displayed at the beginning of summary(fit):

fit_all <- riskratio(formula = death ~ stage + receptor, 
                     data = breastcancer, 
                     approach = "all")
summary(fit_all)

\ Individual models can be accessed as fit$all_models[[1]] through fit$all_models[[6]] (or [[7]] if fitting a risk ratio model). tidy() shows coefficients and confidence intervals from all models that converged:

tidy(fit_all) %>%
  select(-statistic, -p.value) %>%
  print(n = 50)


stopsack/risks documentation built on April 5, 2025, 6:02 p.m.