calc_risk_diff: Calculate Risk Differences with Robust Model Fitting and...
In riskdiff: Risk Difference Estimation with Multiple Link Functions and Inverse Probability of Treatment Weighting

calc_risk_diff

R Documentation

Calculate Risk Differences with Robust Model Fitting and Boundary Detection

Description

Calculates risk differences (or prevalence differences for cross-sectional data) using generalized linear models with identity, log, or logit links. Version 0.2.1 includes enhanced boundary detection, robust confidence intervals, and improved data quality validation to prevent extreme confidence intervals in stratified analyses.

The function addresses common convergence issues with identity link binomial GLMs by implementing a fallback strategy across multiple link functions, similar to approaches described in Donoghoe & Marschner (2018) for relative risk regression.
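The fallback idea can be sketched in base R as follows. This is a minimal illustration, not the package's internal code: `fit_with_fallback` is a hypothetical helper, the link order is an assumption, and the data are simulated. With an identity link the exposure coefficient is the risk difference itself; with a log or logit link the risk difference would have to be derived from predicted probabilities instead.

```r
# Hypothetical helper (not part of riskdiff): try binomial GLMs with
# successive link functions and return the first fit that converges.
fit_with_fallback <- function(formula, data,
                              links = c("identity", "log", "logit")) {
  for (lnk in links) {
    fit <- tryCatch(
      suppressWarnings(
        glm(formula, family = binomial(link = lnk), data = data)
      ),
      error = function(e) NULL  # e.g. no valid starting values
    )
    if (!is.null(fit) && isTRUE(fit$converged)) {
      return(list(fit = fit, link = lnk))
    }
  }
  list(fit = NULL, link = NA_character_)
}

# Simulated data: baseline risk 0.20, exposed risk 0.30
set.seed(42)
n <- 500
exposed <- rbinom(n, 1, 0.5)
y <- rbinom(n, 1, 0.20 + 0.10 * exposed)
sim <- data.frame(y = y, exposed = exposed)

res <- fit_with_fallback(y ~ exposed, sim)
res$link

# Under the identity link, the coefficient equals the difference in
# observed proportions for this single binary exposure.
rd_glm <- coef(res$fit)[["exposed"]]
rd_raw <- mean(sim$y[sim$exposed == 1]) - mean(sim$y[sim$exposed == 0])
```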

Usage

calc_risk_diff(
  data,
  outcome,
  exposure,
  adjust_vars = NULL,
  strata = NULL,
  link = "auto",
  alpha = 0.05,
  boundary_method = "auto",
  verbose = FALSE
)

Arguments

`data`	A data frame containing all necessary variables
`outcome`	Character string naming the binary outcome variable (must be 0/1 or logical)
`exposure`	Character string naming the exposure variable of interest
`adjust_vars`	Character vector of variables to adjust for (default: NULL)
`strata`	Character vector of stratification variables (default: NULL)
`link`	Character string specifying link function: "auto", "identity", "log", or "logit" (default: "auto")
`alpha`	Significance level for confidence intervals (default: 0.05)
`boundary_method`	Method for handling boundary cases: "auto", "profile", "bootstrap", "wald" (default: "auto")
`verbose`	Logical indicating whether to print diagnostic messages (default: FALSE)
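
As a rough illustration of how `alpha` relates to interval width, the sketch below converts a significance level into a confidence level and a normal critical value, then forms a Wald-type interval. The estimate and standard error are hypothetical numbers chosen for illustration; the package may compute its intervals differently depending on `boundary_method`.

```r
alpha <- 0.05
conf_level <- 1 - alpha          # 0.95, i.e. a 95% confidence interval
z_crit <- qnorm(1 - alpha / 2)   # about 1.96

# Hypothetical estimate and standard error, for illustration only
rd_hat <- 0.05
se_hat <- 0.02
wald_ci <- rd_hat + c(-1, 1) * z_crit * se_hat
round(wald_ci, 4)
```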

Details

New in Version 0.2.1: Enhanced Stability and Quality Validation

This version adds comprehensive data quality validation to prevent the extreme confidence intervals that could occur in stratified analyses:

Enhanced Data Validation:

Pre-analysis checks for stratification feasibility
Detection of small sample sizes within strata
Identification of rare outcomes or unbalanced exposures
Warning for potential separation issues

Boundary Detection and Robust Inference:

When the maximum likelihood estimate (MLE) lies on the boundary of the parameter space, standard asymptotic theory may not apply. The function detects and handles the following boundary types:

upper_bound: Fitted probabilities approaching 1
lower_bound: Fitted probabilities approaching 0
separation: Complete or quasi-complete separation
both_bounds: Mixed boundary issues
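
One way to picture the probability-based categories above is a check on the fitted values of a model. The sketch below is an assumption-laden illustration: `classify_boundary` is a hypothetical function, the tolerance is arbitrary, and it does not attempt to detect separation, which needs a different diagnostic.

```r
# Hypothetical classifier (not the riskdiff implementation): label a vector
# of fitted probabilities by which boundary, if any, it approaches.
classify_boundary <- function(fitted_probs, tol = 1e-6) {
  near_zero <- any(fitted_probs < tol)
  near_one  <- any(fitted_probs > 1 - tol)
  if (near_zero && near_one) {
    "both_bounds"
  } else if (near_one) {
    "upper_bound"
  } else if (near_zero) {
    "lower_bound"
  } else {
    "none"
  }
}

classify_boundary(c(0.20, 0.55, 0.80))       # interior fit
classify_boundary(c(0.20, 0.55, 1 - 1e-9))   # a fitted value close to 1
```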

Robust Confidence Intervals:

For boundary cases, the function implements:

Profile likelihood intervals (preferred when feasible)
Bootstrap confidence intervals (robust for complex cases)
Modified Wald intervals with boundary adjustments
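
To make the bootstrap option concrete, here is a minimal percentile bootstrap for an unadjusted risk difference in base R. It is a sketch under stated assumptions: `boot_rd` is a hypothetical helper, the data are simulated, and the package's own bootstrap procedure may resample or summarize differently.

```r
# Hypothetical helper: percentile bootstrap CI for the difference in
# outcome proportions between exposed (x == 1) and unexposed (x == 0).
boot_rd <- function(y, x, n_boot = 2000, alpha = 0.05) {
  n <- length(y)
  rd_star <- replicate(n_boot, {
    idx <- sample.int(n, n, replace = TRUE)  # resample rows with replacement
    yb <- y[idx]
    xb <- x[idx]
    mean(yb[xb == 1]) - mean(yb[xb == 0])
  })
  # Percentile interval; na.rm guards against a resample with an empty group
  quantile(rd_star, c(alpha / 2, 1 - alpha / 2), na.rm = TRUE, names = FALSE)
}

set.seed(1)
x <- rbinom(400, 1, 0.5)
y <- rbinom(400, 1, 0.15 + 0.10 * x)

rd_hat <- mean(y[x == 1]) - mean(y[x == 0])
ci <- boot_rd(y, x)
```

Unlike a Wald interval, the percentile interval is built from quantiles of the resampled estimates, so it does not rely on a normal approximation and cannot extend beyond the range of those estimates.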

Risk Difference Interpretation

Risk differences represent absolute changes in probability. A risk difference of 0.05 means the exposed group has a 5 percentage point higher risk than the unexposed group. This is often more interpretable than relative measures (risk ratios, odds ratios) for public health decision-making.
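
The arithmetic behind this interpretation can be checked with a small 2x2 example. The counts below are hypothetical and chosen only so that the risk difference comes out at 0.05; the risk ratio is shown alongside to contrast the absolute and relative scales.

```r
# Hypothetical counts, for illustration only
events_exposed   <- 30;  n_exposed   <- 200
events_unexposed <- 20;  n_unexposed <- 200

risk_exposed   <- events_exposed / n_exposed       # 0.15
risk_unexposed <- events_unexposed / n_unexposed   # 0.10

rd <- risk_exposed - risk_unexposed   # absolute scale: 5 percentage points
rr <- risk_exposed / risk_unexposed   # relative scale: 1.5 times the risk

# Extra cases expected per 1000 exposed people if the difference is causal
extra_per_1000 <- rd * 1000
```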

Value

A tibble of class "riskdiff_result" containing the following columns:

exposure_var: Character. Name of exposure variable analyzed
rd: Numeric. Risk difference estimate (proportion scale, e.g. 0.05 = 5 percentage points)
ci_lower: Numeric. Lower bound of confidence interval
ci_upper: Numeric. Upper bound of confidence interval
p_value: Numeric. P-value for test of null hypothesis (risk difference = 0)
model_type: Character. Link function successfully used ("identity", "log", "logit", or error type)
n_obs: Integer. Number of observations used in analysis
on_boundary: Logical. TRUE if MLE is on parameter space boundary
boundary_type: Character. Type of boundary: "none", "upper_bound", "lower_bound", "separation", "both_bounds"
boundary_warning: Character. Warning message for boundary cases (if any)
ci_method: Character. Method used for confidence intervals ("wald", "profile", "bootstrap")
...: Additional columns for stratification variables if specified

The returned object has attributes including the original function call and the alpha level used.

References

Donoghoe MW, Marschner IC (2018). "logbin: An R Package for Relative Risk Regression Using the Log-Binomial Model." Journal of Statistical Software, 86(9), 1-22. doi:10.18637/jss.v086.i09

Marschner IC, Gillett AC (2012). "Relative Risk Regression: Reliable and Flexible Methods for Log-Binomial Models." Biostatistics, 13(1), 179-192.

Venzon DJ, Moolgavkar SH (1988). "A Method for Computing Profile-Likelihood-Based Confidence Intervals." Journal of the Royal Statistical Society, Series C (Applied Statistics), 37(1), 87-94.

Rothman KJ, Greenland S, Lash TL (2008). Modern Epidemiology, 3rd edition. Lippincott Williams & Wilkins.

Examples

# Load the package and its example data
library(riskdiff)
data(cachar_sample)

# Simple risk difference
rd_simple <- calc_risk_diff(
  data = cachar_sample,
  outcome = "abnormal_screen",
  exposure = "areca_nut"
)
print(rd_simple)

# Age-adjusted risk difference
rd_adjusted <- calc_risk_diff(
  data = cachar_sample,
  outcome = "abnormal_screen",
  exposure = "areca_nut",
  adjust_vars = "age"
)
print(rd_adjusted)

# Stratified analysis with enhanced error checking and boundary detection
rd_stratified <- calc_risk_diff(
  data = cachar_sample,
  outcome = "abnormal_screen",
  exposure = "areca_nut",
  strata = "residence",
  verbose = TRUE  # See diagnostic messages and boundary detection
)
print(rd_stratified)

# Check for boundary cases (na.rm guards against rows where the model failed)
if (any(rd_stratified$on_boundary, na.rm = TRUE)) {
  cat("Boundary cases detected!\n")
  boundary_rows <- which(rd_stratified$on_boundary)
  for (i in boundary_rows) {
    cat("Row", i, ":", rd_stratified$boundary_type[i], "\n")
  }
}

# Force profile likelihood CIs for enhanced robustness
rd_profile <- calc_risk_diff(
  data = cachar_sample,
  outcome = "abnormal_screen",
  exposure = "areca_nut",
  boundary_method = "profile"
)
print(rd_profile)

riskdiff documentation built on June 30, 2025, 9:07 a.m.
