calculate_dsr: Calculate Directly Standardised Rates using calculate_dsr

View source: R/DSR.R

calculate_dsr    R Documentation

Calculate Directly Standardised Rates using calculate_dsr

Description

Calculates directly standardised rates with confidence limits using Byar's method (1) with the Dobson method adjustment (2), including an option to further adjust the confidence limits for non-independent events (3).
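
The calculation described above can be sketched in base R. This is an illustration under stated assumptions, not the package's internal code: the helper names byar_lower, byar_upper and dsr_sketch are hypothetical, the data are made up, and the adjustment for non-independent events is not covered.

```r
# Hypothetical helpers: Byar's approximation to Poisson limits for a count o.
byar_lower <- function(o, conf = 0.95) {
  z <- qnorm(1 - (1 - conf) / 2)
  o * (1 - 1 / (9 * o) - z / (3 * sqrt(o)))^3
}
byar_upper <- function(o, conf = 0.95) {
  z <- qnorm(1 - (1 - conf) / 2)
  o1 <- o + 1
  o1 * (1 - 1 / (9 * o1) + z / (3 * sqrt(o1)))^3
}

# Hypothetical sketch of a DSR with Dobson-adjusted limits for one group.
dsr_sketch <- function(x, n, stdpop, conf = 0.95, multiplier = 1e5) {
  total <- sum(x)                                  # total observed events
  rate  <- sum(stdpop * x / n) / sum(stdpop)       # weighted mean of age-specific rates
  v     <- sum(stdpop^2 * x / n^2) / sum(stdpop)^2 # variance of the weighted rate
  adj   <- sqrt(v / total)                         # Dobson scaling factor
  c(
    value = rate * multiplier,
    lower = (rate + adj * (byar_lower(total, conf) - total)) * multiplier,
    upper = (rate + adj * (byar_upper(total, conf) - total)) * multiplier
  )
}

# Made-up data for three age bands (illustration only).
x      <- c(12, 30, 58)
n      <- c(10000, 8000, 5000)
stdpop <- c(5000, 3000, 2000)

res <- dsr_sketch(x, n, stdpop)
print(res)
```

The weights only enter through their relative sizes, so any standard population on a consistent scale gives the same rate.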

Usage

calculate_dsr(
  data,
  x,
  n,
  stdpop = NULL,
  type = "full",
  confidence = 0.95,
  multiplier = 1e+05,
  independent_events = TRUE,
  eventfreq = NULL,
  ageband = NULL
)

Arguments

data

data frame containing the data to be standardised, pre-grouped if multiple DSRs required; unquoted string; no default

x

field name from data containing the observed number of events for each standardisation category (eg ageband) within each grouping set (eg area); unquoted string; no default

n

field name from data containing the populations for each standardisation category (eg ageband) within each grouping set (eg area); unquoted string; no default

stdpop

field name from data containing the standard populations for each age band; unquoted string; default NULL (as shown in Usage)

type

defines the data and metadata columns to be included in output; can be "value", "lower", "upper", "standard" (for all data) or "full" (for all data and metadata); quoted string; default = "full"

confidence

the required level of confidence, expressed as a number between 0.9 and 1 or a number between 90 and 100. Can also be a vector, for example c(0.95, 0.998), to output both 95 percent and 99.8 percent CIs; numeric; default 0.95

multiplier

the multiplier used to express the final values (eg 100,000 = rate per 100,000); numeric; default 100,000

independent_events

whether events are independent. Set to TRUE for independent events. When set to FALSE an adjustment is made to the confidence intervals. To do this, the dataset provided must include event frequency breakdowns, and column x is redefined as the number of unique individuals who experienced each frequency of event, rather than the total number of events; logical; default TRUE

eventfreq

field name from data containing the event frequencies. Only required when independent_events = FALSE; unquoted string; default NULL

ageband

field name from data containing the age bands for standardisation. Only required when independent_events = FALSE; unquoted string; default NULL
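
To illustrate how x is redefined when independent_events = FALSE, the base-R sketch below uses made-up counts for a single age band. The column names are placeholders. The variance comparison rests on an assumption that is not stated on this page: that the number of individuals at each event frequency is independently Poisson distributed. It is shown only to suggest why the adjusted intervals come out wider, not as the package's exact method.

```r
# Made-up counts for a single age band within one grouping set.
freq <- data.frame(
  ageband           = 50,
  eventfrequency    = c(1L, 2L, 3L),
  uniqueindividuals = c(70, 10, 3)   # plays the role of x
)

# Number of people who experienced at least one event.
total_individuals <- sum(freq$uniqueindividuals)

# Total events are recovered as frequency times individuals.
total_events <- sum(freq$eventfrequency * freq$uniqueindividuals)

# Variance of the event count if every event were independent (Poisson).
var_independent <- total_events

# Variance under the assumption above: each individual count is scaled by
# its event frequency, so its variance is scaled by the frequency squared.
var_clustered <- sum(freq$eventfrequency^2 * freq$uniqueindividuals)

print(c(individuals = total_individuals, events = total_events,
        var_independent = var_independent, var_clustered = var_clustered))
```

Here 83 individuals account for 99 events, and the variance under the stated assumption (137) exceeds the 99 that independent events would give.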

Value

When type = "full", returns a tibble of total counts, total populations, directly standardised rates, lower confidence limits, upper confidence limits, confidence level, statistic and method for each grouping set. Use the type argument to limit the columns output.

Notes

For total counts >= 10, Byar's method (1) is applied using the internal byars_lower and byars_upper functions. When the total count is < 10, DSRs are not reliable and are therefore suppressed in the output.
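
As a rough illustration of the note above, the base-R sketch below computes Byar's limits for a single count, compares them with exact Poisson limits obtained from the chi-squared distribution, and applies a suppression rule. The names byar_limits and suppress_small are hypothetical stand-ins, not the package's internal byars_lower and byars_upper functions, and returning NA for suppressed values is an assumption about how suppression might be represented.

```r
# Hypothetical helper: Byar's approximate limits for a Poisson count o.
byar_limits <- function(o, conf = 0.95) {
  z  <- qnorm(1 - (1 - conf) / 2)
  o1 <- o + 1
  c(
    lower = o  * (1 - 1 / (9 * o)  - z / (3 * sqrt(o)))^3,
    upper = o1 * (1 - 1 / (9 * o1) + z / (3 * sqrt(o1)))^3
  )
}

# Exact Poisson limits via the chi-squared distribution, for comparison.
exact_limits <- function(o, conf = 0.95) {
  alpha <- 1 - conf
  c(
    lower = qchisq(alpha / 2, 2 * o) / 2,
    upper = qchisq(1 - alpha / 2, 2 * (o + 1)) / 2
  )
}

# Hypothetical suppression rule: blank out values based on small totals.
suppress_small <- function(value, total_count, threshold = 10) {
  ifelse(total_count < threshold, NA_real_, value)
}

byar  <- byar_limits(10)
exact <- exact_limits(10)
print(rbind(byar = byar, exact = exact))

# A rate built on 9 events is suppressed; one built on 10 events is kept.
print(suppress_small(c(123.4, 123.4), total_count = c(9, 10)))
```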

References

(1) Breslow NE, Day NE. Statistical methods in cancer research, volume II: The design and analysis of cohort studies. Lyon: International Agency for Research on Cancer, World Health Organisation; 1987.

(2) Dobson A et al. Confidence intervals for weighted sums of Poisson parameters. Stat Med 1991;10:457-62.

(3) See the DSR chapter of the Fingertips Public Health Technical Guidance

See Also

Other PHEindicatormethods package functions: assign_funnel_significance(), calculate_ISRate(), calculate_ISRatio(), calculate_funnel_limits(), calculate_funnel_points(), phe_dsr(), phe_life_expectancy(), phe_mean(), phe_proportion(), phe_quantile(), phe_rate(), phe_sii()

Examples

library(PHEindicatormethods)  # provides calculate_dsr() and the esp2013 vector
library(dplyr)
df <- data.frame(
  indicatorid = rep(c(1234, 5678, 91011, 121314), each = 19 * 2 * 5),
  year = rep(2006:2010, each = 19 * 2),
  sex = rep(rep(c("Male", "Female"), each = 19), 5),
  ageband = rep(c(0,5,10,15,20,25,30,35,40,45,
                  50,55,60,65,70,75,80,85,90), times = 10),
  obs = sample(200, 19 * 2 * 5 * 4, replace = TRUE),
  pop = sample(10000:20000, 19 * 2 * 5 * 4, replace = TRUE),
  esp2013 = rep(esp2013, 40)
)

## Example 1 - Default execution
df %>%
  group_by(indicatorid, year, sex) %>%
  calculate_dsr(obs, pop, stdpop = esp2013)

## Example 2 - Calculate both 95% and 99.8% CIs in single execution
df %>%
  group_by(indicatorid, year, sex) %>%
  calculate_dsr(obs, pop, stdpop = esp2013, confidence = c(0.95, 0.998))

## Example 3 - Drop metadata columns from the output
df %>%
  group_by(indicatorid, year, sex) %>%
  calculate_dsr(obs, pop, stdpop = esp2013, type = "standard")

## Example 4 - Calculate DSRs for non-independent events

library(tidyr)

# For non-independent events the input data frame must break down events into
# counts of unique individuals by event frequency. The code chunk below
# creates a dummy data frame in this required format. Note that assignment of
# 10%, 20% and 70% of events to each event frequency is purely to create a
# data frame in the required format whilst retaining the same total event and
# population distributions by group and age band as example 1 to allow
# comparison of the outputs.

df_freq <- df %>%
  mutate(
    f3 = floor((obs * 0.1)/3),         # 10% of events in individuals with 3 events
    f2 = floor((obs * 0.2)/2),         # 20% of events in individuals with 2 events
    f1 = (obs - (3 * f3) - (2 * f2))   # 70% of events in individuals with 1 event
  ) %>%
  select(!"obs") %>%
  pivot_longer(
    cols = c("f1", "f2", "f3"),
    names_to = "eventfrequency",
    values_to = "uniqueindividuals",
    names_prefix = "f"
  ) %>%
  mutate(eventfrequency = as.integer(eventfrequency))

# Calculate the DSRs - notice that output DSR values match those in
# example 1 but the confidence intervals are wider

df_freq %>%
  group_by(indicatorid, year, sex) %>%
  calculate_dsr(
    x = uniqueindividuals,
    n = pop,
    stdpop = esp2013,
    independent_events = FALSE,
    eventfreq = eventfrequency,
    ageband = ageband
  )



publichealthengland/PHEindicatormethods documentation built on Dec. 15, 2024, 3:18 p.m.