calculate_dsr | R Documentation |
Calculates directly standardised rates with confidence limits using Byar's method (1) with Dobson method adjustment (2) including option to further adjust confidence limits for non-independent events (3).
calculate_dsr(
  data,
  x,
  n,
  stdpop = NULL,
  type = "full",
  confidence = 0.95,
  multiplier = 1e+05,
  independent_events = TRUE,
  eventfreq = NULL,
  ageband = NULL
)
data | data frame containing the data to be standardised, pre-grouped if multiple DSRs are required; unquoted string; no default
x | field name from data containing the observed number of events for each standardisation category (eg ageband) within each grouping set (eg area); unquoted string; no default
n | field name from data containing the populations for each standardisation category (eg ageband) within each grouping set (eg area); unquoted string; no default
stdpop | field name from data containing the standard populations for each age band; unquoted string; no default
type | defines the data and metadata columns to be included in the output; can be "value", "lower", "upper", "standard" (for all data) or "full" (for all data and metadata); quoted string; default = "full"
confidence | the required level of confidence, expressed as a number between 0.9 and 1 or as a number between 90 and 100; can also be a vector, for example c(0.95, 0.998), to output both 95 percent and 99.8 percent CIs; numeric; default 0.95
multiplier | the multiplier used to express the final values (eg 100,000 = rate per 100,000); numeric; default 100,000
independent_events | whether events are independent. Set to TRUE for independent events. When set to FALSE, an adjustment is made to the confidence intervals. To do this, the dataset provided must include event frequency breakdowns, and column x is redefined as the number of unique individuals who experienced each frequency of event, rather than the total number of events.
eventfreq | field name from data containing the event frequencies. Only required when independent_events = FALSE; unquoted string; default NULL
ageband | field name from data containing the age bands for standardisation. Only required when independent_events = FALSE; unquoted string; default NULL
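To illustrate the two scales accepted by the confidence argument, the base R sketch below maps levels given as percentages onto proportions and derives the corresponding two-sided normal quantiles. The helper name normalise_confidence is hypothetical and is not part of PHEindicatormethods; how the package performs this step internally is not documented here.

```r
# Hypothetical helper (not part of PHEindicatormethods): put confidence
# levels supplied on the 90-100 scale onto the 0.9-1 scale.
normalise_confidence <- function(confidence) {
  conf <- ifelse(confidence >= 90, confidence / 100, confidence)
  stopifnot(all(conf >= 0.9 & conf <= 1))
  conf
}

# A mixed vector: 95 (percentage scale) and 0.998 (proportion scale)
conf <- normalise_confidence(c(95, 0.998))

# Two-sided standard normal quantile for each confidence level
z <- qnorm(1 - (1 - conf) / 2)
```

Each element of a confidence vector yields its own pair of limits, which is why passing c(0.95, 0.998) produces both sets of CIs in one call.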
When type = "full", returns a tibble of total counts, total populations, directly standardised rates, lower confidence limits, upper confidence limits, confidence level, statistic and method for each grouping set. Use the type argument to limit the columns output.
For total counts >= 10 Byar's method (1) is applied using the internal byars_lower and byars_upper functions. When the total count is < 10 DSRs are not reliable and will therefore be suppressed in the output.
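The calculation described above can be sketched in base R for a single grouping set with independent events. This is an illustration built from the textbook forms of Byar's approximation and the Dobson adjustment, not the package's internal code: the function names byars_limits and dsr_sketch, the use of NA to represent suppression, and the input numbers are all assumptions made for the example.

```r
# Byar's approximation to the Poisson confidence limits for a count x
# (textbook form; not the package's internal byars_lower/byars_upper code)
byars_limits <- function(x, confidence = 0.95) {
  z <- qnorm(1 - (1 - confidence) / 2)
  lower <- x * (1 - 1 / (9 * x) - z / (3 * sqrt(x)))^3
  upper <- (x + 1) * (1 - 1 / (9 * (x + 1)) + z / (3 * sqrt(x + 1)))^3
  c(lower = lower, upper = upper)
}

# Directly standardised rate with Dobson-adjusted limits for one group
dsr_sketch <- function(x, n, stdpop, confidence = 0.95, multiplier = 1e5) {
  total <- sum(x)
  # Assumption: suppression for total counts below 10 is shown here as NA
  if (total < 10) {
    return(c(value = NA_real_, lower = NA_real_, upper = NA_real_))
  }
  # Weighted average of the category rates, weights = standard population
  dsr <- sum(stdpop * x / n) / sum(stdpop)
  # Variance of the DSR assuming Poisson counts in each category
  var_dsr <- sum(stdpop^2 * x / n^2) / sum(stdpop)^2
  # Dobson: rescale the Byar's limits for the total count onto the DSR scale
  lim <- byars_limits(total, confidence)
  scale <- sqrt(var_dsr / total)
  c(
    value = dsr,
    lower = dsr + scale * (lim[["lower"]] - total),
    upper = dsr + scale * (lim[["upper"]] - total)
  ) * multiplier
}

# Made-up illustrative data for three age bands
x <- c(12, 30, 58)
n <- c(15000, 12000, 9000)
stdpop <- c(5000, 5500, 4000)
res <- dsr_sketch(x, n, stdpop)
```

The sketch does not cover the further adjustment for non-independent events (3).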
(1) Breslow NE, Day NE. Statistical methods in cancer research, volume II: The design and analysis of cohort studies. Lyon: International Agency for Research on Cancer, World Health Organisation; 1987.
(2) Dobson A et al. Confidence intervals for weighted sums of Poisson parameters. Stat Med 1991;10:457-62.
(3) See the DSR chapter of the Fingertips Public Health Technical Guidance.
Other PHEindicatormethods package functions: assign_funnel_significance(), calculate_ISRate(), calculate_ISRatio(), calculate_funnel_limits(), calculate_funnel_points(), phe_dsr(), phe_life_expectancy(), phe_mean(), phe_proportion(), phe_quantile(), phe_rate(), phe_sii()
library(PHEindicatormethods) # provides calculate_dsr() and the esp2013 vector
library(dplyr)

# Dummy data: 4 indicators x 5 years x 2 sexes x 19 age bands
df <- data.frame(
  indicatorid = rep(c(1234, 5678, 91011, 121314), each = 19 * 2 * 5),
  year = rep(2006:2010, each = 19 * 2),
  sex = rep(rep(c("Male", "Female"), each = 19), 5),
  ageband = rep(c(0, 5, 10, 15, 20, 25, 30, 35, 40, 45,
                  50, 55, 60, 65, 70, 75, 80, 85, 90), times = 10),
  obs = sample(200, 19 * 2 * 5 * 4, replace = TRUE),
  pop = sample(10000:20000, 19 * 2 * 5 * 4, replace = TRUE),
  esp2013 = rep(esp2013, 40)
)

## Example 1 - Default execution
df %>%
  group_by(indicatorid, year, sex) %>%
  calculate_dsr(obs, pop, stdpop = esp2013)

## Example 2 - Calculate both 95% and 99.8% CIs in single execution
df %>%
  group_by(indicatorid, year, sex) %>%
  calculate_dsr(obs, pop, stdpop = esp2013, confidence = c(0.95, 0.998))

## Example 3 - Drop metadata columns from the output
df %>%
  group_by(indicatorid, year, sex) %>%
  calculate_dsr(obs, pop, stdpop = esp2013, type = "standard")

## Example 4 - Calculate DSRs for non-independent events
library(tidyr)

# For non-independent events the input data frame must break down events into
# counts of unique individuals by event frequency. The code chunk below
# creates a dummy data frame in this required format. Note that assignment of
# 10%, 20% and 70% of events to each event frequency is purely to create a
# data frame in the required format whilst retaining the same total event and
# population distributions by group and age band as example 1 to allow
# comparison of the outputs.
df_freq <- df %>%
  mutate(
    f3 = floor((obs * 0.1) / 3),     # 10% of events in individuals with 3 events
    f2 = floor((obs * 0.2) / 2),     # 20% of events in individuals with 2 events
    f1 = (obs - (3 * f3) - (2 * f2)) # 70% of events in individuals with 1 event
  ) %>%
  select(!"obs") %>%
  pivot_longer(
    cols = c("f1", "f2", "f3"),
    names_to = "eventfrequency",
    values_to = "uniqueindividuals",
    names_prefix = "f"
  ) %>%
  mutate(eventfrequency = as.integer(eventfrequency))

# Calculate the DSRs - notice that the output DSR values match those in
# example 1 but the confidence intervals are wider
df_freq %>%
  group_by(indicatorid, year, sex) %>%
  calculate_dsr(
    x = uniqueindividuals,
    n = pop,
    stdpop = esp2013,
    independent_events = FALSE,
    eventfreq = eventfrequency,
    ageband = ageband
  )
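
The reason the DSR values in Example 4 match those in Example 1 is that the frequency breakdown preserves the total number of events in every row. The base R sketch below checks that arithmetic on a few made-up counts; it does not call calculate_dsr, and the variable names recovered and individuals are introduced only for this illustration.

```r
# Made-up event counts for four age bands
obs <- c(7, 50, 123, 200)

# Same split as Example 4: individuals with 3, 2 and 1 events
f3 <- floor((obs * 0.1) / 3)
f2 <- floor((obs * 0.2) / 2)
f1 <- obs - 3 * f3 - 2 * f2

# Total events recovered from the breakdown:
# sum over frequencies of (event frequency x unique individuals)
recovered <- 1 * f1 + 2 * f2 + 3 * f3
stopifnot(all(recovered == obs))

# The number of unique individuals never exceeds the number of events
individuals <- f1 + f2 + f3
stopifnot(all(individuals <= obs))
```

Because f1 is defined as the remainder after the f2 and f3 individuals are accounted for, the identity holds regardless of how the floor() calls round.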