knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
The goal of this package is to compute rates adjusted by a reference population or by a reference set of rates. This is a very common procedure in epidemiology: it allows the rates of an event (such as mortality) to be compared among groups that have different age distributions.
Some packages, such as epitools, already compute these adjusted rates. The functions in this package wrap the epitools functions in a tidy way, so that age-adjusted rates can be computed for several groups at once using key variables such as year and region.
devtools::install_github("rfsaldanha/tidyrates")
library(tidyrates)
Let's use the Fleiss dataset, cited by the epitools package (Fleiss, 1981, p. 249).
population <- c(230061, 329449, 114920, 39487, 14208, 3052,
                72202, 326701, 208667, 83228, 28466, 5375,
                15050, 175702, 207081, 117300, 45026, 8660,
                2293, 68800, 132424, 98301, 46075, 9834,
                327, 30666, 123419, 149919, 104088, 34392,
                319933, 931318, 786511, 488235, 237863, 61313)
population <- matrix(population, 6, 6,
  dimnames = list(c("Under 20", "20-24", "25-29", "30-34", "35-39", "40 and over"),
                  c("1", "2", "3", "4", "5+", "Total")))
count <- c(107, 141, 60, 40, 39, 25,
           25, 150, 110, 84, 82, 39,
           3, 71, 114, 103, 108, 75,
           1, 26, 64, 89, 137, 96,
           0, 8, 63, 112, 262, 295,
           136, 396, 411, 428, 628, 530)
count <- matrix(count, 6, 6,
  dimnames = list(c("Under 20", "20-24", "25-29", "30-34", "35-39", "40 and over"),
                  c("1", "2", "3", "4", "5+", "Total")))
population
count
The Fleiss data present events (the count object) and population (the population object) for six age groups across five groups (labelled 1 to 5+), plus a Total column.
The tidyrates package provides the same Fleiss data in a tidy form, as a tibble in long format.
fleiss_data
The key variable identifies the groups, age_group identifies the age groups, and name indicates whether each value is an event count or a population count.
You can use this same structure for your own data.
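If your own data starts out as matrices like population and count, a long table of this shape can be built with base R. The following is a minimal sketch, not the package's own code. The column name value and the labels "events" and "population" are assumptions, so compare them with fleiss_data before relying on them, and to_long is a hypothetical helper. Only groups 1 and 2 of the Fleiss data are used to keep the example short.

```r
age_groups <- c("Under 20", "20-24", "25-29", "30-34", "35-39", "40 and over")

# Groups "1" and "2" of the Fleiss data (values copied from the matrices above)
population <- matrix(
  c(230061, 329449, 114920, 39487, 14208, 3052,
    72202, 326701, 208667, 83228, 28466, 5375),
  nrow = 6, dimnames = list(age_groups, c("1", "2"))
)
count <- matrix(
  c(107, 141, 60, 40, 39, 25,
    25, 150, 110, 84, 82, 39),
  nrow = 6, dimnames = list(age_groups, c("1", "2"))
)

# Hypothetical helper: one row per (age group, key) cell of the matrix
to_long <- function(m, name) {
  df <- as.data.frame(as.table(m), stringsAsFactors = FALSE)
  names(df) <- c("age_group", "key", "value")  # "value" is an assumed column name
  df$name <- name                              # assumed labels: "events" / "population"
  df[, c("key", "age_group", "name", "value")]
}

long_data <- rbind(to_long(count, "events"), to_long(population, "population"))
head(long_data)
```

The result is a plain data frame; tibble::as_tibble(long_data) converts it to a tibble if the tibble package is installed.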
The Fleiss example uses the average population across the five groups as the standard (reference) population.
standard <- apply(population[, -6], 1, mean)
standard
With tidyrates, we must supply the standard population as a tibble with two variables: age group and population.
standard_pop <- tibble::tibble(
  age_group = c("Under 20", "20-24", "25-29", "30-34", "35-39", "40 and over"),
  population = c(63986.6, 186263.6, 157302.2, 97647.0, 47572.6, 12262.6)
)
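Instead of typing the averages by hand, the same table can be derived from the population matrix. This is a sketch under the assumption that a plain data frame with the columns age_group and population is an acceptable starting point. It uses only base R, and the result can be converted with tibble::as_tibble() if a tibble is required.

```r
age_groups <- c("Under 20", "20-24", "25-29", "30-34", "35-39", "40 and over")

# Fleiss population matrix: five groups plus a "Total" column
population <- matrix(
  c(230061, 329449, 114920, 39487, 14208, 3052,
    72202, 326701, 208667, 83228, 28466, 5375,
    15050, 175702, 207081, 117300, 45026, 8660,
    2293, 68800, 132424, 98301, 46075, 9834,
    327, 30666, 123419, 149919, 104088, 34392,
    319933, 931318, 786511, 488235, 237863, 61313),
  6, 6, dimnames = list(age_groups, c("1", "2", "3", "4", "5+", "Total"))
)

# Average population per age group, leaving out the "Total" column
standard_pop <- data.frame(
  age_group = rownames(population),
  population = unname(rowMeans(population[, -6])),
  stringsAsFactors = FALSE
)
standard_pop
```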
For the direct adjustment procedure, tidyrates provides the rate_adj_direct function. The .data argument must be a tibble with the events and population data, and the .std argument must be the standard population tibble. The .keys argument, if used, must name the grouping variables in the .data tibble.
The rate_adj_direct function computes the crude rate, the adjusted rate, and exact confidence intervals for each group.
rate_adj_direct(fleiss_data, .std = standard_pop, .keys = "key")
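To make the adjusted rate concrete, the point estimate for group 1 can be worked out by hand in base R. This sketch follows the textbook definition of direct standardization (age-specific rates weighted by each age group's share of the standard population). It is not taken from the package source, it does not compute confidence intervals, and all object names are illustrative.

```r
# Group "1" of the Fleiss data, by age group
events_g1 <- c(107, 141, 60, 40, 39, 25)
pop_g1    <- c(230061, 329449, 114920, 39487, 14208, 3052)

# Standard population (average across the five groups)
std_pop <- c(63986.6, 186263.6, 157302.2, 97647.0, 47572.6, 12262.6)

age_rates   <- events_g1 / pop_g1        # age-specific rates in group 1
std_weights <- std_pop / sum(std_pop)    # share of each age group in the standard

crude_rate <- sum(events_g1) / sum(pop_g1)
adj_rate   <- sum(std_weights * age_rates)

# Rates per 100,000 population
round(c(crude = crude_rate, adjusted = adj_rate) * 1e5, 1)
```

The adjusted rate comes out higher than the crude rate because group 1 is concentrated in the younger age groups, where the event rates are lowest, while the standard population gives more weight to the older age groups.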
Let's use the Selvin dataset, cited by the epitools package (Selvin, 2004).
dth40 <- c(45, 201, 320, 670, 1126, 3160, 9723, 17935, 22179, 13461, 2238)
pop40 <- c(906897, 3794573, 10003544, 10629526, 9465330, 8249558, 7294330, 5022499, 2920220, 1019504, 142532)
The tidyrates package provides the same 1940 data in a tidy form.
selvin_data_1940
dth60 <- c(141, 926, 1253, 1080, 1869, 4891, 14956, 30888, 41725, 26501, 5928)
pop60 <- c(1784033, 7065148, 15658730, 10482916, 9939972, 10563872, 9114202, 6850263, 4702482, 1874619, 330915)
The tidyrates package provides the same 1960 data in a tidy form.
selvin_data_1960
For the indirect adjustment procedure, tidyrates provides the rate_adj_indirect function. The .data argument must be a tibble with the events and population data, and the .std argument must also be a tibble with events and population data, this time for the standard population. The .keys argument, if used, must name the grouping variables in the .data tibble.
The rate_adj_indirect function computes the crude rate, the adjusted rate, and exact confidence intervals for each group.
rate_adj_indirect(selvin_data_1940, selvin_data_1960)
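The point estimate behind indirect adjustment can also be sketched by hand. The code below uses the textbook calculation: apply the 1960 age-specific rates to the 1940 population to get the expected number of events, take the ratio of observed to expected events, and scale the 1960 crude rate by that ratio. It is assumed, not confirmed from the package source, that rate_adj_indirect reports the same quantity. Confidence intervals are left out, and the object names are illustrative.

```r
# Selvin data: deaths and population by age group, 1940 (study) and 1960 (standard)
dth40 <- c(45, 201, 320, 670, 1126, 3160, 9723, 17935, 22179, 13461, 2238)
pop40 <- c(906897, 3794573, 10003544, 10629526, 9465330, 8249558,
           7294330, 5022499, 2920220, 1019504, 142532)
dth60 <- c(141, 926, 1253, 1080, 1869, 4891, 14956, 30888, 41725, 26501, 5928)
pop60 <- c(1784033, 7065148, 15658730, 10482916, 9939972, 10563872,
           9114202, 6850263, 4702482, 1874619, 330915)

std_rates <- dth60 / pop60             # age-specific rates in the standard
expected  <- sum(pop40 * std_rates)    # events expected in 1940 at 1960 rates
observed  <- sum(dth40)

ratio          <- observed / expected  # standardized ratio (observed / expected)
crude_rate     <- observed / sum(pop40)
crude_std_rate <- sum(dth60) / sum(pop60)
adj_rate       <- crude_std_rate * ratio

c(observed = observed, expected = expected, ratio = ratio)
c(crude = crude_rate, adjusted = adj_rate)
```

A ratio below 1 means fewer events were observed in 1940 than the 1960 age-specific rates would predict for the same population.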