death_113_count: Summarize NCHS 113 causes of deaths

View source: R/death_functions.R

death_113_count    R Documentation

Summarize NCHS 113 causes of deaths

Description

Generate death counts for the National Center for Health Statistics (NCHS) 113 Selected Causes of Death (COD). Needs line-level death data with a properly formatted ICD10 column.

In addition to the causes of death you specify with causeids or cause, the function automatically returns total deaths and COVID-19 deaths (COVID-19 does not have its own NCHS category).

See rads::death_113() for a complete list of available causeid and cause values.

Usage

death_113_count(
  ph.data,
  causeids = seq(1, 113, 1),
  cause = NULL,
  icdcol = "underlying_cod_code",
  kingco = TRUE,
  group_by = NULL,
  ypll_age = NULL,
  death_age_col = NULL
)

Arguments

ph.data

a data.table or data.frame. Must contain death data structured with one person per row and with at least one column of ICD10 death codes.

causeids

an integer vector of length 1 to 114, with values between 1 and 114 (inclusive).

The default is 1:113, i.e., the standard panel of WA DOH / NCHS 113 causes of death.

cause

an OPTIONAL character vector specifying the complete or partial keyword for the cause of death of interest. It is not case sensitive and you can specify it in two ways: 1) cause = c('viral', 'cough') or 2) cause = c("viral|cough"). If you specify any keyword(s), the function will ignore the causeids argument.

The default is NULL, i.e., the function will rely on the causeids argument to identify the causes of death.
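
The two ways of writing cause are equivalent if the keywords are treated as one case-insensitive regular expression with alternatives joined by "|". The base R sketch below illustrates that idea only. The labels vector is a small illustrative stand-in for the table from rads::death_113(), and the matching logic is an assumption about the behavior, not the package's actual code.

```r
# Illustrative cause labels; the real ones come from rads::death_113()
labels <- c("Viral hepatitis", "Whooping cough", "Measles", "Other VIRAL diseases")

# Form 1: a vector of keywords, collapsed into a single pattern
pattern1 <- paste(c("viral", "cough"), collapse = "|")

# Form 2: the keywords already joined with "|"
pattern2 <- "viral|cough"

# Case-insensitive partial matching (assumed behavior)
match1 <- labels[grepl(pattern1, labels, ignore.case = TRUE)]
match2 <- labels[grepl(pattern2, labels, ignore.case = TRUE)]

print(match1)
```

Both forms select the same labels here, and "Measles" is excluded because it contains neither keyword.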

icdcol

a character vector of length one that specifies the name of the column in ph.data that contains the ICD10 death codes of interest.

The default is underlying_cod_code, which is found in the properly formatted death data obtained using the get_data_death() function.

kingco

a logical vector of length one. It specifies whether you want to limit the analysis to King County.

NOTE: This only works with data imported with the get_data_death() function because it needs the logical variable chi_geo_kc.

The default is kingco = TRUE.
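
Conceptually, kingco = TRUE amounts to keeping only the rows flagged by chi_geo_kc. The sketch below shows that kind of subsetting on a toy data.frame in base R. How the function itself treats missing flags is not documented here, so dropping NA values is an assumption made for illustration.

```r
# Toy data; chi_geo_kc is the logical King County flag named above
toy <- data.frame(
  underlying_cod_code = c("J10", "J15", "G00", "E44"),
  chi_geo_kc = c(TRUE, FALSE, TRUE, NA)
)

# A guard like this shows why kingco = TRUE needs the column to exist
has_flag <- "chi_geo_kc" %in% names(toy)

# Keep only rows flagged as King County; %in% also drops NA flags (assumption)
kc_only <- toy[toy$chi_geo_kc %in% TRUE, ]

print(kc_only)
```

For data that lacks chi_geo_kc, such as the synthetic data in the Examples, set kingco = FALSE.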

group_by

a character vector of indeterminate length. This is used to specify all the variables by which you want to group (a.k.a. stratify) the results. For example, if you specified group_by = c('chi_sex', 'chi_race_6'), the results would be stratified by each combination of sex and race.

The default is group_by = NULL, i.e., the results are not stratified.
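
To make "each combination" concrete, the base R sketch below counts deaths for every observed combination of two grouping columns. The data values are hypothetical and aggregate() is used only for illustration; it is not a claim about how the function computes its strata.

```r
# Toy line-level data: one decedent per row, with hypothetical values
toy <- data.frame(
  chi_sex = c("Female", "Male", "Female", "Female", "Male"),
  chi_race_6 = c("Asian", "Asian", "White", "Asian", "White"),
  stringsAsFactors = FALSE
)

# Each row represents one death
toy$deaths <- 1L

# Sum deaths within each observed combination of sex and race
strata <- aggregate(deaths ~ chi_sex + chi_race_6, data = toy, FUN = sum)

print(strata)
```

Each row of strata is one sex-by-race combination, and the stratum counts add up to the number of rows in the input.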

ypll_age

an optional numeric vector of length 1. When specified, it should be the age (an integer) used for Years of Potential Life Lost (YPLL) calculations. Valid values are between 1 & 99 (inclusive), though 65 and 85 are the most common. For example, ypll_age = 65 would sum the total number of years that could have been lived had everyone in the data lived to at least 65. Note that this function returns the total number of YPLL. Additional processing is necessary to calculate rates per 100,000.

The default is ypll_age = NULL, which will skip YPLL calculations.
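
As a worked illustration of the arithmetic, the sketch below computes total YPLL for four hypothetical decedents and then converts it to a rate per 100,000. It assumes the usual definition in which deaths at or above ypll_age contribute zero years. The population denominator is made up, and choosing an appropriate denominator is part of the additional processing left to the user.

```r
ypll_age <- 65

# Hypothetical ages at death, in years
age_at_death <- c(40, 64, 65, 80)

# Years lost per decedent: the shortfall below ypll_age, never negative
years_lost <- pmax(ypll_age - age_at_death, 0)

# Total YPLL: 25 + 1 + 0 + 0 = 26
ypll_total <- sum(years_lost)

# Additional processing for a rate, using a hypothetical population of 50,000
population <- 50000
ypll_rate <- ypll_total * 100000 / population  # 52 years lost per 100,000

print(c(total = ypll_total, rate_per_100k = ypll_rate))
```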

death_age_col

an optional character vector of length one that specifies the name of the column in ph.data with the decedents' age at death in years. It is only needed if ypll_age is specified AND if ph.data lacks a column named chi_age.

The default is death_age_col = NULL.
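
The rule for choosing the age column can be sketched as a small helper. resolve_age_col() is a hypothetical function written for this illustration and is not part of rads. Giving an explicitly supplied death_age_col precedence over chi_age is an assumption.

```r
# Hypothetical helper: decide which column holds age at death
resolve_age_col <- function(ph.data, death_age_col = NULL) {
  if (!is.null(death_age_col)) {
    # An explicit column name must exist in the data
    if (!death_age_col %in% names(ph.data)) {
      stop("column '", death_age_col, "' not found in ph.data")
    }
    return(death_age_col)
  }
  # Otherwise fall back to chi_age when it is present
  if ("chi_age" %in% names(ph.data)) {
    return("chi_age")
  }
  stop("ph.data has no chi_age column; supply death_age_col")
}

with_chi <- data.frame(chi_age = c(70, 81))
custom <- data.frame(ageofdeath = c(70, 81))

print(resolve_age_col(with_chi))              # falls back to "chi_age"
print(resolve_age_col(custom, "ageofdeath"))  # uses the supplied name
```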

Details

The list of causes actually has 114 rows: causeid 114 is the official CDC version of causeid 95, "All other diseases (Residual)". Causeid 95 was intentionally changed to match the definition used by WA DOH. You can get results for any or all of the 113(+1) causes of death using the causeids or cause arguments.

Value

Generates a table with three columns: causeid, cause.of.death, and deaths. If ypll_age is specified, a ypll_## column (e.g., ypll_65 when ypll_age = 65) is also added. Any columns named in the group_by argument are returned as well.

By default, it will return all 113 causes of death. You can specify which causes of death you want to assess using the causeids or cause arguments.

Note

Calls upon rads::death_xxx_count.

References

https://www.cdc.gov/nchs/data/dvs/Part9InstructionManual2020-508.pdf

https://secureaccess.wa.gov/doh/chat/Content/FilesForDownload/CodeSetDefinitions/NCHS113CausesOfDeath.pdf

Examples

library(rads) # the examples assume the rads package is installed

# example 1: death count only
set.seed(98104)
deathdata <- data.table::data.table(
  cod.icd10 = c(rep("A85.2", round(runif(1, 30, 100000), 0)),
                rep("B51", round(runif(1, 30, 100000), 0)),
                rep("U071", round(runif(1, 30, 100000), 0)),
                rep("E44", round(runif(1, 30, 100000), 0)),
                rep("E62", round(runif(1, 30, 100000), 0)),
                rep("G00", round(runif(1, 30, 100000), 0)),
                rep("J10", round(runif(1, 30, 100000), 0)),
                rep("J15", round(runif(1, 30, 100000), 0)),
                rep("V874", round(runif(1, 30, 100000), 0)))
)
eg1 <- death_113_count(ph.data = deathdata,
                       causeids = seq(1, 113, 1),
                       cause = NULL,
                       icdcol = "cod.icd10",
                       kingco = FALSE,
                       ypll_age = NULL,
                       death_age_col = NULL)
head(eg1)

# example 2: with YPLL calculation
deathdata2 <- data.table::copy(deathdata)
set.seed(98104)
deathdata2[, ageofdeath := rads::round2(rnorm(1, mean = 70, sd = 5 ), 0),
           1:nrow(deathdata2)] # synthetic age of death
eg2 <- death_113_count(ph.data = deathdata2,
                       causeids = seq(1, 113, 1),
                       cause = NULL,
                       icdcol = "cod.icd10",
                       kingco = FALSE,
                       ypll_age = 65,
                       death_age_col = "ageofdeath")
head(eg2)


PHSKC-APDE/rads documentation built on April 14, 2025, 10:47 a.m.