agg_lt: Aggregate life table(s) to less granular age groups
In ihmeuw-demographics/demCore: Core Functions for Demography

Description

Aggregate life table(s) to less granular age groups using standard life table aggregation functions of qx (and ax).

Usage

agg_lt(dt, id_cols, age_mapping, quiet = F, ...)

Arguments

`dt`	[`data.table()`] Life table to be aggregated. Must include all columns in `id_cols`, and at least two of 'qx', 'ax', and 'mx', or just 'qx'.
`id_cols`	[`character()`] ID columns that uniquely identify each row of `dt`.
`age_mapping`	[`data.table()`] Specification of intervals to aggregate to. Required columns are 'age_start' and 'age_end'. Use "Inf" as 'age_end' for terminal age group. The age group intervals must be contiguous and cover the entire interval specified in the input life tables `dt`.
`quiet`	[`logical(1)`] Should progress messages be suppressed while the function runs? Default is `FALSE`.
`...`	Other arguments to pass to `hierarchyUtils::agg()`.
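
The contiguity requirement on `age_mapping` can be checked before calling the function. The following is a minimal sketch in base R, not part of demCore: the helper name `is_contiguous` is hypothetical, and a plain `data.frame` is used only to keep the sketch self-contained (the function itself expects a `data.table()`).

```r
# Hypothetical helper (not a demCore function): TRUE when every interval
# starts exactly where the previous one ends, i.e. no gaps and no overlaps.
is_contiguous <- function(age_mapping) {
  m <- age_mapping[order(age_mapping$age_start), ]
  n <- nrow(m)
  all(m$age_start[-1] == m$age_end[-n])
}

# Five-year groups up to age 95, then a terminal group that ends at Inf
age_mapping <- data.frame(
  age_start = seq(0, 95, 5),
  age_end = c(seq(5, 95, 5), Inf)
)
is_contiguous(age_mapping)

# A mapping with a gap between ages 5 and 10 fails the check
gap_mapping <- data.frame(age_start = c(0, 10), age_end = c(5, 15))
is_contiguous(gap_mapping)
```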

Details

See the references page for the formatted equations below.

This function works by aggregating the qx and ax life table parameters separately. If only qx is included in dt then ax aggregation is not done.

qx aggregation:

To explain how qx is aggregated it is useful to define a couple of different events:

D = \text{death between age } x \text{ and } x + n

D' = \text{survival between age } x \text{ and } x + n

S = \text{survival to age } x

Now qx and px can be written in terms of the events D, D' and S.

{}_{n}q_x = P(D | S)

{}_{n}p_x = P(D' | S)

Now say there are multiple sub age-groups that make up the overall age group between x \text{ and } x + n. Let subscripts "1" and "2" indicate values specific to the first and second sub age-groups and assume values with no subscript apply to the original aggregate age group. The first sub age-group could be between x \text{ and } x + n_1 and the second between x + n_1 \text{ and } x + n_1 + n_2, where n_1 + n_2 = n.

The overall px value can be written as a function of the sub age-group's px values.

P(D'|S) = P(D'_1 \cap D'_2 | S_1) = P(D'_1|S_1) \cdot P(D'_2|S_2)

where:

P(D'_1 | S_1) = \text{survival between age } x \text{ and } x + n_1 \text{ given survival to age } x

P(D'_2 | S_2) = \text{survival between age } x + n_1 \text{ and } x + n \text{ given survival to age } x + n_1

More generally if there are A age groups between age x \text{ and } x + n, and i indexes each of the sub age intervals then:

{}_{n}p_x = \prod_{i=1}^{A} {}_{n_i}p_{x_i} = \prod_{i=1}^{A}(1 - {}_{n_i}q_{x_i})

{}_{n}q_x = 1 - {}_{n}p_x
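
The qx formula above can be sketched in a few lines of base R. This is an illustration of the arithmetic only, not the package's implementation; the helper name `agg_qx` and the input values are assumptions chosen for the example.

```r
# Hypothetical helper (not a demCore function): aggregate qx across
# consecutive sub age intervals by multiplying survival probabilities.
agg_qx <- function(qx) {
  px <- 1 - qx   # survival probability in each sub interval
  1 - prod(px)   # probability of dying somewhere in the aggregate interval
}

# Five single-year intervals (ages 0 to 5), each with qx = 0.2
qx_single <- rep(0.2, 5)
q_5_0 <- agg_qx(qx_single)   # same as 1 - 0.8^5
```

Because the product runs over survival probabilities, the aggregate qx is always smaller than the simple sum of the sub-interval qx values.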

ax aggregation:

{}_{n}a_x is aggregated across age groups by aggregating the number of person-years lived in each age group by those who died in the interval.

{}_{n}a_x \cdot {}_{n}d_x = \text{person-years lived between age } x \text{ and } x + n \text{ by those who died in this age interval}

where:

{}_{n}a_x = \text{average years lived between age } x \text{ and } x + n \text{ by those who died in the age interval}

{}_{n}d_x = \text{number that died between age } x \text{ and } x + n

Now say there are A age groups between age x \text{ and } x + n, and i indexes each of the sub age intervals. The total number of person-years lived by those who died in the aggregate age group is a simple sum of the number of person-years lived in each sub age interval.

{}_{n}a_x \cdot {}_{n}d_x = \sum_{i = 1}^{A} ((x_i - x) + {}_{n_i}a_{x_i}) \cdot {}_{n_i}d_{x_i}

where:

x_i - x = \text{ number of complete person years lived in the previous sub age intervals by someone who dies in sub age interval } i

The aggregate ax can then be solved for.

{}_{n}a_x = \frac{\sum_{i = 1}^{A} ((x_i - x) + {}_{n_i}a_{x_i}) \cdot {}_{n_i}d_{x_i}}{\sum_{i = 1}^{A} {}_{n_i}d_{x_i}}
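
A minimal base R sketch of the ax formula follows. It is not the package's implementation: the helper name `agg_ax` is hypothetical, and dx is derived here from qx with a radix of 1 using the standard life table relations (lx at the next age equals lx times px, and dx equals lx times qx), which is an assumption of this sketch.

```r
# Hypothetical helper (not a demCore function): aggregate ax across
# consecutive sub age intervals that start at ages `x`.
agg_ax <- function(x, qx, ax) {
  # lx: proportion surviving to the start of each sub interval (radix 1)
  lx <- cumprod(c(1, 1 - qx))[seq_along(qx)]
  # dx: proportion dying within each sub interval
  dx <- lx * qx
  # person-years lived in the aggregate interval by those who died,
  # divided by the total number who died in the aggregate interval
  sum(((x - x[1]) + ax) * dx) / sum(dx)
}

# Five single-year intervals starting at ages 0 to 4, qx = 0.2 and ax = 0.5
x <- 0:4
a_5_0 <- agg_ax(x, qx = rep(0.2, 5), ax = rep(0.5, 5))
```

With a constant qx, more deaths occur in the earlier sub intervals, so the aggregate ax falls below the midpoint of the five-year interval.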

Value

[data.table()]
Aggregated life table(s) with columns for all id_cols. A column for 'qx' is always included; a column for 'ax' is also returned if at least two of 'qx', 'ax', and 'mx' are included in the input dt. Only the age groups specified in age_mapping are returned.

Severity Arguments

missing_dt_severity:

Check for missing levels of col_stem, the variable being aggregated or scaled over.

stop: throw error (this is the default).
warning or message: throw warning/message and continue with aggregation/scaling for requested aggregations/scalings where expected input data in dt is available.
none: don't throw error or warning, continue with aggregation/scaling for requested aggregations/scalings where expected input data in dt is available.
skip: skip this check and continue with aggregation/scaling.

present_agg_severity (agg only):

Check for requested aggregates in mapping that are already present.

stop: throw error (this is the default).
warning or message: throw warning/message, drop aggregates and continue with aggregation.
none: don't throw error or warning, drop aggregates and continue with aggregation.
skip: skip this check and add to the values already present for the aggregates.

na_value_severity:

Check for 'NA' values in the value_cols.

stop: throw error (this is the default).
warning or message: throw warning/message, drop missing values and continue with aggregation/scaling where possible (this will likely cause another error because of missing_dt_severity; consider setting missing_dt_severity = "skip" for functionality similar to na.rm = TRUE).
none: don't throw error or warning, drop missing values and continue with aggregation/scaling where possible (this will likely cause another error because of missing_dt_severity; consider setting missing_dt_severity = "skip" for functionality similar to na.rm = TRUE).
skip: skip this check and propagate NA values through aggregation/scaling.

overlapping_dt_severity:

Check for overlapping intervals that prevent collapsing to the most detailed common set of intervals, or for overlapping intervals in col_stem when aggregating/scaling.

stop: throw error (this is the default).
warning or message: throw warning/message, drop overlapping intervals and continue with aggregation/scaling where possible (this may cause another error because of missing_dt_severity).
none: don't throw error or warning, drop overlapping intervals and continue with aggregation/scaling where possible (this may cause another error because of missing_dt_severity).
skip: skip this check and continue with aggregation/scaling.

Examples

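library(demCore)

# Single-year life table for ages 0 to 110+, with constant qx and ax
dt <- data.table::data.table(
  age_start = c(0:110),
  age_end = c(1:110, Inf),
  location = "Canada",
  qx = c(rep(.2, 110), 1), # qx is 1 in the terminal age group
  ax = .5
)
id_cols <- c("age_start", "age_end", "location")

# Aggregate to five-year age groups between ages 0 and 110
dt <- agg_lt(
  dt = dt,
  id_cols = id_cols,
  age_mapping = data.table::data.table(
    age_start = seq(0, 105, 5),
    age_end = seq(5, 110, 5)
  )
)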

ihmeuw-demographics/demCore documentation built on Feb. 24, 2024, 11:05 p.m.