helper_common_intervals: Helper functions for collapsing to the most detailed common...

identify_common_intervals    R Documentation

Helper functions for collapsing to the most detailed common intervals

Description

identify_common_intervals() identifies the most detailed common set of intervals for a given interval variable, and merge_common_intervals() merges these intervals onto the original dataset. collapse_common_intervals() calls both functions internally to collapse to the most detailed common intervals.
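The idea of a "most detailed common set of intervals" can be illustrated with a small base R sketch. This is a conceptual illustration only, not the hierarchyUtils implementation: it assumes left-closed, right-open intervals, assumes every group covers the same overall range, and uses hypothetical names (breaks_by_group, common_breaks, common_intervals_sketch).

```r
# Conceptual sketch only; not the hierarchyUtils implementation.
# Each group is described by the breakpoints of its intervals.
breaks_by_group <- list(
  male   = c(2005, 2010),  # one interval: [2005, 2010)
  female = 2005:2010       # five intervals: [2005, 2006), ..., [2009, 2010)
)

# Breakpoints shared by every group bound the most detailed intervals
# that all groups can be aggregated to.
common_breaks <- sort(Reduce(intersect, breaks_by_group))

# Consecutive shared breakpoints form the common intervals.
common_intervals_sketch <- data.frame(
  year_start = head(common_breaks, -1),
  year_end   = tail(common_breaks, -1)
)
print(common_intervals_sketch)
```

Here the single-year female intervals and the five-year male interval share only the breakpoints 2005 and 2010, so the sketch yields one common interval spanning 2005 to 2010.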

Usage

identify_common_intervals(dt, id_cols, col_stem, include_missing = FALSE)

merge_common_intervals(dt, common_intervals, col_stem)

Arguments

dt

[data.table()]
Dataset containing the interval variable.

id_cols

[character()]
ID columns that uniquely identify each row of dt. If NULL, common intervals are identified across the entire dataset.

col_stem

[character(1)]
Name of the interval variable to collapse, without the '_start' or '_end' suffix.

include_missing

[logical(1)]
Whether to include missing intervals in the identified most detailed common intervals. Missing intervals are those not present in all combinations of id_cols. Default is FALSE.

common_intervals

[data.table()]
Common intervals returned by identify_common_intervals().

Value

identify_common_intervals() returns a [data.table()] with two columns named after col_stem with the '_start' and '_end' suffixes (for example, 'year_start' and 'year_end' when col_stem is "year"), defining the most detailed common set of intervals for the col_stem interval variable.

merge_common_intervals() returns a [data.table()] with the same columns and rows as originally in dt, plus two additional columns merged on from common_intervals. These new columns are called 'common_start' and 'common_end' and define the most detailed common interval each row maps to.
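The mapping from each original row to its common interval can be sketched in base R with findInterval(). This is an illustrative sketch under assumptions, not how merge_common_intervals() is implemented: map_to_common() is a hypothetical helper, intervals are assumed to be left-closed and right-open, and common_start is assumed to be sorted in increasing order.

```r
# Hypothetical helper for illustration; not part of hierarchyUtils.
map_to_common <- function(start, common_start, common_end) {
  # Index of the common interval whose start is the largest value <= start.
  idx <- findInterval(start, common_start)
  # Starts that fall before the first common interval get NA instead of index 0.
  idx[idx == 0] <- NA
  data.frame(
    start        = start,
    common_start = common_start[idx],
    common_end   = common_end[idx]
  )
}

mapped <- map_to_common(
  start        = c(2000, 2003, 2005, 2007),
  common_start = c(2000, 2005),
  common_end   = c(2005, 2010)
)
print(mapped)
```

Each original start is paired with the 'common_start' and 'common_end' of the interval that contains it, which mirrors the two columns described above.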

Examples

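id_cols <- c("year_start", "year_end", "sex", "age_start", "age_end")

# set up test input data.table
# males: one five-year interval (2005 to 2010) with five-year age groups
input_dt_male <- data.table::CJ(year_start = 2005, year_end = 2010,
                                sex = "male",
                                age_start = seq(0, 95, 5),
                                value = 25)
input_dt_male[age_start == 95, value := 5]
# females: single-year intervals with single-year age groups
input_dt_female <- data.table::CJ(year_start = 2005:2009,
                                  sex = "female",
                                  age_start = seq(0, 95, 1),
                                  value = 1)
# add 'year_end' to the female data, closing the last interval at 2010
hierarchyUtils::gen_end(input_dt_female,
                        setdiff(id_cols, c("year_end", "age_end")),
                        col_stem = "year", right_most_endpoint = 2010)
# bind by column name so that column order differences do not matter
input_dt <- rbind(input_dt_male, input_dt_female, use.names = TRUE)
# add 'age_end' to the combined data
hierarchyUtils::gen_end(input_dt, setdiff(id_cols, "age_end"),
                        col_stem = "age")
data.table::setkeyv(input_dt, id_cols)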

common_intervals <- hierarchyUtils:::identify_common_intervals(
  dt = input_dt,
  id_cols = id_cols,
  col_stem = "year"
)
# rename to the 'common_start' and 'common_end' columns that are merged onto dt
data.table::setnames(common_intervals, c("year_start", "year_end"),
                     c("common_start", "common_end"))

result_dt <- hierarchyUtils:::merge_common_intervals(
  dt = input_dt,
  common_intervals = common_intervals,
  col_stem = "year"
)


ihmeuw-demographics/hierarchyUtils documentation built on June 20, 2024, 7:18 a.m.