overlapping_intervals_dt: Check if the interval column in a data.table has overlapping...

assert_no_overlapping_intervals_dt    R Documentation

Check if the interval column in a data.table has overlapping intervals

Description

Checks to see if the specified interval variable contains overlapping intervals.

Usage

assert_no_overlapping_intervals_dt(
  dt,
  id_cols,
  col_stem,
  identify_all_possible = FALSE,
  quiet = FALSE
)

identify_overlapping_intervals_dt(
  dt,
  id_cols,
  col_stem,
  identify_all_possible = FALSE,
  quiet = FALSE
)

Arguments

dt

[data.table()]
Data containing the interval variable to check. Should include all 'id_cols'.

id_cols

[character()]
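ID columns that uniquely identify each row of dt. Must include the two interval columns named by col_stem with the '_start' and '_end' suffixes (for example, 'age_start' and 'age_end' when col_stem is 'age').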

col_stem

[character(1)]
The name of the interval variable to check. It should not include the '_start' or '_end' suffix.

identify_all_possible

[logical(1)]
Whether to return all overlapping intervals ('TRUE') or try to identify just the less granular interval ('FALSE'). Default is 'FALSE'. Setting this to 'TRUE' is useful when it is not clear which interval is the less granular one.

quiet

[logical(1)]
Should progress messages be suppressed while the function runs? Default is 'FALSE'.

Details

identify_overlapping_intervals_dt works by first identifying each unique set of intervals in dt. It then checks, one set at a time, the groups of rows of dt that match each set of intervals.
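
To make the per-set check concrete, here is a minimal base R sketch of one way to flag overlaps within a single set of intervals. It is not the package's implementation: find_overlapping_sketch is a hypothetical helper, and it assumes half-open intervals [start, end), so intervals that merely touch are not treated as overlapping. The data mirror the Examples section.

```r
# Hypothetical helper, not part of hierarchyUtils.
# Assumes half-open intervals [start, end): an interval that ends exactly
# where another starts does not overlap it.
find_overlapping_sketch <- function(start, end) {
  stopifnot(length(start) == length(end), all(start < end))
  # starts_before_end[i, j] is TRUE when interval i starts before interval j ends
  starts_before_end <- outer(start, end, "<")
  # Intervals i and j overlap when each one starts before the other ends
  overlaps <- starts_before_end & t(starts_before_end)
  diag(overlaps) <- FALSE  # do not compare an interval with itself
  # Positions of intervals that overlap at least one other interval
  which(rowSums(overlaps) > 0)
}

# One set of intervals: five-year age groups plus an extra 15-60 group
age_start <- c(seq(0, 95, 5), 15)
age_end <- c(seq(5, 95, 5), Inf, 60)

flagged <- find_overlapping_sketch(age_start, age_end)
flagged_dt <- data.frame(
  age_start = age_start[flagged],
  age_end = age_end[flagged]
)
print(flagged_dt)
```

In this sketch the 15-60 interval and every five-year interval from 15-20 through 55-60 are flagged, because each of them overlaps at least one other row. The pairwise comparison is quadratic in the number of intervals, which is acceptable for a small set like this one.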

Value

identify_overlapping_intervals_dt returns a [data.table()] containing the id_cols of the rows whose intervals overlap. If no intervals overlap, a zero-row [data.table()] is returned. assert_no_overlapping_intervals_dt returns nothing but throws an error if identify_overlapping_intervals_dt returns a non-empty data.table.
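
The relationship between the two functions (an empty table means success, a non-empty table becomes an error) can be sketched in base R as follows. assert_empty_sketch is a hypothetical stand-in, not the package function; it takes an already-computed table of overlapping rows, and its error message is made up for illustration. The tryCatch() pattern at the end is one way a caller might handle the error instead of letting it stop the script.

```r
# Hypothetical stand-in for the assert pattern, not part of hierarchyUtils.
# `overlapping` plays the role of the table returned by the identify step.
assert_empty_sketch <- function(overlapping) {
  if (nrow(overlapping) > 0) {
    stop(
      "Found ", nrow(overlapping), " row(s) with overlapping intervals",
      call. = FALSE
    )
  }
  invisible(NULL)  # nothing to return when the check passes
}

no_overlaps <- data.frame(age_start = numeric(0), age_end = numeric(0))
some_overlaps <- data.frame(age_start = 15, age_end = 60)

# A zero-row table passes silently
assert_empty_sketch(no_overlaps)

# A non-empty table raises an error, which the caller can catch
msg <- tryCatch(
  assert_empty_sketch(some_overlaps),
  error = function(e) conditionMessage(e)
)
print(msg)
```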

Examples

input_dt <- data.table::data.table(
  age_start = seq(0, 95, 5),
  age_end = c(seq(5, 95, 5), Inf)
)
input_dt <- rbind(input_dt, data.table::data.table(age_start = c(15), age_end = c(60)))

# identify everything that is overlapping
overlapping_dt <- identify_overlapping_intervals_dt(
  dt = input_dt,
  id_cols = c("age_start", "age_end"),
  col_stem = "age",
  identify_all_possible = TRUE
)

# identify only the largest overlapping intervals
overlapping_dt <- identify_overlapping_intervals_dt(
  dt = input_dt,
  id_cols = c("age_start", "age_end"),
  col_stem = "age",
  identify_all_possible = FALSE
)


ihmeuw-demographics/hierarchyUtils documentation built on June 20, 2024, 7:18 a.m.