aggregate_from_children_to_parents: Aggregate from children to parents using a hierarchy

View source: R/aggregate_parents_from_children.R

aggregate_from_children_to_parents    R Documentation

Aggregate from children to parents using a hierarchy

Description

Aggregate iteratively from leaf nodes up through a hierarchy to the top level. The hierarchy is assumed to be MECE (mutually exclusive, collectively exhaustive). All child and parent values are retained, e.g. for a location hierarchy, all location_ids are kept and values are aggregated up to the top specified parent level. The function works iteratively: it aggregates all children of each parent, then aggregates those parents up to the next level, and so on. Aggregation will stop at each level if aggregates are not square. If a parent location already exists in the data, the function checks all.equal() between the parent and its aggregated children, messages on a mismatch if v_verbose = TRUE, and throws an error if aa_hard_stop = TRUE.
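The level-by-level idea described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the toy ids, the comma-separated format of path_to_top_parent, and the helper column parent_id are all hypothetical, and no squareness or all.equal() checks are performed.

```r
# Toy hierarchy (hypothetical ids): 1 = root (level 0), 10 and 20 = level 1,
# 101, 102 and 201 = most-detailed leaves (level 2).
hierarchy <- data.frame(
  location_id        = c(1L, 10L, 20L, 101L, 102L, 201L),
  path_to_top_parent = c("1", "1,10", "1,20", "1,10,101", "1,10,102", "1,20,201"),
  level              = c(0L, 1L, 1L, 2L, 2L, 2L),
  most_detailed      = c(0L, 0L, 0L, 1L, 1L, 1L),
  stringsAsFactors   = FALSE
)

# Assumption: the immediate parent is the second-to-last id on the path
# (NA for the root). parent_id is a helper column made up for this sketch.
path_list <- strsplit(hierarchy$path_to_top_parent, ",")
hierarchy$parent_id <- vapply(
  path_list,
  function(p) if (length(p) > 1L) as.integer(p[length(p) - 1L]) else NA_integer_,
  integer(1)
)

# Leaf-level data: one value per leaf per year.
dat <- data.frame(
  location_id = rep(c(101L, 102L, 201L), each = 2),
  year_id     = rep(c(2000L, 2001L), times = 3),
  mean        = c(1, 2, 3, 4, 5, 6)
)

stop_level <- 1L  # aggregate up to level 1, but no further

# Work upward one level at a time, starting from the deepest level.
for (child_level in rev(sort(unique(hierarchy$level)))) {
  if (child_level == stop_level) break
  children   <- hierarchy[hierarchy$level == child_level, c("location_id", "parent_id")]
  child_dat  <- merge(dat, children, by = "location_id")
  # Simple unweighted sum of children within each parent and year.
  parent_dat <- aggregate(mean ~ parent_id + year_id, data = child_dat, FUN = sum)
  names(parent_dat)[names(parent_dat) == "parent_id"] <- "location_id"
  # Keep all children and append parents that are not already in the data.
  new_parents <- parent_dat[!parent_dat$location_id %in% dat$location_id, ]
  dat <- rbind(dat, new_parents[, names(dat)])
}
```

After the loop, dat holds the six leaf rows plus four level 1 parent rows. The root (location_id 1) is never computed, because the loop stops when the child level equals stop_level.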

Usage

aggregate_from_children_to_parents(
  DT,
  varnames_to_aggregate,
  varnames_to_aggregate_by,
  varname_weights = NULL,
  hierarchy,
  hierarchy_id = "location_id",
  stop_level = 3L,
  require_square = TRUE,
  require_rows = TRUE,
  verbose = TRUE,
  v_verbose = FALSE,
  tolerance_all_equal = NULL,
  aa_hard_stop = FALSE
)

Arguments

DT

[data.table] e.g. some data table with hierarchy_id as a column

varnames_to_aggregate

[chr] e.g. c("mean", "upper", "lower")

varnames_to_aggregate_by

[chr] e.g. c("year_id", "age_group_id")

varname_weights

[chr] (default NULL) - if you want to weight the aggregation by a variable, e.g. population. If NULL, do a simple children-to-parent sum of the values in varnames_to_aggregate within each combination of varnames_to_aggregate_by. If not NULL, calculate weights for all children of each parent before aggregation. Weights sum to 1 across all children of a parent, within each combination of varnames_to_aggregate_by.

hierarchy

[data.table] e.g. a location hierarchy with required columns: 'hierarchy_id', path_to_top_parent, level, most_detailed

hierarchy_id

[chr] What variable does your hierarchy define, e.g. "location_id" (as of 2024-11-21, the only supported option)

stop_level

[x] (default 3L) Stops aggregation when the child level == stop_level (e.g. 3L aggregates up to national for locations, but no further; regional scalars mean regions are larger than the combined countries under them, e.g. because of small islands)

require_square

[lgl] (default TRUE) If TRUE, will check inputs and outputs for square (i.e. all variables are present for all combinations of

require_rows

[lgl] (default TRUE) If TRUE, assert_square checks that the data has > 0 rows

verbose

[lgl] message each parent and children being aggregated?

v_verbose

[lgl] message each parent that is not all.equal() to its aggregated children (if parent already exists in the dataset)?

tolerance_all_equal

[dbl] (default NULL uses all.equal's defaults) Tolerance for the all.equal mean relative difference check between a parent and its aggregated children (if the parent is already in DT). A value of 1 means the aggregated children are double the value of the parent (you probably did something wrong). Use larger values to allow larger differences due to rounding, etc. Adjust the tolerance to your operation's mathematical limitations.

aa_hard_stop

[lgl] (default FALSE) If TRUE, will stop if a parent is not all.equal() to its aggregated children, within user-specified level of tolerance.
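To make the tolerance_all_equal scale concrete, the sketch below calls base R's all.equal() directly on made-up numbers. It does not call aggregate_from_children_to_parents(); how the function passes the tolerance through internally is not shown here.

```r
# Hypothetical values: an existing parent and the sum of its children.
parent   <- c(100, 200, 300)
children <- c(101, 202, 303)  # aggregated children are 1% higher

# Default tolerance (about 1.5e-8): the 1% gap is reported as a difference.
default_check <- all.equal(parent, children)
isTRUE(default_check)  # FALSE

# A looser tolerance of 0.05 accepts a mean relative difference of 0.01.
loose_check <- all.equal(parent, children, tolerance = 0.05)
isTRUE(loose_check)    # TRUE

# Children exactly double the parent: the mean relative difference is 1.
doubled_check <- all.equal(parent, 2 * parent)
doubled_check          # "Mean relative difference: 1"
```

Because all.equal() returns a character description rather than FALSE when values differ, wrap it in isTRUE() whenever a logical result is needed.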

Details

Relies on the 'children_of_parents()' function to find children of a parent hierarchy_id e.g. location_id, then aggregates the selected columns for all children of one parent.
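For the weighted case described under varname_weights, here is a minimal base R sketch of normalizing weights across the children of a single parent. The ids and values are made up, and the final step (parent value = sum of weight times child value) is an assumption about how the normalized weights are applied, not something confirmed by this page.

```r
# Children of one hypothetical parent, for a single year_id.
children <- data.frame(
  location_id = c(101L, 102L, 103L),
  year_id     = 2000L,
  mean        = c(0.10, 0.20, 0.40),
  population  = c(100, 300, 600)
)

# Normalize weights within each year_id so they sum to 1 across children.
children$weight <- children$population /
  ave(children$population, children$year_id, FUN = sum)

# Assumed weighted aggregate: sum of weight * value across children.
parent_mean <- sum(children$weight * children$mean)
```

With these numbers the weights are 0.1, 0.3 and 0.6, so the larger children dominate the parent value.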

Value

[data.table] aggregated data.table


epi-sam/SamsElves documentation built on June 12, 2025, 7 a.m.