rollupTree

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width = 6
)
library(rollupTree)

Overview

rollupTree implements a general function for computations in which some property of a parent element is a combination of corresponding properties of its child elements. The mass of an assembly, for example, is the sum of the masses of its subassemblies, and the mass of each subassembly is the sum of masses of its parts, etc.

rollupTree can perform computations specified by arbitrarily-shaped (but well-formed) trees, arbitrarily-defined properties and property-combining operations. Defaults are provided to simplify common cases (atomic numerical properties combined by summing), but functional programming techniques allow the caller to pass arbitrary update methods as required.

Despite its name, rollupTree can operate on directed acyclic graphs that are not trees if instructed to apply less stringent validation rules to its input. See default_validate_dag() and rollup().

Example With A Data Frame

Consider this example Work Breakdown Structure from WorkBreakdownStructure.com:

Example Work Breakdown Structure. Source: www.workbreakdownstructure.com

The computations in the example are worked out already; we show here how to reproduce them.

A Work Breakdown Structure is a tree, that is, a graph that is connected and acyclic. It is, in addition, directed, that is, the edges have direction. We arbitrarily chose the edge direction to go from child to parent. Finally, it is single-rooted: every vertex but one has a single parent vertex; the root vertex has no parent.

The leaf elements (vertices) of the tree require asserted values for the properties (work, budget) of interest. Property values for non-leaf elements are computed by rollup().

We begin by capturing the structure of the tree and asserted values in a data frame we call wbs_table. Values to be computed are initially unknown. Each element is uniquely identified by an id column[^1]. We also indicate parent id in the pid column but this information is not used directly by rollup().

[^1]: id is the default but any valid column name can be used. Values should be character data.

library(rollupTree)
wbs_table

A key feature of recursively-defined problems like this is that the computation for a given element must occur after properties for it children are known (either asserted or computed). Traversing a tree in this manner can be achieved by using a well-known algorithm in graph theory known as topological sort. For that reason, we construct an igraph [@igraph_manual] graph object in R, from which we can conveniently (1) check that the graph is in fact a well-formed tree, and (2) efficiently execute a topological sort to order the computations. (Note that, although the problems solved by rollup are defined recursively, the implementation in software is not recursive.)

It is a simple matter to construct a graph from the information in our data frame:

library(rollupTree)
wbs_tree <- create_rollup_tree(
  get_keys = function() wbs_table$id,
  get_parent_key_by_child_key = function(key) wbs_table[wbs_table$id == key, "pid"]
)

WBS Tree

A directed acyclic graph may have more than one topological sort; any is suitable for our purpose. Here is what rollup() uses internally:

igraph::topo_sort(wbs_tree)

Summing a Single Numeric Property

Although our data in this first example is a data frame, rollup() can operate on an arbitrary R object if provided with update and validate methods for that object. For the common case in which the parameter of interest is a numeric column in a data frame, the combine operation is addition, and the key column is named id, the package provides update_df_prop_by_id() and validate_df_by_id() functions that can be invoked as boilerplate. To roll up the work property, for example, we simply invoke:

rollup(
  tree=wbs_tree,
  ds=wbs_table,
  update=function(d, t, s) update_df_prop_by_id(df=d, target=t, sources=s, prop="work"),
  validate_ds=function(t, d) validate_df_by_id(tree=t, df=d, prop="work")
)

Chaining Rollups

If we want to roll up the budget column as well, we can simply chain two rollup()s together. In this example we use R's pipe operator:

rollup(
  tree=wbs_tree,
  ds=wbs_table,
  update=function(d, t, s) update_df_prop_by_id(df=d, target=t, sources=s, prop="work"),
  validate_ds=function(t, d) validate_df_by_id(tree=t, df=d, prop="work")
) |> rollup(
  tree=wbs_tree,
  ds=_,
  update=function(d, t, s) update_df_prop_by_id(df=d, target=t, sources=s, prop="budget"),
  validate_ds=function(t, d) validate_df_by_id(tree=t, df=d, prop="budget")
)

In most cases, this approach suffices. The code is simple and clear, and performance is not typically an issue. (In other testing rollup() performs tens of thousands of non-trivial property updates per second.) We show here some alternate approaches, mainly to illustrate architectural features of the approach that may be useful for more esoteric applications.

Chaining Update Methods

update_df_prop_by_id() (like every well-behaved update method) modifies only the specified column and leaves the rest of the data frame unchanged, so we can chain two updates within a single rollup() instead of chaining two rollup()s. Similarly, a data set validator returns a logical value, so we can make the conjunction of two validators:

rollup(
  tree = wbs_tree,
  ds = wbs_table,
  update = function(d, t, s) {
    update_df_prop_by_id(
      df = d,
      target = t,
      sources = s,
      prop = "work"
    ) |>
      update_df_prop_by_id(target = t,
                           sources = s,
                           prop = "budget")
  },
  validate_ds = function(t, d) {
    validate_df_by_id(tree = t, df = d, prop = "work") &&
      validate_df_by_id(tree = t, df = d, prop = "budget")
  }
)

Custom get, set, and update Methods

In this example we create a custom get method that builds a named vector from the specified properties (using lower-level function df_get_by_id()) and a corresponding set method (using df_set_by_id()). We then create a custom update method using these methods. (The default combine method still works because R knows how to add vectors.) Finally, we create a custom data set validator and invoke rollup() with our custom methods.

my_get <- function(d, i) c(
  w=df_get_by_id(df=d, id=i, prop="work"),
  b=df_get_by_id(df=d, id=i, prop="budget")
)
my_set <- function(d, i, v) {
  df_set_by_id(df=d, id=i, prop="work", val=v["w"]) |>
    df_set_by_id(id=i, prop="budget", val=v["b"])
}
my_update <- function(d, t, s) {
    update_prop(ds=d, target=t, sources=s, set=my_set, get=my_get)
}
my_validate <- function(t, d) {
  validate_ds(tree=t, ds=d,
               get_keys=function(d) df_get_ids(df=d),
               get_prop=my_get,
               op=function(v) my_check(v["w"]) && my_check(v["b"])
  )
}
my_check <- function(v)
  is.numeric(v) && !is.na(v) && (v > 0.0)

rollup(
  tree = wbs_tree,
  ds = wbs_table,
  update = my_update,
  validate_ds = my_validate
)

Custom combine Method

Finally, we illustrate the use of a custom combiner. Suppose we have 5% uncertainty in our leaf cost numbers. Add those uncertainty numbers to our data frame:

new_wbs_table <- wbs_table
new_wbs_table$work <- NULL
new_wbs_table$budget_unc <- ifelse(is.na(wbs_table$budget), NA, wbs_table$budget * 0.05)
new_wbs_table

The standard technique for accumulating uncertainties is to combine using root-sum-square (RSS).

combine_rss <- function(vl) {
  sqrt(Reduce(f = `+`, x = Map(
    f = function(v)
      v * v,
    vl
  )))
}
result <- rollup(
  tree = wbs_tree,
  ds = new_wbs_table,
  update = function(d, t, s)
    update_df_prop_by_id(
      df = d,
      target = t,
      sources = s,
      prop = "budget"
    ) |>
    update_df_prop_by_id(
      target = t,
      sources = s,
      prop = "budget_unc",
      combine = combine_rss
    ),
  validate_ds = function(t, d)
    validate_df_by_id(tree = t, df = d, prop = "budget_unc"),
)
result$budget_unc_pct <- result$budget_unc / result$budget * 100.
result

Example With A List

Suppose instead of a data frame we have a list of lists:

wbs_list <- lapply(split(wbs_table, wbs_table$id),
                   function(r) list(name = r$name, budget = r$budget)
)
str(wbs_list)

We construct update and validate functions as follows:

list_get <- function(d, i) d[[i]]$budget
list_set <- function(d, i, v) { d[[i]]$budget = v; d }
list_update <- function(d, t, s) { update_prop(d, t, s, list_set, list_get) }
list_validate <- function(t, d) validate_ds(t, d, get_keys = function(l) names(l), get = list_get)

The output of rollup() is as expected:

list_result <- rollup(wbs_tree, wbs_list, list_update, list_validate)
str(list_result)

Example With A Tree

The tree can serve as the dataset with proper attributes and methods. To illustrate, we add the budget figures from our data frame as vertex attributes to the tree:

library(igraph)
new_wbs_tree <- Reduce(
  f = function(g, k) set_vertex_attr(g, 'budget', k, df_get_by_id(wbs_table, k, 'budget')),
  x = names(V(wbs_tree)),
  init = wbs_tree
)
ib <- vertex_attr(new_wbs_tree, "budget")
names(ib) <- names(V(new_wbs_tree))
ib

We construct update and validate functions as follows:

tree_get <- function(d, k) vertex_attr(d, "budget", k)
tree_set <- function(d, k, v) set_vertex_attr(d, "budget", k, v)
tree_update <- function(d, t, s) update_prop(d, t, s, set = tree_set, get = tree_get)
tree_validate <- function(t, d) validate_ds(t, d, get_keys = function(d) names(V(d)), get = tree_get)

The output of rollup() is as expected:

tree_result <- rollup(new_wbs_tree, new_wbs_tree, update = tree_update, validate_ds = tree_validate)
ob <- vertex_attr(tree_result, "budget")
names(ob) <- names(V(tree_result))
ob

Simple Fault Tree Analysis

Fault tree analysis is similarly recursive. We show here a greatly-simplified [@fta] example to illustrate a particular feature of the implementation.

https://control.com/technical-articles/deep-dive-into-fault-tree-analysis/

For the purpose of illustration, we make the following simplifying assumptions:

Our initial fault properties table is:

fault_table

The edge list of the fault tree is:

igraph::E(fault_tree)

Get and set methods for this example are simple and obvious:

df_get_fault_props <- function(df, id) {
  list(
    type = df_get_by_id(df, id, "type"),
    prob = df_get_by_id(df, id, "prob")
  )
}

df_set_fault_props <- function(df, id, v) {
  df_set_by_id(df, id, "prob", v$prob)
}

In most applications (e.g., mass properties), the combining operation only the values to be combined, and is therefore passed only those values as input. In the case of a fault tree, however, the input for an AND gate are multiplied, whereas for an OR gate are summed. We can accommodate this situation by passing an optional type argument to the combiner follows:

combine_fault_props <- function(vl, type) {
  list(
    prob = Reduce(
      f = if (type == "and") "*" else "+",
      Map(f = function(v) v$prob, vl)
    )
  )
}

update_fault_props <- function(ds, parent_key, child_keys) {
  update_prop(
    ds,
    target = parent_key,
    sources = child_keys,
    set = df_set_fault_props,
    get = df_get_fault_props,
    combine = function(vl)
      combine_fault_props(vl, df_get_fault_props(ds, parent_key)$type)
  )
}

validate_fault_props <- function(fp) {
  if (fp$type != "basic") stop(sprintf("invalid leaf node type %s", fp$type))
  if (!is.numeric(fp$prob) || fp$prob < 0.0 || fp$prob > 1.0)
    stop(sprintf("invalid probability value %f", fp$prob))
  TRUE
}

validate_fault_props_table <- function(tree, df) {
  validate_ds(tree, df, df_get_ids, df_get_fault_props, validate_fault_props)
}

Calling rollup() in the usual manner yields:

rollup(fault_tree, fault_table, update_fault_props, validate_fault_props_table)

References



Try the rollupTree package in your browser

Any scripts or data that you put into this service are public.

rollupTree documentation built on Feb. 10, 2026, 5:09 p.m.