In Health-Economics-in-R/CEdecisiontree: Cost-effectiveness decision tree analysis

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

The main aim of the CEdecisiontree package is to provide a bridge to models that would be otherwise build in Excel. Because of this the transition matrix format for input arguments is the primary format. However, this is not necessarily the easiest to define, manipulate or compute with. Here we give some examples alternative formats.

Setup

Quietly load libraries.

library(CEdecisiontree)
library(readr)
library(dplyr)
library(reshape2)
library(tidyr)
library(assertthat)

Load example data from the package.

data("cost")
data("probs")

Tree structure

Generally, we can specify a tree by who the children are for each parent node. This can be more compuationally efficient.

tree <-
 list("1" = c(2,3),
      "2" =  c(4,5),
      "3" =  c(6,7),
      "4" =  c(),
      "5" =  c(),
      "6" =  c(),
      "7" =  c())
dat <-
 data.frame(node = 1:7,
            prob = c(NA, 0.2, 0.8, 0.2, 0.8, 0.2, 0.8),
            vals = c(0,10,1,10,1,10,1))
tree
dat

dectree_expected_recursive(names(tree)[1], tree, dat)

We can obtain the list of children from the probability matrix (or any other structure defining transition matrix).

transmat_to_child_list(probs)

Single long array

If we keep with flat arrays then clearly, as the size of the tree increased the sparse matrices become impractical. We can provide a long format array to address this. Let us transform the wide array used previously to demonstrate the structure and space saving.

probs_long <-
  probs %>%
  mutate('from' = rownames(.)) %>%
  melt(id.vars = "from",
       variable.name = 'to',
       value.name = 'prob') %>%
  mutate(to = as.numeric(to)) %>% 
  na.omit()

cost_long <-
  cost %>%
  mutate('from' = rownames(.)) %>%
  melt(id.vars = "from",
       variable.name = 'to',
       value.name = 'vals') %>%
  mutate(to = as.numeric(to)) %>% 
  na.omit()

dat_long <-
  merge(probs_long,
        cost_long)

dat_long

We can use the long array as the input argument instead of the separate transition matrices. Internally, we simple convert back to a matrix using long_to_transmat() so for larger trees this may be inefficient.

dectree_expected_values(
  define_model(dat_long = dat_long))

Computation speed

We can compare the computation times for the recursive and non-recursive formulations.

microbenchmark::microbenchmark(dectree_expected_values(define_model(dat_long = dat_long)),
                               dectree_expected_recursive(names(tree)[1], tree, dat), times = 100L)

For this example the recursive formulation is much quicker. Change in memory before and after running the functions.

pryr::mem_change(dectree_expected_values(  define_model(dat_long = dat_long)))
pryr::mem_change(dectree_expected_recursive(names(tree)[1], tree, dat))