In andreashandel/modeldiagram: Easy Creation of ggplot2 Based Figures of Flow Diagrams

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

# avoid check where vignette filename must equal the vignette title
options(rmarkdown.html_vignette.check_title = FALSE)

library(modeldiagram)

Introduction

Typical workflow

We anticipate the typical workflow to look something like this:

dfs <- make_dataframes(input_list = my_model_str)
diagram_plot <- make_diagram(df_list = dfs)
ggplot_code <- write_diagram_code(df_list = dfs)

# print the plot to screen
diagram_plot

# write the code to the console
cat(ggplot_code)

where my_model_str is an appropriately structured model list (see below). The package functions and their outputs are described in detail in the following sections.

Input structure

The first step to making a diagram is to have a dynamic model encoded in an appropriate structure. The modeldiagram package was developed to work well with the modelbuilder package structure for dynamic models. We will use a modelbuilder object for demonstration here. The object is called mbsir, which is part of the modeldiagram package and is a modelbuilder list object representing a Susceptible, Infected, Recovered (SIR) disease transmission model.

The mbsir object is a list of lists:

load('mbsir.rda')
str(mbsir, max.level = 1)

The most important elements are the variables, mbsir[["var"]], and the parameters, mbsir[["par"]]. Each entry in the variables list contains the flows into (with a "+" sign) and out of (with a "-" sign) that variable. For example, here is the S variable information:

mbsir[["var"]][[1]]

From this information, we can see that there are no flows into the S variable, only out (see the flows element, mbsir[["var"]][[1]]$flows).
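To make the sign convention concrete, here is a minimal sketch for a variable that has both kinds of flow. The toy_var object is a hand-written stand-in for one entry of mbsir[["var"]], and its field names (varname, flows) are assumptions made for illustration; use str() on the real object to confirm the actual fields.

```r
# Hypothetical stand-in for one element of mbsir[["var"]];
# the field names are assumed for illustration only.
toy_var <- list(
  varname = "I",
  flows   = c("+b*S*I", "-g*I")
)

# Flows starting with "+" enter the variable; those starting with "-" leave it
inflows  <- toy_var$flows[startsWith(toy_var$flows, "+")]
outflows <- toy_var$flows[startsWith(toy_var$flows, "-")]

inflows
outflows
```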

Data frames for the diagram

modeldiagram generates data frames from the model structure that will be used in ggplot2 to construct the diagram. Five (5) data frames are produced based on the model structure:


nodes: a data frame with four columns
- id: the node (variable) id number, used to match with the edge data frames
- label: the label of the node (variable), e.g. "S" for susceptible
- x: the x position of the node box
- y: the y position of the node box
horizontal_edges: a data frame with 9 columns
- to: the id of the node at which the flow terminates (i.e., the arrow points toward this node)
- from: the id of the node from which the flow originates (i.e., the arrow points away from this node)
- label: the label of the flow to appear next to the line segment, typically an equation
- xstart: the starting x position of the line segment
- ystart: the starting y position of the line segment
- xend: the ending x position of the line segment
- yend: the ending y position of the line segment
- xmid: the x midpoint of the line segment, where arrows for interaction terms intersect the segment
- ymid: the y midpoint of the line segment, where arrows for interaction terms intersect the segment
vertical_edges: a data frame with 9 columns (same as horizontal_edges)
curved_edges: a data frame with 9 columns (same as horizontal_edges)
feedback_edges: a data frame with 9 columns (same as horizontal_edges)
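As a rough illustration of the layout described above, the following sketch builds a nodes data frame and a horizontal_edges data frame by hand. All coordinates and labels are made up for the example; they are not output from make_dataframes, and the real data frames may differ in detail.

```r
# Hand-built example of two of the data frames (values are illustrative)
nodes <- data.frame(
  id    = 1:3,
  label = c("S", "I", "R"),
  x     = c(0, 3, 6),
  y     = c(0, 0, 0)
)

horizontal_edges <- data.frame(
  to     = c(2, 3),
  from   = c(1, 2),
  label  = c("b*S*I", "g*I"),
  xstart = c(0.5, 3.5),
  ystart = c(0, 0),
  xend   = c(2.5, 5.5),
  yend   = c(0, 0)
)

# Midpoint of each segment, computed from its start and end points
horizontal_edges$xmid <- (horizontal_edges$xstart + horizontal_edges$xend) / 2
horizontal_edges$ymid <- (horizontal_edges$ystart + horizontal_edges$yend) / 2

# Edges refer to nodes by id, so node labels can be looked up by matching ids
from_labels <- nodes$label[match(horizontal_edges$from, nodes$id)]

horizontal_edges
from_labels
```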

The package attempts to make intelligent choices about which flows belong in which data frame. For example, flows into or out of the system, which touch only one node, are typically vertical, and flows that bypass a node are typically curved. The user may need to make modifications to suit their particular needs and style.
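The kind of rule described above can be sketched in a few lines. This is only an illustrative heuristic, not the package's actual algorithm: classify_flow is a hypothetical helper, nodes are assumed to be numbered left to right, NA is assumed to mark "outside the system", and treating a flow from a node to itself as a feedback edge is a guess.

```r
# Illustrative heuristic only; not the package's actual algorithm.
# `from` and `to` are node ids; NA marks a flow entering or leaving the system.
classify_flow <- function(from, to) {
  if (is.na(from) || is.na(to)) {
    "vertical"      # only one node is involved
  } else if (from == to) {
    "feedback"      # flow returns to the node it came from
  } else if (abs(to - from) == 1) {
    "horizontal"    # flow between neighbouring nodes
  } else {
    "curved"        # flow bypasses at least one node
  }
}

flows <- data.frame(from = c(NA, 1, 2, 3, 2), to = c(1, 2, 3, 1, 2))
flows$type <- unname(mapply(classify_flow, flows$from, flows$to))
flows
```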

The data frames above are returned as a list. In some cases, many of the data frames will be empty, as is the case in this example with a relatively simple model.

Note that the edge data frames printed below contain columns that are not described above.

inputs <- make_diagram_inputs(mbmodel = mbsir)
diagram_dfs <- make_dataframes(input_list = inputs)
diagram_dfs

Make the diagram

You can make a simple diagram using the default settings. If you accept all the default aesthetics, you only need to provide the list of data frames, diagram_dfs.

diagram_plot <- make_diagram(df_list = diagram_dfs)
diagram_plot

The modeldiagram package displays interaction terms by default. In the model above, the $bSI$ term is an interaction between S and I that determines the physical flow from S to I. The dashed arrow shows the interaction, which connects with the main physical flow. Some users may prefer simpler diagrams that omit the interaction arrow. This can be done by setting the argument interaction_label = FALSE (the default is interaction_label = TRUE).

make_diagram(df_list = diagram_dfs, interaction_label = FALSE)

There are several aesthetic arguments that can be passed to the make_diagram function. All the options are demonstrated below, generating an unpleasing diagram.

diagram_plot <- make_diagram(df_list = diagram_dfs, 
                             node_outline_color = "red",
                             node_fill_color = c("red","blue"),
                             node_text_color = c("blue", "red"),
                             node_text_size = 3,
                             flow_text_color = "green",
                             flow_text_size = 20,
                             main_arrow_color = "brown",
                             main_arrow_linetype = 4,
                             main_arrow_size = 2,
                             interaction_arrow_color = "purple",
                             interaction_arrow_linetype = 3,
                             interaction_arrow_size = 1)
diagram_plot

A better use of the aesthetics can result in pleasing diagrams.

diagram_plot <- make_diagram(df_list = diagram_dfs, 
                             node_outline_color = NA,
                             node_fill_color = c("#6aa4c8", "#eb5600", "#1a9988"),
                             node_text_color = "white",
                             node_text_size = 10,
                             flow_text_color = "black",
                             flow_text_size = 5,
                             main_arrow_color = "grey25",
                             main_arrow_linetype = 1,
                             main_arrow_size = 0.7,
                             interaction_arrow_color = "grey25",
                             interaction_arrow_linetype = 2,
                             interaction_arrow_size = 0.7)
diagram_plot

It is also possible to generate a simple diagram with no flow labels (label_flows = FALSE):

diagram_plot <- make_diagram(df_list = diagram_dfs, 
                             interaction_label = FALSE,
                             label_flows = FALSE,
                             node_outline_color = NA,
                             node_fill_color = c("#6aa4c8", "#eb5600", "#1a9988"),
                             node_text_color = "white",
                             node_text_size = 10,
                             flow_text_color = "black",
                             flow_text_size = 5,
                             main_arrow_color = "grey25",
                             main_arrow_linetype = 1,
                             main_arrow_size = 0.7,
                             interaction_arrow_color = "grey25",
                             interaction_arrow_linetype = 2,
                             interaction_arrow_size = 0.7)
diagram_plot

Retrieve the ggplot2 code

In some cases, the user may want to work directly with the ggplot2 code to make custom changes. The write_diagram_code function does this. The function has no arguments because the ggplot2 code is generic given the data frames input list.

my_dir <- "."
my_file <- "diagram_ggplot2_code.R"
write_ggplot2_code(directory = my_dir, filename = my_file)

# clean up the file written above
file.remove(file.path(my_dir, my_file))
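For orientation, hand-written ggplot2 code for a diagram of this kind might look roughly like the sketch below. This is not the code that the package writes out; the data frames, coordinates, and choice of geoms are assumptions made for illustration, and the node boxes are assumed to be centred on their x and y positions.

```r
library(ggplot2)

# Illustrative data frames (values are made up)
nodes <- data.frame(label = c("S", "I", "R"), x = c(0, 3, 6), y = 0)
edges <- data.frame(
  label  = c("b*S*I", "g*I"),
  xstart = c(0.5, 3.5), xend = c(2.5, 5.5),
  ystart = 0, yend = 0
)

p <- ggplot() +
  # node boxes, one unit wide and tall, centred on (x, y)
  geom_tile(data = nodes, aes(x = x, y = y), width = 1, height = 1,
            fill = "#6aa4c8", color = NA) +
  geom_text(data = nodes, aes(x = x, y = y, label = label),
            color = "white", size = 10) +
  # physical flows drawn as arrows
  geom_segment(data = edges,
               aes(x = xstart, y = ystart, xend = xend, yend = yend),
               arrow = arrow(length = unit(0.25, "cm")), color = "grey25") +
  # flow labels placed just above each segment midpoint
  geom_text(data = edges,
            aes(x = (xstart + xend) / 2, y = ystart + 0.25, label = label),
            size = 5) +
  coord_equal() +
  theme_void()
p
```

Because the result is an ordinary ggplot object, further layers, scales, or themes can be added with + in the usual way.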

More examples

SIR with demography and waning immunity

Here is an example using a more complicated model, an SIR model with waning immunity. This model includes demography (births and deaths) and cross-node connections (waning goes from the recovered class to the susceptible class). The model is stored as mbsird and comes with the package.

load('mbsird.rda')
inputs <- make_diagram_inputs(mbmodel = mbsird)
dfs <- make_dataframes(inputs)
fig <- make_diagram(dfs)
fig

The diagram above shows all of the demographic flows (e.g., $m*S$ and $n$), which the user may not want to focus on. The external_flows argument determines whether the flows into and out of the system are shown (external_flows = TRUE, the default) or not (external_flows = FALSE). Below is the same diagram without the external flows.

inputs <- make_diagram_inputs(mbmodel = mbsird)
dfs <- make_dataframes(inputs)
make_diagram(dfs, external_flows = FALSE)
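Conceptually, hiding external flows amounts to dropping every flow that does not connect two nodes. The sketch below shows that idea on a hand-built edge table; it is not how make_diagram is implemented. The use of NA to mark "outside the system" and the waning label w*R are assumptions for illustration.

```r
# Illustrative only: an edge table in which NA marks "outside the system"
edges <- data.frame(
  from  = c(NA, 1, 1, 2, 3),
  to    = c(1, NA, 2, 3, 1),
  label = c("n", "m*S", "b*S*I", "g*I", "w*R")
)

# Keep only the flows that connect two nodes
internal <- edges[!is.na(edges$from) & !is.na(edges$to), ]
internal$label
```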

SIR stratified by two hosts -- WORK IN PROGRESS

Here we can see how the make_diagram function recycles colors for nodes. We have six nodes but only supply three colors.

load('mbsirtwohost.rda')
inputs <- make_diagram_inputs(mbmodel = mbsirtwohost)
dfs <- make_dataframes(inputs)
fig <- make_diagram(dfs,
                    node_fill_color = c("#6aa4c8", "#eb5600", "#1a9988"),
                    node_text_color = "white")
fig
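The recycling shown above can be mimicked with base R, assuming it follows standard R vector recycling. The node labels below are hypothetical and are only there to show which color each of six nodes would receive.

```r
# Hypothetical node labels for a two-host SIR model
node_labels <- c("S1", "I1", "R1", "S2", "I2", "R2")
fill_colors <- c("#6aa4c8", "#eb5600", "#1a9988")

# Repeat the three colors until there is one per node
recycled <- rep_len(fill_colors, length(node_labels))
data.frame(label = node_labels, fill = recycled)
```

Under this assumption, the fourth node gets the same color as the first. To give each node a distinct color, supply a vector with one color per node.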