In albgarre/biorisk: What the Package Does (One Line, Title Case)

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

library(biorisk)
library(tidyverse)

The idea of this document is to give a short introduction to biorisk. This package is to facilitate the implementation, simulation and analysis of QMRA models based on Monte Carlo simulations. It is currently in pre-alpha. This means that the overall structure and philosophy is the package is well-defined, but we still need to refine some features and add additional ones. This document provides a general overview of how models are built, simulated and analysed/visualized.

Building QMRA models with biorisk: the element

In biorisk, everything is a element. You want to define a normal distribution? You use a element. You want a linear inactivation model? You use a element. You want a constant value? You use a element. You want a growth element with correlation between storage time and temperature? You guessed it right; you need a element! This has two disadvantages. The first one is that, if you just want to fix some value to 2 in your model, you do not write

x <- 2

You have to write

x <- Constant$new("x", 2)

which is a bit more convoluted. But the advantage of this element-based philosophy is that everything has the same complexity. If you want to define a log-linear inactivation model, you write

inactivation <- LogLinInactivation$new("inactivation")

Or a secondary growth model

mu <- Ratkowsky_model$new("mu")

A dose-response model

dr <- DoseResponse_BetaPoisson$new("dose-response")

Or a distribution

x <- GammaPoisson$new("something")

So, regardless of what you define (distributions, microbial processes, constants, dose-response models...), you will be using elements. This means that learning how to work with biorisk means learning how to work with elements. You may have caught the hint that biorisk uses Object Oriented programming. Namely, it uses R6 classes. You can still use the packaging without understanding much of OO programming, as biorisk does all the heavy lifting on the background. Nonetheless, a quick look at the documentation of R6 (https://r6.r-lib.org) will help you understand some of the "odd" things the package does (such as the $new()).

Elements in biorisk have two parts. The first one comprises the mathematical calculations (e.g. solving an equation or getting a random value according to some distribution). This is done inside the element and you don't have to worry about it. Your only worry is choosing the right element. This is done by writing the name of the element followed "$new" (the name of every element is given at the end of the document). Then, adding within quotes (") the name assigned to the element (this is just used for output results, so it is not relevant for the calculations). Moreover, due to the way R works, it is often a good idea to assign elements to a variable. We do that writing the name of the variable followed by <-.

This may sound complicated, but it is quite simple. So, we are going to create a element describing a normal distribution. We are going to call this element "logN0" and we are going to assign it to a variable named logN0 (names do not have to match, but at this point it may be easier to understand if they do).

logN0 <- Normal$new("logN0")

And that's a normal distribution!

The second part of the element are its input/outputs. Every calculations in QMRA requires a number of model parameters (e.g. the D-value for microbial inactivation) and variables describing environmental/logistic factors (e.g. the temperature). In a similar way, probability distributions also need some parameters (e.g. the mean and variance of a normal distribution). All of these are called inputs in biorisk.

Depending on the type of calculation, each element has different inputs. To know their names, you just have to type $inputs after the name of a variable storing a element. So, because our normal distribution is called logN0...

logN0$inputs

This means that the element describing normal distribution needs a mean value (mu) and a standard deviation (sigma). One key aspect of biorisk is that elements are assigned to inputs of other elements. This gives a lot of flexibility because the input elements can be practically anything: a probability distribution, a growth model, a dose-response model... or a constant. Indeed, regardless of the complexity of the QMRA model, we will always have a constant on top of the chain of dependencies (e.g., constant parameters for a distribution).

In this first example, we are going two define two constants for our normal distribution: one for the mean and one for the standard deviation. They are defined in a very similar way as the normal distribution (they are also a element!). We write the name of the element (Constant) plus $new and, finally, a name within brackets. The only difference is that, for constants, we need to assign a value by putting a coma after the name.

So, we are going to define a constant value of 0 for the expected value

mu_logN0 <- Constant$new("mu_logN0", 0)

And a constant value of 0.5 for its standard deviation

sigma_logN0 <- Constant$new("sigma_logN0", .5)

So, right now we have defined a element with a normal distribution, a element for its (constant) mean and a third element for its (constant) standard deviation. However, right now, they are totally independent. So, in order to do the calculations, we need to assign the constant elements to the inputs of the normal distribution. We do that using the $map_input method. I think it is easier to show it directly with an example...

This bit of code:

logN0$map_input("mu", mu_logN0)

maps the input named "mu" of element logN0 (our normal distribution) to another element saved as mu_logN0 (the constant we defined before with value 0).

Regardless of the type of elements or inputs, we will always use the $map_input method for mappings. Simple enough, right? So, to map the input named "sigma" to our constant 0.5 we do

logN0$map_input("sigma", sigma_logN0)

Because models are often hard to see just with code, biorisk includes several functions to make life easier. For instance, plot_model() plots the structure of the elements. So, if we pass our logN0 to this function

plot_model(logN0)

we can see how the elements are mapped. In this case, our element called "logN0" depends on elements "mu_logN0" and "sigma_logN0". In this plot, the shapes and colours show the type of element. A grey triangle is a constant, whereas a blue circle is a distribution. There are other shapes included in the package (e.g. for inactivation, growth, dose-response...) -- in the near future, I will also add the name of the input--.

Code shortcuts (advanced)

The biorisk package is implemented in a way that allows several shortcuts. For instance, it allows method chaining, a coding style where you can chain operations. This means you can include the mapping right after the definition of the element. In other words, this bit of code:

mu_logN0 <- Constant$new("mu_logN0", 0)
sigma_logN0 <- Constant$new("sigma_logN0", .5)

logN0 <- Normal$new("logN0")$
  map_input("mu", mu_logN0)$
  map_input("sigma", sigma_logN0)

Is equivalent to what we wrote before

mu_logN0 <- Constant$new("mu_logN0", 0)
sigma_logN0 <- Constant$new("sigma_logN0", .5)

logN0 <- Normal$new("logN0")
logN0$map_input("mu", mu_logN0)
logN0$map_input("sigma", sigma_logN0)

The second shortcut is that you do not need to assign every element to a variable. Instead, they can be defined directly within $map_input() This is quite useful for constant because they are quite uninteresting (they do not have dependencies, they do not have distributions...). So, this bit of code is also equivalent to the previous ones

logN0 <- Normal$new("logN0")$
  map_input("mu", 
             Constant$new("mu_logN0", 0)
            )$
  map_input("sigma", 
            Constant$new("sigma_logN0", .5)
            )

Nonetheless, if you are adamantly against method chaining, you can use the package without it (although the code will be longer and harder to read).

A very simple QMRA model

Normal distributions are nice but boring. So, let's illustrate the package with something a bit more exciting: a (very) simple QMRA model. First, we are going to use a log-linear inactivation model. How do we define it? Just as before, using the name defined within the package (LogLinInactivation) followed $new and a name we assign within quotations

inact <- LogLinInactivation$new("Inactivation")

And how do we check its inputs? By writing $inputs after the variable name (note that the variable name inact is now different than the element name "Inactivation")

inact$inputs

So, we need 3 inputs: the treatment time (t), the D-value (D) and the initial count (logN0).

To logN0, we will assign the normal distribution we defined above.

inact$map_input("logN0", logN0)

To the treatment time, we will assign a constant value of 30.

inact$map_input("t", 
                Constant$new("treat_time", 30)
                )

And to the D-value we are going to assign a log-normal distribution. The good news here is that biorisk includes a LogNormal element:

D <- LogNormal$new("D")$
  map_input("mu_log10", Constant$new("mu_logD", 1))$
  map_input("sigma_log10", Constant$new("sigma_logD", 0.2))

And now we assign this distribution to the inact element.

inact$map_input("D", D)

That was a lot to take in one go. That's why, it is often a good idea to call plot_model() to see what we just did:

plot_model(inact)

Note that plot_model() uses the name of the element, not the name of the variable where the element is saved. For instance, the inactivation element is saved in a variable called inact, but its name is "Inactivation" (inact <- LogLinInactivation$new("Inactivation")). Remember that the name within quotation is the one biporisk uses for every output. It is often a good idea that every element within a model have unique names (e.g., not having 2 elements named "time") because some plotting functions will not work correctly.

Model simulations

Once we have built the model, doing MC simulations is really simple. We just need to call the $simulate() method with the number of Monte Carlo simulations. Just like this

set.seed(1241)
inact$simulate(1000)

and biorisk takes care of doing 1,000 simulations of the element and all its dependencies. By default, the method does not return anything. Instead, the results of each MC iteration are stored within each element

inact$simulations %>% head()

Note that biorisk takes care of doing the calculations for all the dependencies of the element. For instance, it updates the simulations for the distribution of the D-value:

D$simulations %>% head()

Also note that biorisk only does the calculation for the upstream dependencies. This makes sense from a computational point of view, because it allows analyzing parts for the model separately. For instance, one can simulate the distribution of the D-value without having to analyze the complete model.

Result visualization

Looking at the results of individual MC simulations is not very useful For that reason, biorisk includes several tools to visualize and summarize the results.

Output of individual elements

Every element has a $histogram(), $boxplot() and $density_plot() element that allows checking its output

inact$histogram()

D$histogram()

logN0$density_plot()

Outputs of several elements

However, in most situations, we are not just intersted in the output of a single element. For that reason, plot_outputs() compares the distribution of the output variable of every element in the QMRA model

plot_outputs(inact)

Nonetheless, most of these outputs have different units. For that reason, the function includes the chosen argument that allows choosing what to plot.

plot_outputs(inact, chosen = c("logN0", "Inactivation"))

Besides boxplots, the type argument allows doing violin plots

plot_outputs(inact, chosen = c("logN0", "Inactivation"), type = "violin")

or density plots

plot_outputs(inact, chosen = c("logN0", "Inactivation"), type = "density")

Quantile tables

Although plots can help us understand the output of a QMRA model, we often need also numeric values. For that reason, we can use quantile_table() to create a table with the quantiles of the outputs of every element

quantile_table(inact)

Again, you can specify what outputs to show in the table

quantile_table(inact, chosen = c("logN0", "Inactivation") )

By default, the function plots the maximum and minimum value, the median and the 25th and 75th percentile. You can modify this using the probs argument. For instance, to look at the median, 90th and 99th percentile, we pass probs = c(.5, .9, .99):

quantile_table(inact, chosen = c("logN0", "Inactivation"),
               probs = c(.5, .9, .99))

Sensitivity analysis (to be improved)

The package also includes tornado plots for sensitivity analysis with the tornado_plot() function.

tornado_plot(inact)

Again, it allows choosing what element is included in the calculation (not too useful in this example because there are not enough levels)

tornado_plot(inact, chosen = "logN0")

2D Monte Carlo

Defining and simulating 2D Monte Carlo models

In biorisk, we have also included functions for doing 2D Monte Carlo analysis. In order to define a 2D model, the only difference is that we need to pass the level argument to the elements describing distributions. As a very simple example, this is a model with 2 normal distributions, one on the variability level (level 0) and another one in the uncertainty level (level 1):

bb <- Normal$new("mu_mu", level = 1)$
  map_input("mu", Constant$new("mu_mu_mu", 3))$
  map_input("sigma", Constant$new("mu_sigma_mu", .1))

aa <- Normal$new("aa", level =  0)$
  map_input("mu", bb
  )$
  map_input("sigma", Constant$new("sigma", .2))

plot_model(aa)

Now, to do 2-d Monte Carlo, instead of using the $simulate() method, we use the simulate_2D() method. The only difference is that, instead of passing a single number, we need to pass 2: the number of simulations in the variability level (50 here) and the number of iterations in the uncertainty level (10 here)

aa$simulate_2D(50, 10)

Again, the results of the simulation are stored within the elements themselves. They are stored a list where each element is a simulation in the uncertainty level. Then, each element has a table where rows are the iterations in the variability level.

To illustrate this, let's look at the results of our simulations. The distribution bb is defined in the uncertainty level. This means that within each variability simulation (each element in the list), it remains constant (i.e., every row is the same). However, it still varies between elements of the list.

bb$simulations_multi[1:3] %>% map(head)

On the other hand, aa is defined in the variability level, so it varies both between elements of the list and between rows. Note here that the mu input of aa matches the output of bb.

aa$simulations_multi[1:3] %>% map(head)

Visualizing the output of 2D Monte Carlo models

The biorisk package also includes functions specific for the visualization of the output of 2D Monte Carlo models. They can be visualized as a density plot, where the steel blue area is the variation in the variability level, and the grey one the total variation (variability + uncertainty):

aa$density_plot_2D()

It can also be visualized as a cumulative density plot, where the black solid line is the variation in the variability level and the shaded are the additional one due to uncertainty:

aa$cummulative_plot_2D()

Finally, the results can also be shows as a table of quantiles:

quantile_table_2D(aa,
                  probs = c(.1, .9),
                  chosen = c("aa", "mu_mu"))

In this case, for each quantile selected (in probs), the function returns two values: the quantile on the uncertainty level (level 0) and the one on the variability level (level 1). Note that, because element bb ("mu_mu") is defined on the uncertainty level, the quantiles for both levels are the same.

List of elements (incomplete)

Probability distributions

Beta
BetaBinomial
Binomial
EmpiricalDistr
Exponential
Gamma
GammaPoisson
LnNormal
LogNormal
Normal
Pareto
Pert
Poisson
Triangular
TriangularSym
Weibull
ZIPoisson
Uniform

Primary models

ExponentialGrowth
ExponentialGrowthNmax
LagExponentialGrowth)
LogLinInactivation
PelegInactivation
WeibullInactivation
TrilinearGrowth
TrilinearInactivation

Secondary models

Dz_model
Peleg_secondary
Ratkowsky_model
Ratkowsky_model_error

Dose-response models

DoseResponse_BetaPoisson
DoseResponse_Exponential

Dynamic elements

DynamicBigelow_1phase
DynamicBigelow_2phase
DynamicBigelow_3phase

Multivariate elements

CorrelatedExpGrowth

Algebra

MaxFilter
MinFilter
MinMaxFilter
ElementDivision)
ElementEqualTo)
ElementExp)
ElementGreaterEqualThan)
ElementGreaterThan)
ElementLn)
ElementLog)
ElementLowerEqualThan)
ElementLowerThan)
ElementMinus)
ElementPlus)
ElementPow10)
ElementPower)
ElementSqrt)
ElementTimes)

albgarre/biorisk documentation built on Feb. 23, 2025, 5:19 a.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

albgarre/biorisk
What the Package Does (One Line, Title Case)

In albgarre/biorisk: What the Package Does (One Line, Title Case)

Building QMRA models with biorisk: the element

Code shortcuts (advanced)

A very simple QMRA model

Model simulations

Result visualization

Output of individual elements

Outputs of several elements

Quantile tables

Sensitivity analysis (to be improved)

2D Monte Carlo

Defining and simulating 2D Monte Carlo models

Visualizing the output of 2D Monte Carlo models

List of elements (incomplete)

Probability distributions

Primary models

Secondary models

Dose-response models

Dynamic elements

Multivariate elements

Algebra

R Package Documentation

Browse R Packages

We want your feedback!

albgarre/biorisk What the Package Does (One Line, Title Case)

In albgarre/biorisk: What the Package Does (One Line, Title Case)

Building QMRA models with biorisk: the element

Code shortcuts (advanced)

A very simple QMRA model

Model simulations

Result visualization

Output of individual elements

Outputs of several elements

Quantile tables

Sensitivity analysis (to be improved)

2D Monte Carlo

Defining and simulating 2D Monte Carlo models

Visualizing the output of 2D Monte Carlo models

List of elements (incomplete)

Probability distributions

Primary models

Secondary models

Dose-response models

Dynamic elements

Multivariate elements

Algebra

R Package Documentation

Browse R Packages

We want your feedback!

albgarre/biorisk
What the Package Does (One Line, Title Case)