knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.width = 6, fig.height = 4 )
library(tidyverse) library(biorisk)
Chlorine is widely used as a disinfectant in the food industry to inactivate microorganisms, for example in the washing of fruits and vegetables or in drinking water treatment processes. In this case study, a simplified QMRA model is proposed to estimate the risk of [add the micro] due to exposure through ingestion of a contaminated lettuce [check]. The model includes every step of a risk assessment, from the initial concentration to the estimated number of cases. It is implemented as 2D Monte Carlo model.
The model is developed only with the purpose of illustrating the capabilities of biogrowth. For that reason, most hypotheses (i.e., parameter values for the distributions) are not justified, so the model should be used with educational purposes.
We define the initial concentration of microorganisms in the product as a uniform distribution over 0-1 log CFU/g (Uniform
element). As an illustration, it is defined on the variability level (level = 0
).
logN0 <- Uniform$new("logN0", # A uniform distribution level = 0)$ map_input("min", Constant$new("logN0_min", 0) # with a constant min. value )$ map_input("max", Constant$new("logN0_max", 1) # and a constant max. value )
We use the next model for the inactivation of chlorine. It is log-linear primary inactivation model [add reference to Aricia]
$$ logN = logN_0 - k \cdot t $$ where the inactivation rate ($k$) has a linear relationship with the chlorine concentration ($C$).
$$ k = -0.1811 + 0.0069 \cdot C $$ To implement the model in biorisk, we first need to define the secondary model. For this, we first need to define a model that defines the chlorine concentration. In this case, we will use a constant concentration of 50 ppm.
chlorine <- Constant$new("chlorine", 50)
Next, we need to define a element for the model equation. The package includes several elements that describe simple model equations commonly used in the literature. In this case, we need a LinealModel
. We will define a fixed intercept (-0.1811
), whereas the slope will be defined as a normal distribution. For no particular reason, we assume this distribution represents variability only (level = 1
):
k_slope <- Normal$new("k_slope", level = 1)$ map_input("mu", Constant$new("k_slope_mu", 0.0069))$ map_input("sigma", Constant$new("k_slope_sigma", .0001)) k_chlorine <- LinealModel$new("k_chlorine")$ map_input("a", Constant$new("k_intercept", -0.1811) )$ map_input("b", k_slope)$ map_input("x", chlorine)
Once we have the secondary model, we can define the primary model. Log-linear inactivation parameterized based on the inactivation rate ($k$) is named LogLinInactivation_k
in biorisk. Besides the initial concentration ("logN0"
) and the inactivation rate ("k"
), this element needs the treatment time. We define a constant value (20 s) for it:
t_chlorine <- Constant$new("t_chlorine", 20) # the treatment time (s) inactivation <- LogLinInactivation_k$new("Treatment")$ map_input("logN0", logN0)$ # we map logN0 to the initial count defined before map_input("k", k_chlorine)$ map_input("t", t_chlorine)
We will describe microbial growth during storage using the exponential growth model with stationary phase (bilinear model):
$$ logN = logN_0 + \mu \cdot t \ ; \ if \ logN_0 + \mu \cdot t < logN_{max} \ logN = logN_{max} \ ; \ otherwise $$
with the growth rate given by the Ratkowsky model:
$$ \mu = 0.014(T - 1.6)^2 $$
As above, we first need to define the secondary model. Luckily, biorisk includes the Ratkowsky_model
element that implements this model. It has several inputs (treatment temperature, $T_{min}$ and $b$). For the temperature, we assign a normal distribution that represents uncertainty (level = 1
) with expected value 6.35
and standard deviation 2.83
.
temp_distrib <- Normal$new("temp_distrib", level = 1)$ # a normal distribution map_input("mu", Constant$new("mu_temp_distrib", 6.35) # with a constant mean )$ map_input("sigma", Constant$new("sd_temp_distrib", 2.83) # and constant variance )
Then, we define the Ratkowsky model (Ratkowsky_model
), mapping the temperature to the element we just defined. In this case, we assume there is no variability or uncertainty in the model parameters.
mu_distrib <- Ratkowsky_model$new("mu_distrib")$ # the Ratkowsky secondary model map_input("b", Constant$new("b_distrib", 0.0144) # with constant b )$ map_input("Tmin", Constant$new("Tmin_distrib", 1.6) # and constant Tmin )$ map_input("temperature", temp_distrib )
After that, we define the primary growth model. The package implements the ExponentialGrowthNmax
model, which is exactly what we need. This elements needs as input the storage time, for which we assume an exponential distribution with rate parameter 1/29
.
time_distrib <- Exponential$new("time_distrib")$ # an exponential distribution map_input("rate", Constant$new("rate_time_distrib", 1/29) # with constant rate parameter )
Finally, we define the primary growth model (ExponentialGrowthNmax
). This element has several inputs. For the treatment time, we use the exponential distribution we just defined. The growth rate ("mu"
) is mapped to the output of the Ratkowsky model. Then, the initial concentration at storage is mapped to the output of the inactivation model. Finally, we will define a constant $N_{max}$ of 8 log CFU/g.
growth_distrib <- ExponentialGrowthNmax$new("growth_distrib")$ map_input("t", time_distrib # we assign the element from above to define the time )$ map_input("mu", mu_distrib # we assign the element from above to define the growth rate )$ map_input("logN0", inactivation # and logN0 to the output of the chlorine now )$ map_input("logNmax", Constant$new("logNmax", 8) # A constant logNmax )
The next step is to convert the microbial concentration (in log CFU/g) to microbial dose consumed. The package includes the Concentration2Dose
element to help in this calculation. Note that this element considers the fact that the dose is a sampling process (i.e., the output is a discrete number of cells). In order to obtain continuous values for the dose (e.g., a dose of 0.23 cells), one must use Concentration2Dose_continous
.
This element has two inputs, the microbial concentration at exposure and the serving size. For the latter, we use a uniform distribution that represents uncertainty (level = 1
). For the microbial concentration, we map the output of the growth element from above.
serving_size <- Uniform$new("size", level = 1)$ map_input("min", Constant$new("min_size", 200))$ map_input("max", Constant$new("max_size", 500)) consumer_dose <- Concentration2Dose$new("dose")$ map_input("logN", growth_distrib)$ map_input("size", serving_size)
First, we need to define the dose-response model. We use the exponential dose response model with $r = 10^{-12}$, assuming constant pathogen-host survival probability.
Pill <- DoseResponse_Exponential$new("Pill")$ map_input("r", Constant$new("r_dr", 1e-12))$ map_input("dose", consumer_dose)
Then, we can estimate the number of cases. For that, biorisk includes the Pill2Cases_N
element to convert from probability of illness to number of cases. This element considers for each Monte Carlo iteration that the $n_{servings}$ have the same probability of illness. On the other hand, the element Pill2Cases_1
considers a single serving per $P_{ill}$. In this case, we will make the calculations per $10^{12}$ servings.
cases <- Pill2Cases_N$new("cases")$ map_input("Pill", Pill)$ map_input("servings", Constant$new("n_servings", 1e12))
The package includes the $check_input_types
method, which checks whether the dependencies in the model are consistent with respect to variables types. If we run it for this model, we get a warning.
cases$check_input_types(recursive = TRUE)
The message says that the dose-response model used for the calculation of Pill
expects the microbial dose to be a discrete variable. This makes sense, as one can consume 321 or 322 CFU but not 321.1. In the vignette with the second case study, we show how the dose can be calculated as a discrete variable using discrete models. Notheless, the model can still be simulated, as these simplifications are often required in QMRA.
The first thing we should always do is plot the model. This way we check that dependencies are correct.
plot_model(cases)
Remember that we can always zoom-in on some parts of the model. For instance, we can look only at the secondary model for inactivation:
plot_model(k_chlorine)
Although the model has been defined as a 2D Monte Carlo, biorisk also allows simpler simulations without having to modify the model. For instance, we can get an approximate, discrete prediction for the output.
cases$point_estimate()
We can also run the model as a 1D Monte Carlo (i.e., without making a difference between the uncertainty and variability levels) by calling the simulate()
method:
cases$simulate(100000, seed = 241)
We can then visualize the number of cases per $10^{12}$ servings, for instance, as a histogram. Note that we log-transform it due to the heavy tail. This removes the 0's (i.e., simulations without any case). Therefore, it is recommended to take a carefull look at the warning message showing the number of 0's in the simulations.
cases$histogram(add_discrete = TRUE) + scale_x_log10()
By passing add_discrete = TRUE
, the plot includes a dashed line with the discrete prediction. Note that this is quite biased with respect to the histogram. The reasons for this is the use of asymmetric distributions and the nonlinear models.
We can also easily visualize the output of other elements besides the final outcome. For instance, we can observe the microbial concentration at the end of storage as a density plot.
growth_distrib$density_plot(TRUE)
The model output can be visualized as a table of quantiles with the quantile_table()
function.
quantile_table(cases, chosen = c("cases", "Pill", "dose", "growth_distrib"), probs = c(.5, .9, .99))
We can also generate a plot to check the variation of the microbial concentration on each step using plot_outputs()
. Note that, by default, this function plots the output of every element. Considering that this mixes different units, it is recommended to always defined the elements to plot using the chosen
argument.
plot_outputs(cases, chosen = c("logN0", "Treatment", "growth_distrib"), type = "violin")
The biorisk package includes also several functions for sensitivity analysis. The tornado_plot()
function generates a tornado plot. It takes as input the output node for which to make the plot.
tornado_plot(cases)
By default, it uses every node in the model. This can be modified using the chosen
argument, which takes a character vector of node names.
tornado_plot(cases, chosen = c("temp_distrib", "Treatment"))
Moreover, the package includes the regression_tree()
function to make a regression tree of the output. This function also takes as input an output node. Then, it returns a regression tree built using the rpart package. In order to plot the tree, the argument make_plot
must be set to TRUE
.
regression_tree(cases, make_plot = TRUE)
By default, the tree is built using every node in the model. This can be modified using the chosen
argument, which also takes a character vector of node names.
regression_tree(cases, chosen = c("temp_distrib", "time_distrib", "k_chlorine"), make_plot = TRUE)
To run the 2D-MC simulation, we just need to call the simulate_2D()
method. This function takes two arguments, indicating the number of simulations on each level.
cases$simulate_2D(1000, 100, seed = 792)
Again, we can visualize the output of any element. For instance, we can depict the concentration after storage as a density plot. Note that, to represent both uncertainty and variability, we need to call the density_plot_2D()
method.
growth_distrib$density_plot_2D()
In this plot, the distribution from level 0 (variability) is coloured blue, whereas the distribution including all the sources of variation (variability & uncertainty) is depicted in grey. Note that the latter is very similar to the denstiy plot generated above for the 1D Monte Carlo simulation.
The output can also be visualized as a cumulative distribution, where the line represents level 0 and the ribbon the additional variation due to the uncertainty level.
growth_distrib$cummulative_plot_2D()
The results can also be visualized as a table of quantiles with the quantile_table_2D()
function.
quantile_table_2D(cases, chosen = c("Pill", "dose", "growth_distrib"), probs = c(.5, .9, .99))
In this case, the quantiles are calculated both under level 0 (variability) and over the complete model (variability & uncertainty). Note that the quantiles for level 1 are practically identical to those calculated above for the 1D Monte Carlo simulation.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.