In jpablo91/QuantRRA: A library for Reproducible Risk Assessment with R

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

Introduction

In this exercise we will be using the package quantrra for risk assessment. Make sure you have installed the latest version quantrra. The objectives for this lab are:

Familiarize with methods to develop risk assessment using R
Integrate qualitative and quantitative information for the development of RRA
Develop a system that can be easily updated and re-parametrized
Use random forest and regression trees to evaluate the sensitivity of the model
Use the model to evaluate the risk stratified for multiple countries.

# Libraries we will use
library(sf); library(dplyr)
library(quantrra)

Basics

Sampling distributions

By default, R has several functions to sample a distribution and plot the results. For example, if we would like to sample a normal distribution with a mean of 5 and standard deviation of 0.12, we can use the following code:

# sample the distribution and save it to an object
x <- rnorm(n = 100, # Number of observations to sample
           mean = 5, # Mean 
           sd = 0.12) # Standard deviation

# Plot the observations
hist(x)

In this lab we will be using the package quantrra, which is specifically developed for risk assessment in R. The function ra_sample() conveniently wraps functions to sample multiple distributions such as: Normal, binomial, uniform, pert, among others. The function requires two arguments:

x which is a string (text) of the distribution and its parameters.\
n the number of observations we will sample.

n <- 100 # number of observations
d <- 'Normal(5, 0.12)' # Distribution to sample

x <- ra_sample(x = d, n = n) # Function to sample the distribution

# We can use the function ra_plot_dist() from the package quantrra to get a more familiar output:
ra_plot_dist(x, # the values sampled
         main = 'Distribution of x') # A title for our plot

The model file

Now we will create our first model. The main function ra_run() requires two arguments:

m the model file
nsim number of simulations.

Lets see an example of a pre-made model file. The following model was based on a risk assessment performed by the OIRSA (Organismo Internacional Regional de Sanidad Agropecuaria) for the introduction of ASF into the countries that are part of OIRSA, you can read more about this report in this link. The model estimates the number of introduction events from imported animal products swill feeding in a given year.

To represent the model we must identify the inputs and the outputs we want to calculate. The following figure was adapted from the report by OIRSA

Lets have a look of how we would represent this in R. This model was adapted and its contained in the package, to access the model file use the following code:

# lets save the model to a new object:
m1 <- quantrra::OIRSA

# now we will examine the object
m1

The first data.frame (nodes) is what we call a node table, where each row represents a node in our risk assessment tree. The node table contains the following columns:

id, Unique identifier for the parameter. Use short names with no special characters or spaces starting with a letter, i.e. I1, P3, etc...
label Name of the parameter.
type If its an input (In) or output (Out) parameter. Input parameters must have a distribution specified, otherwise NA; Output parameters must have a formula specified, NA otherwise
level Hierarchy level of the parameter, This will determine the order in which the output parameters will be calculated.
distribution Distribution for the input parameters, must follow the specification of the ra_sample() function from quantrra, if the parameter is an output, must be NA.
formula Formula used to calculate the Out parameters, the formula must include other parameters from lower hierarchy levels. The formula con use any operators such as multiplication (*), sum (+), substraction (-), exponential (^), division(/), etc...

The second table in our list is an edge table, which is used for visualization purposes.

We can obtain a visual representation of our model using the function ra_plot_tree(). This function takes a list with named elements for the nodes and edges. Let's give it a try:

ra_plot_tree(m1)

Running the model

Let's run the model now. We use the function ra_run() on the object containing our model. The function only uses the node table, so we will specify to use only that

m1 <- OIRSA
# Run the model 5000 times
mo <- ra_run(m = m1$nodes, nsim = 5000)
# Visualize the results:
ra_plot_dist(mo$P)

The result of this is a table with the distributions sampled for the inputs, and the ones calculated for the outputs. Based on the results, we can estimate the probability that the number of introduction events will be more than 1:

sum(mo$P > 1) / length(mo$P)

Sensitivity analysis

We can perform sensitivity analysis on the model to identify the most relevant parameters and explore the parameters sample space, for this we can use the function ra_gsa() from the quantrra package. The function uses random forest to estimate the relative importance of the parameters, and classification and regression trees to visualize the interactions between the parameters sampled. The function requires 3 main arguments:

data: Our results table from the model.
formula A formula that contains the names of the outcome analyzed and the parameters.
tree: we can specify if we want an interactive or static visualization for our tree.

# First we specify the formula:
f <- P ~ P1 + P2 + P3 + R1 + R2 + H1 + H2
# Then we use the function with our results
sa <- ra_gsa(data = mo, f = f, tree = 'interactive')
# The results contain 3 objects:
sa$VarianceExp # The variance explained by our parameters

sa$RelImport # The relative importance of our parameters

sa$RT # The classification and regression tree

Case study: ASF

Now let's try a different model. We will load a model used in the publication: Quantitative risk assessment of African swine fever introduction into Spain by legal import of swine products. You can see more information of the parameters in the model by looking at the publication. The model is contained in the quantrra library and we can access to it by using:

# save the model to a new object
m2 <- quantrra::asf_products

# examine the model
m2

# Obtain a visual representation of the model
ra_plot_tree(m2)

You will notice that this model has 3 elements inside the list, we already covered the first two elements (nodes and edges). The third element is a stratified table, which is meant to provide different parametrization for each of the importing countries.

Similarly than with the previous example, we can run our model using the function ra_run():

ra_run(m = m2$nodes, nsim = 10e3) %>% # run the model
  pull(Pf) %>% # get the main output
  ra_plot_dist() # plot the distribution

But since we also have a stratified table, we can run the model stratified to estimate the risk of introduction for each of it's partner countries:

# Run the stratified model
mos <- ra_run_strat(m = m2$nodes, tbl = m2$stratified, nsim = 1000)

# The output will be a data frame with the mean and 95 percentiles for each of the outputs form our model:
head(mos)

As you notice, by default the function already computes the mean and 95 percentiles for us. If you want to keep all the results, we can use the argument simplify = F, but we wont do it here.

The function ra_plot_ranking() can be used to represent visually the estimates for each of the strata:

ra_plot_ranking(x = mos, var = "Pf")
ra_plot_ranking(x = mos, var = "Pf", format = "interactive")

mos %>%

w <- quantrra::wm %>% 
  left_join(mos, by = c("name" = "ids")) 

ggplot() +
  geom_sf(data = w, aes(fill = Pf_m), col = "black") +
  theme(
    legend.position = "bottom",
    legend.text = element_text(angle = 25),
    panel.background = element_blank()
  ) +
  scale_fill_gradient(low = "#200000", high = "red")

Using `quantrra` outside R

So far we covered how to use the functions in models that are already defined. One of the objectives of quantrra is to simplify the process of model sharing for reproducibility purposes. There are multiple ways to interact with quantrra outside R, this might be useful to facilitate the risk assessment and parametrization for people that does not feels as confortable using programming languages.

# default export method is in a xlsx
ra_export(m = m2, dir = "model")
# model can also be exported as a zip
ra_export(m = m2, dir = "model", format = "zip")

When model is created outside R, can be imported using the function ra_import() this function automatically identified if a model is in xlsx or zip format

ra_import("model.xlsx") %>% 
  ra_plot_tree()

jpablo91/QuantRRA documentation built on July 3, 2024, 10:46 p.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

jpablo91/QuantRRA
A library for Reproducible Risk Assessment with R

In jpablo91/QuantRRA: A library for Reproducible Risk Assessment with R

Introduction

Basics

Sampling distributions

The model file

Running the model

Sensitivity analysis

Case study: ASF

Using `quantrra` outside R

R Package Documentation

Browse R Packages

We want your feedback!

jpablo91/QuantRRA A library for Reproducible Risk Assessment with R

In jpablo91/QuantRRA: A library for Reproducible Risk Assessment with R

Introduction

Basics

Sampling distributions

The model file

Running the model

Sensitivity analysis

Case study: ASF

Using quantrra outside R

R Package Documentation

Browse R Packages

We want your feedback!

jpablo91/QuantRRA
A library for Reproducible Risk Assessment with R

Using `quantrra` outside R