Basic SIR Model Documentation

#load various variable definitions that are the same for each app
source('startup_script.R')
currentrmdfile = knitr::current_input() 
#currentrmdfile = "basicsir_documentation.Rmd" # For debugging
appsettings = get_settings(currentrmdfile,appdocdir,packagename)

Overview {#shinytab1}

This app implements the basic SIR (susceptible-infected-recovered) model and allows you to explore a very basic infectious disease simulation. The main goal is to provide familiarity with the overall setup and ideas behind using these simulations, and how to run them. Read about the model in the "Model" tab. Make sure to read the 'general notes' section. Then do the tasks described in the "What to do" tab. Finally, check out the "Further Information" tab to learn where you can find some background information on this (and many of the other) apps.

Learning Objectives

The Model {#shinytab2}

Model Overview

This model is a compartmental SIR (susceptible-infected-recovered) model. Compartmental means that we place individuals into distinct compartments, according to some characteristics. We then only track the total number of individuals in each of these compartments. In the simplest model, the only characteristic we track is a person's infection status. We allow for 3 different stages/compartments:

The SIR model is very basic. It could be extended by introducing further compartments. For instance, we could stratify according to gender, which would give us 2 sets of SIR compartments, one for males and one for females. Some of these extensions are implemented in other apps.

In addition to specifying the compartments of a model, we need to specify the processes/mechanisms determining the changes for each compartment. Broadly speaking, there are processes that increase the number of individuals in a given compartment/stage, and processes that lead to a reduction. Those processes are sometimes called inflows and outflows.

For our system, we specify only 2 processes/flows:

As with the compartments, we could extend the model and allow other processes to occur. For instance, we could allow for natural births and deaths, waning immunity, deaths due to disease, etc. Some of that will be included in other apps.

Model Representation

For compartmental models (and also often other types of models), it is useful to show a graphical schematic representation of the compartments and processes included in the model. For compartmental models, such a diagram/figure is usually called a flow diagram. Such a diagram consists of a box for each compartment, and arrows pointing in and out of boxes to describe flows and interactions. For the simple SIR model, the flow diagram looks as follows:

knitr::include_graphics( system.file(figuredir,appsettings$modelfigname,package=packagename))

Model Implementation I

To allow us to simulate this model, we need to implement it on the computer. For that purpose, it is often useful to write the model as mathematical equations (this is not strictly needed, some computer simulation models are never formulated as mathematical models). A very common way (but not the only one) to implement compartmental models such as the simple SIR model is a set of ordinary differential equations. Each compartment/variable gets an equation. The right side of each equation specifies the processes going on in the system and how they change the numbers in each compartment via inflows and outflows. For the model described above, the equations look like this:

$$ \begin{aligned} \dot S & = - b S I \ \dot I & = b S I - g I \ \dot R & = g I \end{aligned} $$

Note: If you don't see equations but instead gibberish, try opening the app with a different browser. I have found that occasionally, on some computers/browsers, the math is not shown properly.

Model Implementation II

Continuous time models implemented as ordinary differential equations are the most common types of models. However, other implementations of the above model are possible. One alternative formulation is a discrete-time deterministic equivalent to the ODE model. For such an implementation, the equations are:

$$ \begin{aligned} S_{t+dt} & = S_t + dt * \left( - b I_t S_t \right) \ I_{t+dt} & = I_t + dt * \left( b I_t S_t - g I_t \right) \ R_{t+dt} & = R_t + dt * \left( g I_t \right) \end{aligned} $$

In words, the number of susceptible/infected/recovered at a time step dt in the future is given by the number at the current time, t, plus/minus the various inflow and outflow processes. The latter need to be multiplied by the time step, since less of these events can happen if the time step is smaller. As the time-step gets small, this discrete-time model approximates the continuous-time model above. In fact, when we implement a continuous-time model on a computer, the underlying simulator runs a "smart" version of a discrete-time model and makes sure the steps taken are so small that the numerical simulation is a good approximation of the continuous-time model. If you want to learn more about that, you can check out the 'deSolve' R package documentation, which we use to run our simulations.

Some general notes

What to do {#shinytab3}

A few general notes

The tasks below are described in a way that assumes everything is in units of days (rate parameters, therefore, have units of inverse days). If any quantity is not given in those units, you need to convert it first (e.g. if it says a week, you need to convert it to 7 days).

# save all tasks, outcomes, etc. into an R data frame, then print later.
# this data frame is used to automate shiny solutions and quiz generation

# Explanation for each of the columns in the R data frame 

# QuizID: Used by the grading app. naming structure "dsaide_appid" 
# AppTitle: Title used for the app in the dsaide app, from appsettings$apptitle
# AppID: App ID (short version of title), from appsettings$appid
# TaskID: Number for each task the text belongs to 
# TaskText: The text that explains what to do for the task
# RecordID: Identifies if it is the first, second,... item to record, labeling is T1R1, T1R2, etc.
# Record: Text explaining what value from the model that nee too be recorded
# Type: type of entry: 
#     Character (single letter - for multiple choice), 
#     Text (any text - should be avoided, hard to match/grade),
#     Numeric
#     Rounded_Integer
#     Rounded_Numeric
#     Integer
#     Logical - could be Yes/No or True/False. Will be evaluated the same in grading app.
#     None - if nothing is to be reported for a given task
# Note: Used by students taking quiz. Makes it clear what type of value to enter in "Answers"

#this is the running counter for the records which starts at 1 
rc=1

#empty object, will hold all outcomes
alloutcomes = NULL

# Task 1
tid = 1
tasktext = "Start with 1000 susceptible individuals and 1 initially infected host, no recovered. Set simulation duration to 100 days, start time and time step can remain at 0 and 0.1. Set recovery rate _g_=0.5 per day, and infectiousness _b_=0.001. You will get an outbreak of some - currently unspecified - infectious disease. If you did it correctly, your outbreak should end with around 203 uninfected individuals still remaining. Also take a look at the final number of infected. You'll see that it is at _3E-9_, i.e. much less than 1 but not 0. That's the issue discussed above in the notes, that for ODE models, numbers can drop below 1, and then can slowly go to 0 but not fully reach 0. Often, this biologically unreasonable feature of these kinds of models can be ignored, but sometimes it matters. You will learn more about this when you work through the stochastic model apps."

# Record for task 1
nrec = 1 # number of items to record
out_records = c("Total number of recovered at end of simulation")
out_types = rep("Rounded_Integer",nrec)
out_notes = rep("Report the rounded integer",nrec)
outcomes = data.frame( TaskID = rep(tid,nrec),
                       TaskText = rep(tasktext,nrec),
                      RecordID = paste0('T',tid,'R',(1:nrec)),
                      Record = out_records, 
                      Type = out_types, 
                      Note = out_notes)
alloutcomes = rbind(alloutcomes,outcomes)
rc = rc + nrec #increment record counter by number of outcomes to record for this task 




# Task 2
tid = 2
tasktext = "Switch the plotting to have x-axis, y-axis or both plotted on a log scale. Leave all other settings as before. Note that while the look of the plot changes a lot, nothing has changed about the underlying simulation. The results are exactly the same in each case, only plotted differently. This is something to be aware of when you see plots in papers or produce your own. The best plot to use is the one that shows results of interest in the clearest form. Usually, the x-axis is linear and the y-axis is either linear or logarithmic. Try again switching between ggplot and plotly as plotting engine. Play around a bit with plotly. It has a number of interactive features, such as zooming, reading off values from a graph, turning on/off specific lines, etc. Those features will likely be useful as you go through the different apps in DSAIDE, thus I encourage you to explore the plotly functionality somewhat."

# Record for task 2
nrec = 1 # number of items to record
out_records = c("Nothing")
out_types = rep("None",nrec)
out_notes = c("")
outcomes = data.frame( TaskID = rep(tid,nrec),
                       TaskText = rep(tasktext,nrec),
                      RecordID = paste0('T',tid,'R',(1:nrec)),
                      Record = out_records, 
                      Type = out_types, 
                      Note = out_notes)
alloutcomes = rbind(alloutcomes,outcomes)
rc = rc + nrec #increment record counter by number of outcomes to record for this task 



### Task 3 
tid = 3
tasktext = "From the graph, get a (rough) estimate of the day at which the outbreak peaks. That's easiest done when you make a plotly plot. Contemplate the fact that the outbreak ends even though there are still susceptible remaining, i.e. not everyone got infected. Do you find that surprising? How could you maybe explain that? This topic is discussed in more detail in the reproductive number app. 

You have no infected left at the end of the outbreak. Figure out how you can use the information from either the susceptibles or recovered left at the end of the outbreak to figure out how many people in total got infected during the outbreak. To that end, think about how individuals move through the different compartments in this model. What can happen to those in the S/I/R compartments as the outbreak progresses? Where are they at the end of the outbreak? Use either information about the number of individuals in S or R at the beginning and end of the outbreak to determine the total (cumulative) number of individuals that got infected during the outbreak.

Rerun the simulation, with the same values you just had. Does anything change? Why (not)?"

# Record for task 3
nrec = 2 # number of items to record
out_records = c("Day the outbreak peaked",
                    "Total/cumulative number infected  (compute as described in text)")
out_types = rep("Rounded_Integer",nrec)
out_notes = rep("Report the rounded integer",nrec)
outcomes = data.frame( TaskID = rep(tid,nrec),
                       TaskText = rep(tasktext,nrec),
                      RecordID = paste0('T',tid,'R',(1:nrec)),
                      Record = out_records, 
                      Type = out_types, 
                      Note = out_notes)
alloutcomes = rbind(alloutcomes,outcomes)
rc = rc + nrec


### Task 4
tid = 4
tasktext = "Set _Models to run_ to _both_, which runs and shows both the continuous-time and discrete-time models. Start with a discrete-time step of 0.01. Leave all other settings as before. Run the simulation, see what you get. You should see the results from the 2 models essentially on top of each other and barely distinguishable."

# Record for task 4
nrec = 2 # number of items to record
out_records = c("Number of susceptible at end of simulation for continuous-time model",
                    "Number of susceptible at end of simulation for discrete-time model")
out_types = rep("Rounded_Integer",nrec)
out_notes = rep("Report the rounded integer",nrec)
outcomes = data.frame( TaskID = rep(tid,nrec),
                       TaskText = rep(tasktext,nrec),
                      RecordID = paste0('T',tid,'R',(1:nrec)),
                      Record = out_records, 
                      Type = out_types, 
                      Note = out_notes)
alloutcomes = rbind(alloutcomes,outcomes)
rc = rc + nrec


### Task 5
tid = 5
tasktext = "Now try different values for _dt_. Leave all other settings as before. You should notice that as _dt_ gets larger, the differences between discrete-time and continuous-time models increase. At some point when the time-step gets too large, the discrete-time simulation 'crashes' and you get an error message. This doesn't impact the continuous/ODE simulation since it chooses its time-step during the simulation internally and _dt_ only affects the times for which the results are returned."

# Record for task 5
nrec = 2 # number of items to record
out_records = c("Number of susceptible left at end of simulation for continuous-time model dt=1",
                    "Number of susceptible left at end of simulation for discrete-time model dt=1")
out_types = rep("Rounded_Integer",nrec)
out_notes = rep("Report the rounded integer",nrec)
outcomes = data.frame( TaskID = rep(tid,nrec),
                       TaskText = rep(tasktext,nrec),
                      RecordID = paste0('T',tid,'R',(1:nrec)),
                      Record = out_records, 
                      Type = out_types, 
                      Note = out_notes)
alloutcomes = rbind(alloutcomes,outcomes)
rc = rc + nrec


### Task 6
tid = 6
tasktext = "Go back to running the ODE model only. Set all parameters as described in task 1. Then increase the rate of recovery by a factor of four, i.e. set it to g=2. (What duration of the infectious period does that translate to?) Contemplate what you expect to find for the increased recovery rate before you run the simulation. Then run the simulation. Compare your expectations with the results. How do they agree/disagree? Does it make sense? Anything surprising happening?"

# Record for task 6
nrec = 3 # number of items to record
out_records = c("Number of susceptible at end of simulation",
                    "Number of recovered at end of simulation",
                    "Do you see an outbreak? (Yes/No)")
out_types = c("Rounded_Integer",
                    "Rounded_Integer",
                    "Logical")
out_notes = c("Report the rounded integer",
                    "Report the rounded integer",
                    "Report Yes or No")
outcomes = data.frame( TaskID = rep(tid,nrec),
                       TaskText = rep(tasktext,nrec),
                      RecordID = paste0('T',tid,'R',(1:nrec)),
                      Record = out_records, 
                      Type = out_types, 
                      Note = out_notes)
alloutcomes = rbind(alloutcomes,outcomes)
rc = rc + nrec



### Task 7
tid = 7 
tasktext = "Increase the transmission rate by a factor of 4, leave everything else as previously. How do you expect the results to change? (Try to make your prediction as precise/quantitative as you can.) Run the simulation with these new parameter settings. Compare your expectations with the results. How do they agree/disagree? Does it make sense? Compare the values for number susceptible/recovered at end of simulation and the peak of the outbreak to the values you found in the initial tasks. Hint: You should see that the _magnitude_ of the outbreak is essentially the same, but the _timing_ has changed."

# Record for task 7
nrec = 4 # number of items to record
out_records = c("Number of susceptible at end of simulation",
                    "Number of recovered at end of simulation",
                    "Do you see an outbreak? (Yes/No)",
                    "Day at which outbreak peaked")
out_types = c("Rounded_Integer",
                    "Rounded_Integer",
                    "Logical",
                    "Rounded_Integer")
out_notes = c("Report the rounded integer",
                    "Report the rounded integer",
                    "Report Yes or No",
              "Report the rounded integer")
outcomes = data.frame( TaskID = rep(tid,nrec),
                       TaskText = rep(tasktext,nrec),
                      RecordID = paste0('T',tid,'R',(1:nrec)),
                      Record = out_records, 
                      Type = out_types, 
                      Note = out_notes)
alloutcomes = rbind(alloutcomes,outcomes)
rc = rc + nrec


### Task 8 
tid = 8
tasktext = "Double the number of susceptibles, _S~0~_, leave everything else as you just had. How do you expect the results to change? Run the simulation with these new parameter settings. Compare your expectations with the results. How do they agree/disagree? Does it make sense? Anything surprising happening?"

# Record for task 8
nrec =4 # number of items to record
out_records = c("Number of susceptible at end of simulation",
                    "Number of recovered at end of simulation",
                    "Do you see an outbreak? (Yes/No)",
                    "Day at which outbreak peaked")
out_types = c("Rounded_Integer",
                    "Rounded_Integer",
                    "Logical",
                    "Rounded_Integer")
out_notes = c("Report the rounded integer",
                    "Report the rounded integer",
                    "Report Yes or No",
              "Report the rounded integer")
outcomes = data.frame( TaskID = rep(tid,nrec),
                       TaskText = rep(tasktext,nrec),
                      RecordID = paste0('T',tid,'R',(1:nrec)),
                      Record = out_records, 
                      Type = out_types, 
                      Note = out_notes)
alloutcomes = rbind(alloutcomes,outcomes)
rc = rc + nrec


# Task 9
tid = 9
tasktext = "Keep playing around with changing any of the parameters and starting conditions. Every time, think about what you expect to get, then run the simulation, compare your expectations with the results. Then make sense of it. What is the minimum and maximum number of outbreaks you can get? Why is that?"

# Record for task 9
nrec = 2 # number of items to record
out_records = c("Minimum number of outbreaks you can get",
                    "Maximum number of outbreaks you can get")
out_types = rep("Integer",nrec)
out_notes = rep("Report an integer",nrec)
outcomes = data.frame( TaskID = rep(tid,nrec),
                       TaskText = rep(tasktext,nrec),
                      RecordID = paste0('T',tid,'R',(1:nrec)),
                      Record = out_records, 
                      Type = out_types, 
                      Note = out_notes)
alloutcomes = rbind(alloutcomes,outcomes)
rc = rc + nrec
#save the fully filled task table to a tsv file
alloutcomes$QuizID = paste0(packagename,"_",appsettings$appid)
alloutcomes$AppTitle = appsettings$apptitle
alloutcomes$AppID = appsettings$appid
#remove a few variables from the data frame
savedoutcomes <- dplyr::select(alloutcomes,QuizID,AppID,AppTitle,TaskID,TaskText,RecordID,Record,Type,Note)     
write.table(savedoutcomes, paste0(appsettings$appid,"_tasktable.tsv"), append = FALSE, sep = "\t", row.names = F, col.names = TRUE)
# Take all the text stored in the table and print the tasks and items to record
write_tasktext(alloutcomes)

Further Information {#shinytab4}

References



Try the DSAIDE package in your browser

Any scripts or data that you put into this service are public.

DSAIDE documentation built on Aug. 24, 2023, 1:07 a.m.