library(learnr) knitr::opts_chunk$set(error = TRUE) options(knitr.kable.NA = '') set.seed(123)
In this practical we will write some code to run a simple epidemiological model -- the Susceptible-Infected-Susceptible or SIS model. From these results we will report on the transmission of E. coli O157 in cattle.
We have now built a simple population dynamics model, but the framework we are using is capable of handling much more. We will start with the simplest model of infectious disease dynamics -- the SIS model -- that we covered in the epidemiology lecture.
The SIS model requires two populations (of susceptible and infected animals) to replace the single population we have used so far. Susceptibles, $S$, and infecteds, $I$, are updated as follows:
$$S(t+1) = S(t) - \beta \times \frac{S(t) \times I(t)}{N} + \sigma \times I(t)$$ $$I(t+1) = I(t) + \beta \times \frac{S(t) \times I(t)}{N} - \sigma \times I(t)$$
where the total population, $N = S(t) + I(t)$, the sum of the number of susceptibles and infecteds is constant.
As we said earlier, it is a good idea to use descriptive names in your code to
avoid confusion, so we will call the two populations susceptibles
and
infecteds
- these will be the column names in your data frame. The
two epidemiological parameters are $\beta$ (beta, the transmission rate) and
$\sigma$ (sigma, the recovery rate) -- these Greek letters are almost universally
used as the names of basic transmission and recovery rates in epidemiology.
However, unless you're doing a lot of epidemiology, this isn't obvious. In any
event, in our function definition we will give them descriptive names again,
calling them generically transmission.rate
and recovery.rate
in the function
definition, and in the script we will be more specific and call them
ecoli.transmission
and ecoli.recovery
since this is the disease we are
going to be modelling (it also avoids reusing the same variable
names). The reason we're doing it this way round is that the function can
work for any disease model, but the script is specifically for E Coli.
The first thing to do is to create a new GitHub repository and RStudio project
for this practical. Do this following the how to instructions on the GitHub
documentation pages
that you used for the last practical, putting the repository in
the SBOHVM organisation again, as before. Call this
project githubusernameSeries02
(obviously use your real GitHub username!).
Remember that if your GitHub username has dashes or underscores, don't include
them in the repository name however, as they are illegal in R package names.
You'll use this single project for all of the exercises in this practical
series. If you are already in a coding group, then add the specific people that
need to be added to your repository on GitHub in Settings ->
Manage access -> Add people and just type in their GitHub
usernames (no @ sign). Give them read access. If you don't have a group yet,
ask the instructors which one you should join.
Create a new R file, using File > New > R
Script. You should adapt the code from Practical 1-5 (remember you now
have a project with all of this code in, and it is easy to have two RStudio
sessions so you can copy and refer to earlier code easily), creating 2 files --
0201-run-SIS.R
and 0201-step-SIS.R -- but where in 1-5
the data frame had only a count
column, now you will need to have two --
susceptibles
and infecteds
. You do not need to have the total population
size (N
, above) in there, because at every step it is just
susceptibles + infecteds
. In fact, with no death in this model, it is just
constant. You could for instance start things up with:
num.cattle <- 100 initial.infecteds <- 2 initial.susceptibles <- num.cattle - initial.infecteds
This allows us to easily keep the things we want to fixed, in this case the total population size, while allowing us to vary the things we are investigating. Writing the code to make this as easy as possible makes it harder to make mistakes, and that is always a good thing!
Once we've done that, we can create the data frame with those values -- this is the initial state of the population when we start the simulation:
herd.df <- data.frame(susceptibles = initial.susceptibles, infecteds = initial.infecteds)
Then we call the new step function with the current population and the parameters:
next.population <- step_deterministic_SIS(latest = tail(herd.df, 1), transmission.rate = ecoli.transmission, recovery.rate = ecoli.recovery)
If you are finding writing this code difficult, just ask, but remember that your function should:
Run the code in 0201-run-SIS.R. Remember
that the source()
command means that it automatically loads in
0201-step-SIS.R, meaning that you
don't need to load these files yourself. Don't forget to check that
your function is not using any global variables. Check that you get the same
results as you saw in the lecture. Adding the argument col = c("green", "red")
to plot.populations()
will allow you to change the colours of the (now) two
populations in the plot to match those from the lecture.
Remember from the lecture that the basic reproduction number, $R_0$, is given by
$$R_0 = \frac{\beta}{\sigma}$$
By changing the transmission rate, $\beta$, try running simulations for different values of $R_0$. What happens when $R_0 > 1$ and when $R_0 < 1$? Run the simulations for different values of the transmission rate (as suggested in the table below) with the default value of the recovery rate, $\sigma$, set to $\frac{1}{3}$. Calculate and record the corresponding value of $R_0$ using the formula, which we can rewrite as:
R0 <- ecoli.transmission / ecoli.recovery
Either using your plots or by asking R to
print out the data frame population
work out the proportion of the population
who are susceptible at equilibrium -- when the lines are flat -- and compare it
with the value of $\frac{1}{R_0}$ for one example value of the transmission
rate $\beta = 2/3$.
tmp <- vapply(2/3, function(x) as.character(MASS::fractions(x)), character(1)) table <- data.frame(tmp, NA, NA, NA) col <- c("Transmission <br> rate, $\\beta$", "Basic reproduction number, $R_0$", "Inverse $R_0$, <br> $\\frac{1}{R_0}$", "Proportion susceptible <br> at equilibrium, $\\frac{S}{N}$") col <- gsub("<", "<", col) col <- gsub(">", ">", col) colnames(table) <- col knitr::kable(table, escape = FALSE, align = c(rep('c', times = 4)))
Remember that throughout this series of exercises, the total population size, $N$ is fixed, but can be easily calculated from the different parts of the population so you never need to pass it as an argument or store it in the data frame.
Adapt your script to generate a report which shows you running the model with at least a couple of different transmission rates (including $2/3$) and perhaps try calculating different values for $R_0$ from the results with some explanation of what's going on. Note that for example for E. coli O157 infections in cattle, for timesteps in weeks, the recovery rate is about $\frac{1}{3}$ and the transmission rate is 1.
In this exercise, we want you to get a couple of other people to check your code and make sure it works for them, and we want you to check other people's code too.
Once you're happy with your code, commit your changes using the Git pane in
RStudio (just commit the R files - not, for instance, any
html files that were created by generating reports, though you may decide to
commit a spreadsheet with the table in as well). Don't forget then to push
the changes to GitHub and check on the website that it contains your new code.
Notify the partners in your group that you have something for them to
check. If you're not sitting next to them, then you can create an issue in your
repository asking for their review - you can contact
them by tagging them in the message with @theirusername
, and they should
receive an email and a notification on the website.
They should then create an issue, telling you if they had any problems
running your version of the practical.
Likewise, when you hear from someone else that a repo is available to check,
then open it using File > New Project...,
then Version Control, Git, and then put in the URL from GitHub and tick
Open in New Session
. In the new RStudio session that is opened, you can then
run their code and make sure it works. If it does not, just respond in the Issue
that they opened on GitHub explaining what went wrong (though it might be easier
to do this in person as well!). If it works, then you also need
to respond saying so. If you have already checked their repo once already, then
see the instructions in 2-2 for how to update your copy of it.
Remember that interacting like this through GitHub to help each other will count as your engagement marks for the course.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.