library(driftSim)
library(ggplot2)
knitr::opts_chunk$set(echo = TRUE, fig.width=7,fig.height=6,
                      message=FALSE, results=FALSE,warning=FALSE)

1. Foreword

This is vignette describes the transmission model aspect of the binding avidity and antigenic drift simulation, driftSim. You should refer to the README for instructions on how to use the package - this document is just intended to describe the model itself.

2. SIR model

The heart of the simulation is a stochastic, agent based SIR model. The model can be described by the following set of ordinary differential equations:\

\begin{equation} \frac{dS}{dt} = \mu N - \beta SI - \mu S + \omega R\ \frac{dI}{dt} = \beta SI - (\mu + \gamma)I\ \frac{dR}{dt} = \gamma I - (\mu + \omega)R \end{equation}\

Where $S$ is the number of susceptible individuals; $I$ is the number of infected individuals; $R$ is the number of fully immune individuals; $\mu$ is the daily birth and death rate; $\beta$ is the transmission rate (see below); $\omega$ is the rate at which full, short term immunity wanes; and $\gamma$ is the recovery rate.

The model can be approximately described by a set of difference equations which are implemented via a "$\tau$-leap" algorithm to provide the stochastic, agent based form of the above SIR model. These equations are:

\ \begin{equation} S(t + \delta t) = S(t) + \delta M_B + \delta M_W - \delta M_T - \delta M_{D_S}\ I(t + \delta t) = I(t) + \delta M_T - \delta M_{D_I} - \delta M_R\ R(t + \delta t) = R(t) + \delta M_R - \delta M_W - \delta M_{D_R} \end{equation} \

Where the nomencalture of $\delta M_X$ describes the number of events of type $X$ that occur within a small, fixed time interval, $\delta t$. The number of events that occur are approximately Poisson distributed such that:

\ \begin{equation} \delta M_B \approx \text{Poisson}(\mu N \delta t)\ \delta M_T \approx \text{Poisson}(\beta SI \delta t)\ \delta M_{D_S} \approx \text{Poisson}(\mu S \delta t)\ \delta M_{D_I} \approx \text{Poisson}(\mu I \delta t)\ \delta M_{D_R} \approx \text{Poisson}(\mu R \delta t)\ \delta M_R \approx \text{Poisson}(\gamma I \delta t)\ \delta M_W \approx \text{Poisson}(\omega R \delta t) \end{equation} \

Please refer to Chapter 6 of Keeling and Rohani for more information.

3. Transmission probabilities

The core of this model is the transmission rate, $\beta$. This parameter is conditional on both properties of the host population and properties of the infecting virus. We describe here the parts of this parameter that are directly relevant to the tranmission model below, but readers should refer to the other vignette on within-host properties for a full description of the relationship between host immunity and binding avidity on transmission.

3.1 Host population parameters

There are two key properties of individual hosts that mediate transmission probability in this model:

Whilst the contact rate has a straightforward interpretation (average number of contacts made by an individual per day), the level of effective adaptive immunity against a given virus is slightly more complicated.

Hosts have antibody-mediated immunity $j$ against all viruses that they have been previously exposed to. If a host is infected and then recovers, we assume that that host generates an antibody boost of $k\sim\text{Poisson}(\psi - \frac{\psi}{j_{max}}j)$ units upon recovery. However, we assume that the level of boosting is inversely proportional to the amount of antibody already present, such that the realised boost is given by:

\begin{equation} j^{\prime} = j + \text{Poisson}(\psi - \frac{\psi}{j_{max}}j) \end{equation}

where $j^{\prime}$ is the antibody titer following recovery; $j$ is the antibody titer before recovery; $\sigma$ is the mean level of unmediated boost; and $j_{max}$ is the maximum achievable antibody titer. See Jacobson et al. 2015 for a gateway into the literature on antibody ceiling effects. Also refer to Kucharski et al. 2015 for antibody boosting in the context of epidemiological models.

For a completely naive individual where $\sigma = 3$, the distribution of boosts is:

 base <- ggplot(data.frame(x=c(0,3*3)),aes(x)) + stat_function(fun=dpois,geom="bar",colour="black",n=(3*3 + 1),args=list(lambda=3))+
                xlab("Magnitude of boost following infection") +
                ylab("Probability") +
                scale_y_continuous(expand=c(0,0))+
                scale_x_continuous(expand=c(0,0))+
                theme(
                    text=element_text(colour="gray20",size=14),
                    plot.title=element_text(size=28),
                    legend.text=element_text(size=20,colour="gray20"),
                    legend.title=element_text(size=20,colour="gray20"),
                    panel.grid.major = element_blank(),
                    panel.grid.minor = element_blank(),
                    axis.line=element_line(colour="gray20"),
                    axis.line.x = element_line(colour = "gray20"),
                    axis.line.y=element_line(colour="gray20"),
                    axis.text.x=element_text(colour="gray20"),
                    panel.background=element_blank(),
                    axis.text.y=element_text(colour="gray20"))
            base

3.2 Virus properties

The key feature of adaptive immunity in this model is that the total level of immunity a host has, $j$, is not fully effective against novel viruses, similar to the idea of antigenic cartography or antibody landscapes described by Fonville et al. 2014. In this model, we consider each unique infection (infectee strain) to generate a novel virus phenotype with some antigenic distance, $\delta$ to it's parent (infector strain). In terms of model implementation, this means that we record every individual virus that causes an infection; the host that this virus infects and the identity of the parent virus such that we can reconstruct an entire phylogeny.

We choose an arbitrary virus to represent the "root" of the virus phylogenetic tree, such that the "antigenic distance" of a given virus $i$, $\delta_i$ is given as the antigenic distance from $i$ to this root virus unless otherwise specified. Similarly, $\delta_{ij}$ denotes the antigenic distance between viruses $i$ and $j$. The antigenic distance to self is always 0, such that hosts have fully effective immunity against a strain that they have seen before.

In the basic form of the model where there is no antigenic drift, this antigenic distance is fixed at 0, such that exposure to any virus within a given phylogeny elicits full antibody-mediated immunity to all other viruses. In the case where we allow antigenic drift, we assume that once during an infection, there may be a mutation event, denoted $m$, with some probability denoted by $P(m)$. If $m$ occurs, an antigenic change is drawn from an exponential distribution such that the new antigenic distance to root is given by:

\begin{equation} \delta_{new} = \delta_{cur} + \text{Exp}(\Delta) \end{equation}

Where $\Delta$ is the mean of the exponential distribution. The resulting behaviour is that most infections result in very small antigenic changes, whereas occasional large antigenic jumps are observed; thereby rendering the host population effectively naive. The plot below depicts probability of mutations of a given size with an exponential rate parameter of 1.3.

attach(exampleParameters)
 base <- ggplot(data.frame(x=c(0,10)),aes(x)) + stat_function(fun=dexp,geom="line",colour="red",args=list(rate=1/viruspars["expDist"])) +
             xlab("Size of mutation") +
                ylab("Probability (given a mutation did occur)") +
                scale_x_continuous(expand=c(0,0))+
                theme(
                    text=element_text(colour="gray20",size=14),
                    plot.title=element_text(size=28),
                    legend.text=element_text(size=20,colour="gray20"),
                    legend.title=element_text(size=20,colour="gray20"),
                    panel.grid.major = element_blank(),
                    panel.grid.minor = element_blank(),
                    axis.line=element_line(colour="gray20"),
                    axis.line.x = element_line(colour = "gray20"),
                    axis.line.y=element_line(colour="gray20"),
                    axis.text.x=element_text(colour="gray20"),
                    panel.background=element_blank(),
                    axis.text.y=element_text(colour="gray20"))
  base

3.3 Effective immunity

We combine the properties of host immunity and antigenic distance to give a measure of effective immunity, $J$, for a given host against an infecting virus. When hosts recover from infection with a particular virus, $\upsilon_i$, the identity of this virus is stored in the host's infection history (in addition to the host receiving a boost in antibody titer). In the context of immunity, any given susceptible host, $i$ is therefore defined by the contents of their infection history, $\textbf{h}_i$, and their total homologous antibody titer, $j_i$:

\begin{equation} s : {\textbf{h}_i, j_i}\ \textbf{h}_i = {\upsilon_1, \upsilon_2, \dots \upsilon_n} \end{equation}

Upon contact with an infected host who is infected with virus $s$ ($\upsilon_s$), the level of effective immunity, $J_{is}$, host $i$ has against virus $\upsilon_s$ is given by the total antibody titer less the smallest antigenic distance between $\upsilon_s$ and all viruses in the host's immune history $\textbf{h}_i$:

\begin{equation} J_{is} = \text{max}(0,j_i - \text{min}(\delta_{li})), l \in \textbf{h}_i \end{equation}

This $J$ feeds into the "probability of evading immune system" term described in the accompanying vignette, which ultimately defines the level of infectiousness, $\rho$. The transmission rate is then defined as:

\begin{equation} \beta = c \cdot \rho \end{equation}

If a host is successfully infected with a virus, $\upsilon_s$, then this virus is added to the infection history such that:

\begin{equation} \textbf{h}_i^{new} = \textbf{h}_i \cup {\upsilon_s} \end{equation}

4. Virus phylogeny

Each successful infection creates a new virus from the infecting parent virus. In other words, the total number of distinct viruses is equal to the total incidence. As each virus is a direct descendent of its parent, the new infecting virus shares the properties of the parent at the time of infection, $t_0$. Five key properties are stored for each virus:

  1. The binding avidity at time of infection, $V_0$
  2. The current binding avidity, $V$
  3. The ID of the parent virus, $j$
  4. The antigenic distance from parent, $\delta_{ij}$ where $i$ is the virus and $j$ is the parent virus
  5. The ID of the host that the virus infected (for implementation purposes)

The full virus phylogeny can easily be reconstructed using property 3. Taking each extant virus, create a link to its parent virus, then create a link to its parent's parent etc. until the root virus is found. Similarly, the antigenic distance between any two viruses can be calculated using the following pseudocode:

```r{eval=FALSE}

Algorithm to find antigenic distance between two viruses

  1. distance <- 0
  2. A <- number of parent viruses to root of virus i
  3. B <- number of parent viruses to root of virus j
  4. Swap A and B such that A is the virus with fewest ancestors to root
  5. while A < B:
  6. p <- parent of i
  7. distance <- distance + distance between i and p
  8. i <- p
  9. A <- number of parent virsuses to root of i
  10. if i == j:
  11. return distance
  12. p_i <- parent of i
  13. p_j <- parent of j
  14. distance <- distance + (distance between i and p_i) + (distance between j and p_j)

Note that here, i == NULL implies that i has no parent virus (ie. is the root virus)

  1. while p_i != NULL && p_j != NULL && p_i != p_i:
  2. i <- p_i
  3. j <- p_j
  4. p_i <- parent of i
  5. p_j <- parent of j
  6. distance <- distance + (distance between i and p_i) + (distance between j and p_j)
  7. if(p_i == p_j == NULL) distance <- Inf
  8. return distance ```

5. Adaptive binding avidity

In addition to antigenic drift described above, the binding avidity of a given virus, $V_i$, adapts to level of immune selection pressure elicited by the host during the course of infection. We assume that in a given time step, $\delta t$, during infection, binding avidity changes proportional to the fitness gradient at the current binding avidity level. We define the fitness gradient as proportional to the derivative of the within-host reproductive number, $R_{in}$, with respect to binding avidity. We assume that the binding avidity trait moves "towards" the value that gives the greatest $R_{in}$ over time. Based on $f(J, V_i)$ and $g(V_i)$ as defined in the accompanying vignette, the rate of change of binding avidity with respect to time is given by:

\begin{equation} \frac{dV}{dt} = \frac{dR_{in}}{dV} = g(V_i)f'(J,V_i) + g'(V_i)f(J,V_i) \end{equation}

Where $J = j - \delta$ as defined above.

To avoid solving this equation numerically for every time step for each adapting virus, we pre-compute a table of expected binding avidity change for a set of given binding avidities and host immunity values within a small time step, $\delta t$.

In other words, after each time step the new binding avidity of a given virus is calculated as: \begin{equation} V_{new} = V_{cur} + k_co(\delta t, V_{cur}, J) \end{equation}

Where $o(\delta_t, V_{cur}, J)$ is found by solving the ODEs above for the time step $\delta t$ given a current binding avidity value $V_{cur}$ and level of effective host immunity $J$ as described above. $k_c$ is a constant of proportionality.

We approximate these continuous traits by pre-computing the amount of change in time step $\delta t$, $o$, for a set of discrete values of $J$ and $V$. We store these in a matrix $O$:

\begin{equation} O = \begin{bmatrix} o_{V_1,J_1} & o_{V_1,J_2} & o_{V_1,J_3} & \dots & o_{V_1,J_n} \ o_{V_2,J_1} & o_{V_2,J_2} & o_{V_2,J_3} & \dots & o_{V_,J_2n} \ \vdots & \vdots & \vdots & \ddots & \vdots \ o_{V_d,J_1} & o_{V_d,J_2} & o_{V_d,J_3} & \dots & o_{V_d,J_n} \end{bmatrix} \end{equation}

Where: \begin{equation} V_{d+1} = V_d + \delta V\ J_{n + 1} = J_{n} + \delta J\ J = J_n \text{ if } J_n \leq J < J_{n + 1}\ V_{cur} = V_d \text{ if } V_d \leq V_{cur} < V_{d + 1} \end{equation}

$\delta V$ and $\delta J$ are chosen to be sufficiently small to allow a good trade off between accuracy and run time/memory usage.

Finally, we add an optional extra mechanism to describe how antigenic changes may occur as a by-product of binding avidity adaptation, such that:

\begin{equation} \delta_{new} = \delta_{cur} + \xi|o_{V_{cur}, J}| \end{equation}

Where $\xi$ is a constant of proportionality. Furthermore, we note that $J$ will also change as $\delta$ changes.



jameshay218/driftSim documentation built on Jan. 24, 2020, 6:14 p.m.