knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
library(tidyr)
library(dplyr)
library(ggplot2)
library(squire.page)
This vignette describes the new fitting method where we optimise the $R_t$ values
for a set of given parameters.
Method Outline
Descriptive
Inputs are assumed to be a set of distributions for the non $R_t$ parameters and
a set of deaths that occur over set time periods, so $D_i$ deaths occur over the
time period $[t_{f,i}, t_{s,i}]$ for each $i \in 1$ to $T_D$. We assume that the
epidemic begins at time 0. We define our parameters to vary in the fitting
process as $\boldsymbol{R} = {R_t;\ t\in 0\mathrm{\ to\ }T_R}$. Each $R_t$ to
comes into effect at time $t_{R_t}$, by default this every 14 days with
$T_R = T_D - \max{\boldsymbol{\lambda}}$.
- Generate $N$ samples from our assumed distributions on our non $R_t$
parameters.
This will be discussed more in the section on Parameter Uncertainty.
- Determine which deaths will be used to fit each $R_t$. Using the structure of
the model we calculate the average time from infection to death (given death is
the outcome). This should match the delay between the $R_t$ value changing and
its effect on the modelled deaths. This value depends on the duration parameters
given to the model and on the number of hospital or ICU beds available, hence we
determine all 4 possible durations, $\boldsymbol{\lambda}$.
We define the relevant deaths for an $R_t$ by choosing all deaths that lie
within widest possible window that our $R_t$ value could directly effect, meaning
any death that has its entire time period occur within the bounds
$[t_{R_t} + \min(\boldsymbol{\lambda}), t_{R_{t+1}} + \max(\boldsymbol{\lambda})]$.
- For $R_0$ we set the bounds to $[0, t_{R_{1}} + \max(\boldsymbol{\lambda})]$,
and for $R_{T_R}$ we set the bounds to ${t_{R_{T_R}} + \min(\boldsymbol{\lambda}), t_{s,T_D}}$.
- For simplity we will define $\boldsymbol{i_{R_t}} = {i;\ t_{s,i} \geq t_{R_t} + \min(\boldsymbol{\lambda})\ \&\ t_{f,i} \leq \max(\boldsymbol{\lambda})}$, which is the indexes of all deaths that occur within
$R_t$'s fitting period
- For each distinct sample from the parameter distributions:
- Begin the estimation procedure by estimating $R_0$ and $I_0$, the
initial number of infections. $I_0$ is uniformly spread across the unvaccinated
working age population.
- When optimising $R_t$ we set upper and lower bounds on the values it can
take, which will be the hyper parameters $\alpha, \beta$ with $\alpha > \beta$
and $\alpha = 10, \beta = 0.5$ by default. To optimise $I_0$ we split a given
range of numbers into $N_p$ values, which we'll call $P_{I_0}$.
- To find our optimal values for $R_0$ and $I_0$ we optimise the log-likelihoods
for each element of $P_{I_0}$ in terms of $R_t$ using the Brent optmisation method.
The likelihood uses the negative binomial
distribution with a mean equal to the modelled deaths and $k$ a user defined
dispersion parameter on $D_i$, the deaths that fall within $R_0$ fitting window
as per step 2. We then take the $I_0$ and $R_t$ which the highest optimised
likelihood as our values for $R_0$ and $I_0$.
- Using $R_0$ and $I_0$ we calculate the state of the models compartments
at time $t_{R_1}$, which we will call $\boldsymbol{I_{R_1}}$.
- For each $R_t \in \boldsymbol{R}/R_0$:
- Choose $R_t$ by optimising the log-likelihood over all $D_i$ such that $i \in i_{R_t}$
- With $R_t$ and $\boldsymbol{I_{R_t}}$ calculate $\boldsymbol{I_{R_{t+1}}}$
which occurs at time $t_{R_t + 1}$.
- Should end up with a set $\boldsymbol{R}$ for each sample such that modelled
deaths are similar to the observed deaths.
Mathematical
Let $\boldsymbol{p}$ be a set of non-$R_t$ model parameters.
Then let $m_{I}(\boldsymbol{p}, \boldsymbol{I_{R_{t}}}, R_t, t)$ be a function that returns the model
state at time $t$, given the state of the model when $R_{t+1}$ comes into effect,
and $m_{d}(\boldsymbol{p}, \boldsymbol{I_{R_t}}, R_t, t_s, t_f)$ be a function that
produces the number of deaths in the model occuring over time period $[t_s, t_f]$.
For each $\boldsymbol{p}$:
- $$
{R_0, I_0} = \mathrm{argmax}{x,y \in [\alpha, \beta] \times \boldsymbol{P{I_0}}}\left(
\prod_{i \in \boldsymbol{i_{R_0}}}\frac{\Gamma(D_i + n_i)}{\Gamma(n_i)D_i!}(1-p_i)^{D_i}p_i^{n_i}
\right),
$$
where:
$$
p_i = \frac{k}{k + \mu_i},\ n_i = \frac{p_i \times \mu_i}{1-p_i} ,\ \ \mu_i = m_d(\boldsymbol{p}, y, x, t_{s,i}, t_{f,i}).
$$
- $\boldsymbol{I_{R_1}} = m_I(\boldsymbol{p}, I_0, R_0, t_{R_1})$, note that we
treat $I_0$ and $\boldsymbol{I_{R_0}}$ interchangeably here.
- For each element of ${t; t>0\ \&\ R_t \in [\alpha, \beta]}$
- $$
R_t = \mathrm{argmax}{x \in \boldsymbol{P{R_t}}}\left(
\prod_{i \in \boldsymbol{i_{R_t}}}\frac{\Gamma(D_i + n_i)}{\Gamma(n_i)D_i!}(1-p_i)^{D_i}p_i^{n_i} \right)
$$
where:
$$
p_i = \frac{k}{k + \mu_i},\ \ n_i = \frac{p_i \times \mu_i}{1-p_i},\ \ \mu_i = m_d(\boldsymbol{p}, \boldsymbol{I_{R_t}}, x, t_{s,i}, t_{f,i}).
$$
- $\boldsymbol{I_{R_{t+1}}} = m_I(\boldsymbol{p}, \boldsymbol{I_{R_{t}}}, R_t, t_{R_{t+1}})$
Diagram
knitr::include_graphics("./rt_optimise_example_annotated.png")
In this diagram, the black lines represent the estimate excess mortality data
and the coloured lines are the deaths produced by the chosen for each $R_t$.
Parameter Uncertainty
It is up to the user to setup the distributions on their parameters or just
pass the function and set of parameters they like.
In the lmic_reports these are drawn from distributions representative of the
epidemic, please see the methods page.
Limitations
It is necessary to bound $R_t$ above and below certain thresholds to avoid dramatically
high or low changes. So far all fits seem to work well with $R_t$ ranges around
$0.5$ or $0.9$ to $10$, though it is possible for there be a curve that cannot
be matched with this algorithm on simple bounds.
mrc-ide/squire.page documentation built on May 27, 2023, 11:20 a.m.