knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
This R Markdown document walks through the steps of using Bayesian inference to create more realistic estimates of time-dependent reproductive number, $R(t)$ that can be helpful for surveillance and intervention planning of infectious diseases, like COVID-19.
There are two ways to use linelineBayes
: either you have caseCount data, which are aggregated case counts by day, or you have Line-list data means you have a single row for each case, that has dates for: infection, symptom onset, positive test, and when this was reported to public health agencies.
library(linelistBayes)
Step 1. Load data
data("sample_dates") data("sample_location") data("sample_cases") head(sample_dates) head(sample_cases) head(sample_location)
Step 2. Creating case-counts
caseCounts <- create_caseCounts(date_vec = sample_dates, location_vec = sample_location, cases_vec = sample_cases) head(caseCounts)
Plot
plot(caseCounts)
Get the first wave only, just to speed processing
caseCounts <- caseCounts[1:80, ] plot(caseCounts)
Step 3. Define the serial interval.
The si()
function creates a vector of length 14 with alpha and beta as defined in Li and White, for COVID-19.
sip <- si(14, 4.29, 1.18) plot(sip, type = 'l')
Step 4. Run the back-calculation algorithm. Metropolis-Hastings within Gibbs sampling is used. NB dispersion (also called 'size') is a measure of over-dispersion (smaller size means more over-dispersion, which means variance != mean). For more information see Zeileis.
NOTE: when you run backnow
on case Count data, you assume internally some distribution of onset (and missingness). The code will run if some (but not all) onset data are present. If either all onset data are NA or none are NA, the code will not run (because you have all the information you would be simulating). This is the reason for the reportF_missP = 0.6
argument. The implied onset distribution is rnbinom()
with size = 3
and mu = 9
.
out_list_demo <- run_backnow(caseCounts, MAX_ITER = as.integer(2000), norm_sigma = 0.2, sip = sip, NB_maxdelay = as.integer(20), NB_size = as.integer(6), printProgress = 1, reportF_missP = 0.6)
Plot outputs. The points are aggregated reported cases, and the red line (and shaded confidence interval) represent the back-calculated case onsets that lead to the reported data.
plot(out_list_demo, "est")
data("out_list_demo") # plot oldpar <- par(mfrow = c(1,2)) par(oma = c(0, 0, 0, 0), mar = c(4, 4, 1, 1)) plot(out_list_demo, "est") par(oldpar)
You can also plot the R(t)
curve over time. In this case, the red line (and shaded confidence interval) represent the time-varying r(t). See Li and White for description.
plot(out_list_demo, "rt")
data("out_list_demo") # plot oldpar <- par(mfrow = c(1,2)) par(oma = c(0, 0, 0, 0), mar = c(4, 4, 1, 1)) plot(out_list_demo, "rt") par(oldpar)
Same result if you use with line_list data"
data("sample_report_dates") data("sample_onset_dates")
create your caseCounts_line
object
caseCounts_line <- create_linelist(report_dates = sample_report_dates, onset_dates = sample_onset_dates) head(caseCounts_line)
and run
out_list_demo <- run_backnow(caseCounts_line, MAX_ITER = as.integer(2000), norm_sigma = 0.2, sip = sip, NB_maxdelay = as.integer(20), NB_size = as.integer(6), printProgress = 1)
You can specify the distribution for caseCounts_line in either run_backnow
or convert_to_linelist
(this runs internally depending on the class of input
). reportF
is the distribution function, _args
lists the distribution params that are not x
, and _missP
is the percent missing. This must be between ${0 < x < 1}$. Both 'caseCounts' and 'caseCounts_line' objects can be fed into run_backnow
.
my_linelist <- convert_to_linelist(caseCounts, reportF = rnbinom, reportF_args = list(size = 3, mu = 9), reportF_missP = 0.6) head(my_linelist)
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.