estimate_R: Estimated Instantaneous Reproduction Number 'estimate_R'...

Description Usage Arguments Details Value Author(s) References See Also Examples

View source: R/estimate_r.R

Description

Estimated Instantaneous Reproduction Number

estimate_R estimates the reproduction number of an epidemic, given the incidence time series and the serial interval distribution.

Usage

1
2
3
4
5
6
7
8
estimate_R(
  incid,
  method = c("non_parametric_si", "parametric_si", "uncertain_si", "si_from_data",
    "si_from_sample"),
  si_data = NULL,
  si_sample = NULL,
  config = make_config(incid = incid, method = method)
)

Arguments

incid

One of the following

  • A vector (or a dataframe with a single column) of non-negative integers containing the incidence time series

  • A dataframe of non-negative integers with either i) incid$I containing the total incidence, or ii) two columns, so that incid$local contains the incidence of cases due to local transmission and incid$imported contains the incidence of imported cases (with incid$local + incid$imported the total incidence). If the dataframe contains a column incid$dates, this is used for plotting. incid$dates must contains only dates in a row.

  • An object of class incidence

Note that the cases from the first time step are always all assumed to be imported cases.

method

One of "non_parametric_si", "parametric_si", "uncertain_si", "si_from_data" or "si_from_sample" (see details).

si_data

For method "si_from_data" ; the data on dates of symptoms of pairs of infector/infected individuals to be used to estimate the serial interval distribution This should be a dataframe with 5 columns:

  • EL: the lower bound of the symptom onset date of the infector (given as an integer)

  • ER: the upper bound of the symptom onset date of the infector (given as an integer). Should be such that ER>=EL. If the dates are known exactly use ER = EL

  • SL: the lower bound of the symptom onset date of the infected individual (given as an integer)

  • SR: the upper bound of the symptom onset date of the infected individual (given as an integer). Should be such that SR>=SL. If the dates are known exactly use SR = SL

  • type (optional): can have entries 0, 1, or 2, corresponding to doubly interval-censored, single interval-censored or exact observations, respectively, see Reich et al. Statist. Med. 2009. If not specified, this will be automatically computed from the dates

si_sample

For method "si_from_sample" ; a matrix where each column gives one distribution of the serial interval to be explored (see details).

config

An object of class estimate_R_config, as returned by function make_config.

Details

Analytical estimates of the reproduction number for an epidemic over predefined time windows can be obtained within a Bayesian framework, for a given discrete distribution of the serial interval (see references).

Several methods are available to specify the serial interval distribution.

In short there are five methods to specify the serial interval distribution (see help for function make_config for more detail on each method). In the first two methods, a unique serial interval distribution is considered, whereas in the last three, a range of serial interval distributions are integrated over:

R is estimated within a Bayesian framework, using a Gamma distributed prior, with mean and standard deviation which can be set using the 'mean_prior' and 'std_prior' arguments within the 'make_config' function, which can then be used to specify 'config' in the 'estimate_R' function. Default values are a mean prior of 5 and standard deviation of 5. This was set to a high prior value with large uncertainty so that if one estimates R to be below 1, the result is strongly data-driven.

R is estimated on time windows specified through the 'config' argument. These can be overlapping or not (see 'make_config' function and vignette for examples).

Value

an object of class estimate_R, with components:

Author(s)

Anne Cori a.cori@imperial.ac.uk

References

Cori, A. et al. A new framework and software to estimate time-varying reproduction numbers during epidemics (AJE 2013). Wallinga, J. and P. Teunis. Different epidemic curves for severe acute respiratory syndrome reveal similar impacts of control measures (AJE 2004). Reich, N.G. et al. Estimating incubation period distributions with coarse data (Statis. Med. 2009)

See Also

discr_si make_config

Examples

  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
## load data on pandemic flu in a school in 2009
data("Flu2009")

## estimate the reproduction number (method "non_parametric_si")
## when not specifying t_start and t_end in config, they are set to estimate
## the reproduction number on sliding weekly windows                          
res <- estimate_R(incid = Flu2009$incidence, 
                  method = "non_parametric_si",
                  config = make_config(list(si_distr = Flu2009$si_distr)))
plot(res)

## the second plot produced shows, at each each day,
## the estimate of the reproduction number over the 7-day window 
## finishing on that day.

## to specify t_start and t_end in config, e.g. to have biweekly sliding
## windows      
t_start <- seq(2, nrow(Flu2009$incidence)-13)   
t_end <- t_start + 13                 
res <- estimate_R(incid = Flu2009$incidence, 
                  method = "non_parametric_si",
                  config = make_config(list(
                      si_distr = Flu2009$si_distr, 
                      t_start = t_start, 
                      t_end = t_end)))
plot(res)

## the second plot produced shows, at each each day,
## the estimate of the reproduction number over the 14-day window 
## finishing on that day.

## example with an incidence object

## create fake data
library(incidence)
data <- c(0,1,1,2,1,3,4,5,5,5,5,4,4,26,6,7,9)
location <- sample(c("local","imported"), length(data), replace=TRUE)
location[1] <- "imported" # forcing the first case to be imported

## get incidence per group (location)
incid <- incidence(data, groups = location)

## Estimate R with assumptions on serial interval
res <- estimate_R(incid, method = "parametric_si",
                  config = make_config(list(
                  mean_si = 2.6, std_si = 1.5)))
plot(res)
## the second plot produced shows, at each each day,
## the estimate of the reproduction number over the 7-day window
## finishing on that day.

## estimate the reproduction number (method "parametric_si")
res <- estimate_R(Flu2009$incidence, method = "parametric_si",
                  config = make_config(list(mean_si = 2.6, std_si = 1.5)))
plot(res)
## the second plot produced shows, at each each day,
## the estimate of the reproduction number over the 7-day window
## finishing on that day.

## estimate the reproduction number (method "uncertain_si")
res <- estimate_R(Flu2009$incidence, method = "uncertain_si",
                  config = make_config(list(
                  mean_si = 2.6, std_mean_si = 1,
                  min_mean_si = 1, max_mean_si = 4.2,
                  std_si = 1.5, std_std_si = 0.5,
                  min_std_si = 0.5, max_std_si = 2.5,
                  n1 = 100, n2 = 100)))
plot(res)
## the bottom left plot produced shows, at each each day,
## the estimate of the reproduction number over the 7-day window
## finishing on that day.

## Not run: 
## Note the following examples use an MCMC routine
## to estimate the serial interval distribution from data,
## so they may take a few minutes to run

## load data on rotavirus
data("MockRotavirus")

################
mcmc_control <- make_mcmc_control(
  burnin = 1000, # first 1000 iterations discarded as burn-in
  thin = 10, # every 10th iteration will be kept, the rest discarded
  seed = 1) # set the seed to make the process reproducible
 
R_si_from_data <- estimate_R(
  incid = MockRotavirus$incidence,
  method = "si_from_data",
  si_data = MockRotavirus$si_data, # symptom onset data
  config = make_config(si_parametric_distr = "G", # gamma dist. for SI
     mcmc_control = mcmc_control,
     n1 = 500, # number of posterior samples of SI dist.
     n2 = 50, # number of posterior samples of Rt dist.
     seed = 2)) # set seed for reproducibility

## compare with version with no uncertainty
R_Parametric <- estimate_R(MockRotavirus$incidence,
                          method = "parametric_si",
                          config = make_config(list(
                          mean_si = mean(R_si_from_data$SI.Moments$Mean),
                             std_si = mean(R_si_from_data$SI.Moments$Std))))
## generate plots
p_uncertainty <- plot(R_si_from_data, "R", options_R=list(ylim=c(0, 1.5)))
p_no_uncertainty <- plot(R_Parametric, "R", options_R=list(ylim=c(0, 1.5)))
gridExtra::grid.arrange(p_uncertainty, p_no_uncertainty,ncol=2)

## the left hand side graph is with uncertainty in the SI distribution, the
## right hand side without.
## The credible intervals are wider when accounting for uncertainty in the SI
## distribution.

## estimate the reproduction number (method "si_from_sample")
MCMC_seed <- 1
overall_seed <- 2
SI.fit <- coarseDataTools::dic.fit.mcmc(dat = MockRotavirus$si_data,
                 dist = "G",
                 init.pars = init_mcmc_params(MockRotavirus$si_data, "G"),
                 burnin = 1000,
                 n.samples = 5000,
                 seed = MCMC_seed)
si_sample <- coarse2estim(SI.fit, thin = 10)$si_sample
R_si_from_sample <- estimate_R(MockRotavirus$incidence,
                               method = "si_from_sample",
                               si_sample = si_sample,
                               config = make_config(list(n2 = 50, 
                               seed = overall_seed)))
plot(R_si_from_sample)

## check that R_si_from_sample is the same as R_si_from_data
## since they were generated using the same MCMC algorithm to generate the SI
## sample (either internally to EpiEstim or externally)
all(R_si_from_sample$R$`Mean(R)` == R_si_from_data$R$`Mean(R)`)

## End(Not run)

annecori/EpiEstim documentation built on May 4, 2021, 9:41 a.m.