estimR: Estimation of the reproduction number with...

View source: R/estimR.R

estimRR Documentation

Estimation of the reproduction number with Laplacian-P-splines

Description

This routine estimates the instantaneous reproduction number R_t; the mean number of secondary infections generated by an infected individual at time t (White et al. 2020); by using Bayesian P-splines and Laplace approximations (Gressani et al. 2022). Estimation of R_t is based on a time series of incidence counts and (a discretized) serial interval distribution. The negative binomial distribution is used to model incidence count data and P-splines (Eilers and Marx, 1996) are used to smooth the epidemic curve. The link between the epidemic curve and the reproduction number is established via the renewal equation.

Usage

estimR(incidence, si, K = 30, dates = NULL, maxmethod = c("NelderMead","HillClimb"),
CoriR = FALSE, WTR = FALSE, optimstep = 0.3, priors = Rmodelpriors())

Arguments

incidence

A vector containing the incidence time series. If incidence contains NA values at certain time points, these are replaced by the average of the left- and right neighbor counts. If the right neighbor is NA, the left neighbor is used as a replacement value.

si

The (discrete) serial interval distribution.

K

Number of B-splines in the basis.

dates

A vector of dates in format "YYYY-MM-DD" (optional).

maxmethod

The method to maximize the hyperparameter posterior distribution.

CoriR

Should the R_t estimate of Cori (2013) be also computed?

WTR

Should the R_t estimate of Wallinga-Teunis (2004) be also computed?

optimstep

Learning rate for the "HillClimb" method to maximize the posterior distribution of the hyperparameters.

priors

A list containing the prior specification of the model hyperparameters as set in Rmodelpriors. See ?Rmodelpriors.

Details

The estimR routine estimates the reproduction number in a totally "sampling-free" fashion. The hyperparameter vector (containing the penalty parameter of the P-spline model and the overdispersion parameter of the negative binomial model for the incidence time series) is fixed at its maximum a posteriori (MAP). By default, the algorithm for maximization is the one of Nelder and Mead (1965). If maxmethod is set to "HillClimb", then a gradient ascent algorithm is used to maximize the hyperparameter posterior.

Value

A list with the following components:

  • incidence: The incidence time series.

  • si: The serial interval distribution.

  • RLPS: A data frame containing estimates of the reproduction number obtained with the Laplacian-P-splines methodology.

  • thetahat: The estimated vector of B-spline coefficients.

  • Sighat: The estimated variance-covariance matrix of the Laplace approximation to the conditional posterior distribution of the B-spline coefficients.

  • RCori: A data frame containing the estimates of the reproduction obtained with the method of Cori et al. (2013).

  • RWT: A data frame containing the estimates of the reproduction obtained with the method of Wallinga-Teunis (2004).

  • LPS_elapsed: The routine real elapsed time (in seconds) when estimation of the reproduction number is carried out with Laplacian-P-splines.

  • Cori_elapsed: The routine real elapsed time (in seconds) when estimation of the reproduction number is carried out with the method of Cori et al. (2013).

  • penparam: The estimated penalty parameter related to the P-spline model.

  • K: The number of B-splines used in the basis.

  • NegBinoverdisp: The estimated overdispersion parameter of the negative binomial distribution for the incidence time series.

  • optimconverged: Indicates whether the algorithm to maximize the posterior distribution of the hyperparameters has converged.

  • method: The method to estimate the reproduction number with Laplacian-P-splines.

  • optim_method: The chosen method to to maximize the posterior distribution of the hyperparameters.

Author(s)

Oswaldo Gressani oswaldo_gressani@hotmail.fr

References

Gressani, O., Wallinga, J., Althaus, C. L., Hens, N. and Faes, C. (2022). EpiLPS: A fast and flexible Bayesian tool for estimation of the time-varying reproduction number. Plos Computational Biology, 18(10): e1010618.

Cori, A., Ferguson, N.M., Fraser, C., Cauchemez, S. (2013). A new framework and software to estimate time-varying reproduction numbers during epidemics. American Journal of Epidemiology, 178(9):1505–1512.

Wallinga, J., & Teunis, P. (2004). Different epidemic curves for severe acute respiratory syndrome reveal similar impacts of control measures. American Journal of Epidemiology, 160(6), 509-516.

White, L.F., Moser, C.B., Thompson, R.N., Pagano, M. (2021). Statistical estimation of the reproductive number from case notification data. American Journal of Epidemiology, 190(4):611-620.

Eilers, P.H.C. and Marx, B.D. (1996). Flexible smoothing with B-splines and penalties. Statistical Science, 11(2):89-121.

Examples

# Illustration on simulated data
si <- Idist(mean = 5, sd = 3)$pvec
datasim <- episim(si = si, endepi = 60, Rpattern = 5, dist="negbin", overdisp = 50)
epifit_sim <- estimR(incidence = datasim$y, si = si, CoriR = TRUE)
plot(epifit_sim, addfit = "Cori")

# Illustration on the 2003 SARS epidemic in Hong Kong.
data(sars2003)
epifit_sars <- estimR(incidence = sars2003$incidence, si = sars2003$si, K = 40)
tail(epifit_sars$RLPS)
summary(epifit_sars)
plot(epifit_sars)


oswaldogressani/EpiLPS documentation built on Oct. 25, 2024, 8:15 p.m.