estimate_beta_SI: Estimate time-varying transmission rates (SI method)

View source: R/estimate_beta_SI.R

Description

estimate_beta_SI() applies the SI method (see References) to estimate the time-varying transmission rate β(t) from time series of reported incidence, births, and natural mortality, observed at equally spaced time points t_k = t_0 + kΔt (for k = 0, ..., n), where Δt denotes the observation interval. Users are advised to smooth the transmission rate time series returned by estimate_beta_SI(). This can be accomplished using stats::loess() with an appropriate choice of span.

Usage

estimate_beta_SI(
  df = data.frame(),
  par_list = list(),
  method = c("trapezoid", "both")
)

Arguments

df

A data frame with numeric columns:

t

Time. t[i] is equal to t_{i-1} = t_0 + (i-1)Δt in units Δt, so that t[i] - t[i-1] is equal to 1.

C

Reported incidence. C[i] is the number of cases reported between times t[i-1] and t[i].

B

Births. B[i] is the number of births between times t[i-1] and t[i].

mu

Natural mortality rate. mu[i] is the rate at time t[i] expressed per unit Δt and per capita.

B is optional if hatN0 and nu are defined in par_list, and mu is optional if mu is defined in par_list (see Mock vital data).

par_list

A list containing:

prep

[ prep ] Case reporting probability.

trep

[ trep ] Case reporting delay in units Δt.

S0

[ S0 ] Number of susceptibles at time t = t0.

I0

[ I0 ] Number of infecteds at time t = t0.

hatN0

[ Ñ0 ] Population size at time t = 0 years.

nu

[ νc ] Birth rate expressed per unit Δt and relative to Ñ0 (if modeled as constant).

mu

[ μc ] Natural mortality rate expressed per unit Δt and per capita (if modeled as constant).

tgen

[ tgen ] Mean generation interval of the disease of interest in units Δt.

hatN0 and nu are optional if B is defined in df, and mu is optional if mu is defined in df (see Mock vital data).

method

Character vector of length 2. method[1] must be one of "forward", "backward", and "trapezoid" (recommended), indicating the method used to numerically integrate the ODE for susceptibles and infecteds: forward Euler, backward Euler, or trapezoidal. method[2] must be one of "forward", "backward", and "both" (recommended), indicating the method used to numerically integrate the ODE for cumulative incidence: forward Euler, backward Euler, or both. If both, then the transmission rate estimate is the average of the two estimates calculated via forward and backward Euler.
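The three integration schemes named above can be compared on a toy problem. The following is only a minimal sketch in base R: it uses the linear ODE dx/dt = -a x, which is an assumption chosen for illustration and is not the model integrated by estimate_beta_SI(). The names a, h, n, and the step_* variables are hypothetical.

```r
# Toy problem (not the package's model): dx/dt = -a * x, x(0) = 1
a <- 1
h <- 0.1   # step size
n <- 10    # number of steps, so the final time is n * h = 1

# One-step update factors for each scheme applied to this linear ODE
step_forward   <- 1 - a * h                          # forward Euler
step_backward  <- 1 / (1 + a * h)                    # backward Euler
step_trapezoid <- (1 - a * h / 2) / (1 + a * h / 2)  # trapezoidal

# State after n steps under each scheme, next to the exact solution
x_end <- c(
  forward   = step_forward^n,
  backward  = step_backward^n,
  trapezoid = step_trapezoid^n,
  exact     = exp(-a * n * h)
)

# Absolute error of each scheme at the final time
abs_error <- abs(x_end - x_end[["exact"]])
abs_error
```

On this toy problem the trapezoidal rule has a smaller error than either Euler scheme, which is consistent with "trapezoid" being the recommended choice for method[1].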

Value

A data frame with numeric columns:

t

Time. Identical to df$t.

C

Reported incidence, imputed. Identical to df$C, except with missing values imputed (see Missing data).

Z

Incidence. Z[i] is the estimated number of infections between times t[i-1] and t[i].

B

Births, imputed. Identical to df$B (if supplied), except with missing values and zeros imputed (see Missing data).

mu

Natural mortality rate, imputed. Identical to df$mu (if supplied), except with missing values imputed (see Missing data).

S

Number of susceptibles. S[i] is the estimated number of susceptibles at time t[i].

I

Number of infecteds. I[i] is the estimated number of infecteds at time t[i].

beta

Transmission rate. beta[i] is the estimated transmission rate at time t[i] expressed per unit Δt per susceptible per infected.

The data frame carries par_list and method as attributes.

Mock vital data

If df$B is undefined in the function call, then df$B[i] is set to with(par_list, nu * hatN0 * 1) for all i. If df$mu is undefined in the function call, then df$mu[i] is set to with(par_list, mu) for all i.
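The fill-in rule described above can be sketched in base R as follows. This is an illustration only, not the package's internal code; the parameter values and the small data frame are hypothetical.

```r
# Hypothetical parameter values, chosen only for illustration
par_list <- list(hatN0 = 1e6, nu = 4e-4, mu = 4e-4)

# A data frame lacking the optional columns B and mu
df <- data.frame(t = 0:4, C = c(10, 12, 9, 15, 11))

# Constant births per observation interval (1 unit of Δt)
if (!"B" %in% names(df)) {
  df$B <- with(par_list, nu * hatN0 * 1)
}

# Constant per capita natural mortality rate
if (!"mu" %in% names(df)) {
  df$mu <- with(par_list, mu)
}

df
```

Each scalar is recycled to every row, so B and mu are constant over the whole time series.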

Missing data

Missing values in df[, c("C", "B", "mu")] are not tolerated by the SI method. They are imputed via linear interpolation between observed values. If there are no observations before the first missing value, then complete imputation is impossible. In this case, the SI method may fail: columns S, I, and beta in the output may be filled with NA.

Strings of zeros in df$C may introduce spurious zeros and large numeric elements in column beta of the output. To prevent these errors, zeros in df$C are imputed like missing values. If there are no nonzero observations before the first zero, then complete imputation is impossible, and the output should be scanned for outliers.
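The imputation described in this section can be sketched with stats::approx(). This is a minimal illustration under the assumption that plain linear interpolation over the observed points is used; the package's own implementation may differ, and the vector C below is hypothetical.

```r
# Hypothetical reported incidence with missing values and a run of zeros
C <- c(NA, 12, NA, NA, 18, 0, 0, 24, 20)
t <- seq_along(C) - 1

# Treat zeros like missing values, as described for df$C
C_obs <- replace(C, !is.na(C) & C == 0, NA)

# Linearly interpolate between observed values;
# rule = 1 leaves NA outside the range of the observed points
observed <- !is.na(C_obs)
C_imputed <- approx(
  x    = t[observed],
  y    = C_obs[observed],
  xout = t,
  rule = 1
)$y

C_imputed
```

The leading NA has no observation before it, so it cannot be interpolated and stays NA. This is the situation in which the section warns that complete imputation is impossible.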

References

deJonge MS, Jagan M, Krylova O, Earn DJD. Fast estimation of time-varying transmission rates for infectious diseases.

Examples

# Simulate a reported incidence time series using
# a seasonally forced transmission rate
par_list <- make_par_list(dt_weeks = 1, epsilon = 0.5, prep = 0.5)
df <- make_data(
  par_list = par_list,
  n = 20 * 365 / 7, # 20 years is ~1042 weeks
  with_dem_stoch = TRUE,
  seed = 5
)
head(df)

# Estimate incidence, susceptibles, infecteds,
# and the seasonally forced transmission rate
df_SI <- estimate_beta_SI(df, par_list)
head(df_SI)

# Fit a smooth loess curve to the transmission rate
# time series
loess_fit <- loess(
  formula   = beta ~ t,
  data      = df_SI,
  span      = 53 / nrow(df_SI),
  degree    = 2,
  na.action = "na.exclude"
)
df_SI$beta_loess <- predict(loess_fit)

# Inspect
df_SI$t_years <- df$t_years
plot(S ~ t_years, df, type = "l", ylim = c(43, 58) * 1e03)
lines(S ~ t_years, df_SI, col = "red")
plot(beta ~ t_years, df, type = "l", ylim = c(0.95, 1.25) * 1e-05)
lines(beta_loess ~ t_years, df_SI, col = "red")

davidearn/fastbeta documentation built on June 14, 2020, 3:11 p.m.