Nothing
#' Estimate what proportion of cases cause a certain proportion of
#' transmission
#'
#' @description Calculates the proportion of cases that cause a certain
#' percentage of transmission.
#'
#' It is commonly estimated what proportion of cases cause 80% of transmission
#' (i.e. secondary cases).
#' This can be calculated using `proportion_transmission()` at varying values of
#' \eqn{R} and for different values of percentage transmission.
#'
#' There are two methods for calculating the proportion of transmission,
#' \eqn{p_{80}} (default) and \eqn{t_{20}}, see `method` argument or details for
#' more information.
#'
#' @details
#' Calculates the expected proportion of transmission from a given
#' proportion of infectious cases. There are two methods to calculate this with
#' distinct formulations, \eqn{p_{80}} and \eqn{t_{20}} these can be specified
#' by the `method` argument.
#'
#' `method = p_80` calculates relative transmission heterogeneity
#' from the offspring distribution of secondary cases, \eqn{Z}, where the upper
#' proportion of the distribution comprise \eqn{x%} of total number of cases
#' given R0 and k, where \eqn{x} is typically defined as 0.8 or 80%. e.g. 80%
#' of all transmissions are generated by the upper 20% of cases, or
#' `p_80 = 0.2`, per the 80/20 rule. In this formulation, changes in R can
#' have a significant effect on the estimate of \eqn{p_80} even when k is
#' constant. Importantly, this formulation **does not** allow for true
#' homogeneity when `k = Inf` i.e. \eqn{p_{80} = 0.8}.
#'
#' `method = t_20` calculates a similar ratio, instead in terms of
#' the theoretical individual reproductive number and infectiousness given
#' R0 and k. The individual reproductive number, 'v', is described in
#' Lloyd-Smith JO et al. (2005), "as a random variable representing the
#' expected number of secondary cases caused by a particular infected
#' individual. Values for v are drawn from a continuous gamma probability
#' distribution with population mean R0 and dispersion parameter k, which
#' encodes all variation in infectious histories of individuals, including
#' properties of the host and pathogen and environmental circumstances." The
#' value of k corresponds to the shape parameters of the gamma distribution
#' which encodes the variation in the gamma-poisson mixture aka the negative
#' binomial
#'
#' For `method = t_20`, we define the upper proportion of infectiousness,
#' which is typically 0.2 i.e. the upper 20% most infectious
#' cases, again per the 80/20 rule. e.g. the most infectious 20% of cases,
#' are expected to produce 80% of all infections, or `t_20 = 0.8`. Unlike
#' `method = p_80`, changes in R have no effect on the estimate
#' of t_80 when k is constant, but R is still required for the underlying
#' calculation. This formulation **does** allow for true homogeneity when
#' k = Inf i.e. t_20 = 0.2, or t_80 = 0.8.
#'
#' Multiple values of R and k can be supplied and a `<data.frame>` of
#' every combination of these will be returned.
#'
#' The numerical calculation for `method = p_80` uses random number generation
#' to simulate secondary contacts so the answers may minimally vary between
#' calls. The number of simulation replicates is fixed to `r NSIM`.
#'
#' @inheritParams proportion_cluster_size
#' @inheritParams probability_epidemic
#' @param percent_transmission A `number` of the percentage transmission
#' for which a proportion of cases has produced.
#' @param method A `character` string defining which method is used to calculate
#' the proportion of transmission. Options are `"p_80"` (default) or `"t_20"`.
#' See details for more information on each of these methods.
#' @param simulate A `logical` whether the calculation should be done
#' numerically (i.e. simulate secondary contacts) or analytically. Default is
#' `FALSE` which uses the analytical calculation.
#'
#' @return A `<data.frame>` with the value for the proportion of cases for a
#' given value of R and k.
#' @export
#'
#' @references
#'
#' The analytical calculation is from:
#'
#' Endo, A., Abbott, S., Kucharski, A. J., & Funk, S. (2020)
#' Estimating the overdispersion in COVID-19 transmission using outbreak
#' sizes outside China. Wellcome Open Research, 5.
#' \doi{10.12688/wellcomeopenres.15842.3}
#'
#' The \eqn{t_{20}} method follows the formula defined in section 2.2.5 of the
#' supplementary material for:
#'
#' Lloyd-Smith JO, Schreiber SJ, Kopp PE, Getz WM. Superspreading and the
#' effect of individual variation on disease emergence. Nature. 2005
#' Nov;438(7066):355–9. \doi{10.1038/nature04153}
#'
#' The original code for the \eqn{t_{20}} method is from ongoing work
#' originating from \url{https://github.com/dcadam/kt} and:
#'
#' Adam D, Gostic K, Tsang T, Wu P, Lim WW, Yeung A, et al.
#' Time-varying transmission heterogeneity of SARS and COVID-19 in Hong Kong.
#' 2022. \doi{10.21203/rs.3.rs-1407962/v1}
#'
#' @examples
#' # example of single values of R and k
#' percent_transmission <- 0.8 # 80% of transmission
#' R <- 2
#' k <- 0.5
#' proportion_transmission(
#' R = R,
#' k = k,
#' percent_transmission = percent_transmission
#' )
#'
#' # example with multiple values of k
#' k <- c(0.1, 0.2, 0.3, 0.4, 0.5, 1)
#' proportion_transmission(
#' R = R,
#' k = k,
#' percent_transmission = percent_transmission
#' )
#'
#' # example with vectors of R and k
#' R <- c(1, 2, 3)
#' proportion_transmission(
#' R = R,
#' k = k,
#' percent_transmission = percent_transmission
#' )
proportion_transmission <- function(R, k,
percent_transmission,
method = c("p_80", "t_20"),
simulate = FALSE,
...,
offspring_dist,
format_prop = TRUE) {
missing_params <- missing(R) && missing(k)
.check_input_params(
missing_params = missing_params,
missing_offspring_dist = missing(offspring_dist)
)
method <- match.arg(method)
# check inputs
chkDots(...)
if (missing_params) {
checkmate::assert_class(offspring_dist, classes = "epiparameter")
R <- get_epiparameter_param(epiparameter = offspring_dist, parameter = "R")
k <- get_epiparameter_param(epiparameter = offspring_dist, parameter = "k")
}
checkmate::assert_numeric(R, lower = 0, finite = TRUE)
checkmate::assert_numeric(k, lower = 0)
# checkmate lower is >= 0 but k must be > 0
stopifnot("k must be greater than zero." = k != 0)
# k must be finite for analytical p_80 and t_20 methods
if (any(is.infinite(k))) {
k[is.infinite(k)] <- FINITE_INF
message(
"Infinite values of k are being approximated by ",
FINITE_INF, " for calculations."
)
}
checkmate::assert_number(percent_transmission, lower = 0, upper = 1)
stopifnot(
"percent_transmission must be less than 1." = percent_transmission != 1,
"percent_transmission must be greater than 0." = percent_transmission != 0
)
checkmate::assert_logical(simulate, any.missing = FALSE, len = 1)
checkmate::assert_logical(format_prop, any.missing = FALSE, len = 1)
if (simulate && method == "t_20") {
stop(
"The simulate argument must be FALSE when method = t_20.",
call. = FALSE
)
}
df <- expand.grid(R = R, k = k, NA_real_)
colnames(df) <- c("R", "k", paste0("prop_", percent_transmission * 100))
for (i in seq_len(nrow(df))) {
if (simulate) {
prop <- .prop_transmission_numerical(
R = df[i, "R"],
k = df[i, "k"],
percent_transmission = percent_transmission
)
} else {
if (method == "p_80") {
prop <- .prop_transmission_analytical(
R = df[i, "R"],
k = df[i, "k"],
percent_transmission = percent_transmission
)
} else {
prop <- .prop_transmission_t20(
R = df[i, "R"],
k = df[i, "k"],
percent_transmission = percent_transmission
)
}
}
if (format_prop) {
prop <- paste0(signif(prop * 100, digits = 3), "%")
}
# df is ways i x 3 so insert value into col 3
df[i, 3] <- prop
}
return(df)
}
#' Analytical calculation of proportion of cases that cause \eqn{x} percent
#' of transmission
#'
#' @return A numeric
#' @keywords internal
#' @noRd
.prop_transmission_analytical <- function(R, k, percent_transmission) {
xm1 <- stats::qnbinom(1 - percent_transmission, k + 1, mu = R * (k + 1) / k)
remq <- 1 - percent_transmission -
stats::pnbinom(xm1 - 1, k + 1, mu = R * (k + 1) / k)
remx <- remq / stats::dnbinom(xm1, k + 1, mu = R * (k + 1) / k)
x <- xm1 + 1
out <- 1 - stats::pnbinom(x - 1, k, mu = R) -
stats::dnbinom(x, k, mu = R) * remx
return(out)
}
#' Numerical calculation of proportion of cases that cause \eqn{x} percent
#' of transmission
#'
#' @return A numeric
#' @keywords internal
#' @noRd
.prop_transmission_numerical <- function(R, k, percent_transmission) {
simulate_secondary <- stats::rnbinom(
n = NSIM,
mu = R,
size = k
)
# percentage of cases
percent_cases <- percent_transmission * sum(simulate_secondary)
# cumulative sum of simulated secondary cases
cumsum_secondary <- cumsum(sort(simulate_secondary, decreasing = TRUE))
# proportion causing case
out <- sum(cumsum_secondary <= percent_cases) / NSIM
return(out)
}
#' Calculates the expected proportion of transmission from the upper `prop`%
#' of most infectious cases
#'
#' @return A `numeric`
#' @keywords internal
#' @noRd
.prop_transmission_t20 <- function(R, k, percent_transmission) {
k <- k %gt% 1e7
percent_transmission <- percent_transmission %gt% 0.99
u <- solve_for_u(prop = percent_transmission, R = R, k = k)
integral_result <- stats::integrate(
function(u) u * fvx(u, R, k), lower = 0, upper = u
)
res <- integral_result$value / R
return(1 - res)
}
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.