The exponential distribution
In anovir: Analysis of Virulence

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

library(anovir)

Introduction {#top}

The mathematical models describing the epidemiology of infectious diseases commonly assume mortality rates remain constant over time. This assumption is sometimes justified by observed data, particularly if they cover relatively short periods of time compared to the organism's expected survival. However assuming constant rate of mortality is more frequently a convenience, making models easier to manipulate and interpret.

The hazard function of the exponential distribution describes constant rates of mortality. This is a special case of the Weibull distribution and occurs when b = 1;

\begin{align}

h(t) &= \frac{1}{bt} \exp \left( \frac{\log t - a}{b} \right) \ \ &= \frac{1}{t} \exp \left( \log t - a \right) \ \ &= \frac{1}{t} \frac{\exp \left(\log t \right)}{\exp \left(a \right)} \ \ &= \exp \left( -a \right) \ \ \end{align} where the first line is the hazard function for the Weibull distribution with location and scale parameters a and b, respectively. When b = 1, the rate of mortality described by the hazard function is constant at, h(t) = exp(-a).

The nll_functions of this package can estimate constant rates of mortality as described by the exponential distribution. This is achieved by choosing the Weibull distribution to describe the pattern of mortality in question and constraining the scale parameter to equal one when specifying the value of variables to be sent for maximum likelihood estimation.

Example: background rate of mortality constant

The following example uses a subset of data from Sy et al [@Sy_2014]; the full dataset is freely available at [@Sy_2014_data]. The data record the survival of blood-fed adult female Aedes aegypti mosquitoes, either infected or not with the microsporidian parasite Vavraia culicis, over the first three weeks of the adult life. Uninfected adult Ae. aegypti females can survive much longer than this in laboratory conditions.

sy_time <- c(0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23)

sy_Suninf <- c(1,0.988,0.988,0.969,0.951,0.941,0.934,0.929,0.92,0.913,0.906,0.903,0.896,0.889,0.88,0.875,0.875,0.868,0.858,0.846,0.842,0.837,0.837,0.831)

sy_Sobsinf <- c(1,0.979,0.966,0.942,0.915,0.902,0.887,0.875,0.869,0.859,0.841,0.81,0.788,0.747,0.732,0.696,0.68,0.652,0.616,0.6,0.584,0.574,0.561,0.545)

sy_surv <- data.frame(sy_time, sy_Suninf, sy_Sobsinf)

   plot(sy_surv[, 1], sy_surv[, 2],
     type = 's', col = 'black', xlab = 'time (days)', ylab = 'Survival',
     xlim = c(0, 23), ylim = c(0, 1),
     main = 'Cumulative survival',
     axes = FALSE
     )
     lines(sy_surv[, 3], type = 's', col = 'red')
     axis(side = 1, seq(0, 28, by = 7))
     axis(side = 2, seq(0, 1.1, by = 0.2))
     legend("bottomleft", c("Uninfected", "Infected"), lty = c(1, 1), col = c(1, 2))

  sy_times <- c(1,3,4,5,6,7,8,9,10,11,12,13,14,15,17,18,19,20,21,23,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,7,21,22,23,7,14,21,22,23)
  sy_censor <- c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1)
  sy_inf <- c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,1,1,1,1,1)
  sy_fq <- c(5,8,8,4,3,2,4,3,3,1,3,3,4,2,3,4,5,2,2,1,7,4,8,9,4,5,4,2,3,6,10,7,13,5,11,5,9,11,5,5,3,4,2,3,4,205,143,6,7,4,103,66)

  data_sy <- data.frame(sy_times, sy_inf, sy_censor, sy_fq)

  data_sy$fq <- data_sy$sy_fq

In a first model, the Weibull distribution was chosen to describe both background mortality and mortality due to infection, d1 and d2, respectively,

    head(data_sy)

    m01_prep_function <- function(a1, b1, a2, b2){
      nll_basic(
        a1, b1, a2, b2,
        data = data_sy,
        time = sy_times,
        censor = sy_censor,
        infected_treatment = sy_inf,
        d1 = 'Weibull', d2 = 'Weibull')
        }

    m01 <- mle2(m01_prep_function,
             start = list(a1 = 3, b1 = 0.5, a2 = 3, b2 = 0.5)
             )

    summary(m01)

The estimate (± s.e.) for the scale parameter, b1, overlaps with 1.0, suggesting the exponential distribution was suitable for describing background mortality.

In a second model the scale parameter for background mortality b1 was constrained, or fixed, to b1 = 1.0 throughout the estimation process. Hence background mortality was estimated according to the exponential distribution.

    m02 <- mle2(m01_prep_function,
             start = list(a1 = 4.8, a2 = 3.6, b2 = 0.6),
             fixed = list(b1 = 1.0)
             )
    summary(m02)

  # compare models by AICc
     AICc(m01, m02, nobs = 753)

The log-likelihoods of the two models were very similar, indicating the data were not less well described when the exponential distribution was chosen to describe the background mortality. This was confirmed by the lower AICc of the second model, as fewer parameters were estimated for essentially the same log-likelihood indicating it was the better of the two models.