Confidence intervals


title: "Confidence intervals" output: rmarkdown::html_vignette bibliography: references.bib csl: proc_b.csl vignette: > %\VignetteIndexEntry{Confidence intervals} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} header-includes: \usepackage{amsmath} # devtools::install(build_vignettes = TRUE)


knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
library(anovir)

The negative log-likelihood (nll) functions in this package allow various survival models to be fit to observed survival data by maximum likelihood estimation. The actual variables estimated are the random variables defining the location and scale parameters of the survival models.

This vignette illustrates;

(i) how the estimated means, variances and covariances of these variables can be combined using the delta method to estimate the approximate variance and 95% confidence intervals of a hazard function, and

(ii) how a log-transformation of the hazard function can avoid the lower bounds of these confidence intervals from being less than zero.

Both approaches are used by the function anovir::conf_ints_virulence to estimate the confidence intervals of a hazard function describing a pathogen's virulence. This function, however, is limited to estimates arising from the default form of anovir::nll_basic where the location and scale parameters for background mortality and mortality due to infection are described by the variables, a1,b1 and a2,b2, respectively.

Three worked examples are provided where the number of random variables contributing to the parameters of the hazard function are 1, 2 or 3.

The second example estimates the 95% confidence intervals of a hazard function describing a pathogen's virulence. The two variables estimated are those defining the location and scale parameters for host mortality due to infection. This type of analysis is compatible with the function anovir::conf_ints_virulence and illustrates the sequence of steps taken by the function.

Finally, a worked example also shows how to calculate 95% confidence intervals on the natural scale for the results when they are calculated with nll_basic_logscale

The delta method

The delta method provides a means to estimate the approximate variance of a function when the function is a function of one or more random variables, and where there is an estimate for the variance of each random variable. For example if g is a function of f, where f is a function of the random variables, $X_1,X_2,\dots,X_n$, such that,

$$ g = f(X_1,X_2,\dots,X_n)$$ and estimates for the variance of each $X_i$ are available, then by the delta method the first order approximation for the variance of g is,

\begin{align}

\textrm{Var}\left(g\right) &\approx \textrm{Var}\left[f\left(X_1, X_2,\dots, X_n \right) \right] \ \ &= \sum_{i=1}^{n} \left( \frac{\partial f}{\partial X_i} \right)^2 \textrm{Var}\left( X_i \right) + 2 \sum_{i=1}^n \sum_{j=1}^{n} \left(\frac{\partial f}{\partial X_i}\right) \left(\frac{\partial f}{\partial X_j}\right) \textrm{Cov}\left(X_i, X_j \right)

\end{align}

where $\partial f / \partial X_i$ is the expression for the first partial derivative of $f(\cdot)$ with respect to $X_i$, while $\textrm{Var}(X_i)$ and $\textrm{Cov}(X_i,X_j)$ are terms for the variance and covariances of the random variables, respectively.

This first order approximation is equivalent to the function being described by a tangent line intersecting it at its mean. Higher order approximations are equivalent to describing the function around its mean with polynomial expressions to a higher degree. Such expressions are progressively more flexible and provide a better description of the function. However the added value of approximations of order two or above rapidly goes towards zero if the function is approximately linear about its mean. In such cases there is little benefit to going beyond a first order approximation.

Setting confidence interval bounds

The approximate nature of a first order approximation for the variance of a hazard function means it can yield lower estimates that go below zero. This situation can be remedied by a back-calculation of the confidence intervals estimated on a logscale.

Let L(t) be the log transformation of the hazard function h(t),

$$ L(t) = \log \left[h(t)\right] $$ Define the lower and upper bounds for $L\left(t\right)$, based on its estimator, $\hat{L}\left(t\right)$,

$$\left[\hat{L}(t)-A,\; \hat{L}(t)+A \right] $$ where A is the distance from the mean to the lower or upper bounds of the confidence interval. The back-transformation from the log-scale gives the lower and upperbounds as, $$\left[\exp\left(\hat{L}[t]-A\right),\; \exp\left( \hat{L}[t]+A \right) \right] $$ this equals,

$$\left[ \exp\left(\hat{L}\left[t\right]\right) \cdot \exp \left( -A \right),\; \exp\left( \hat{L}\left[t\right] \right) \cdot \exp\left( A \right) \right]$$

The back-transformation of $\hat{L}\left(t\right)$ is, $$ \hat{h}\left(t\right) = \exp \left(\hat{L}\left[t\right] \right)$$

which can be substituted into the bounds above to give,

$$\left[ \hat{h}\left(t\right) \cdot \exp \left(-A\right),\; \hat{h}\left(t\right) \cdot \exp\left( A\right) \right]$$ For a 95% confidence interval,

\begin{align}

A &= 1.96 \cdot \sqrt{\textrm{Var}\left[\hat{L}\left(t\right)\right]} \ &= 1.96 \cdot \sqrt{\textrm{Var}\left[\log\left(\hat{h}\left[t\right]\right)\right]}

\end{align}

Use the delta method to estimate $\textrm{Var}\left[\log\left(\hat{h}\left[t\right]\right)\right]$ as a function of $\hat{h}\left(t\right)$;

\begin{align}

X &= \hat{h}\left(t\right) \ Y &= \log \left(X\right) \ \textrm{Var}\left(Y\right) &\approx \textrm{Var}\left(\log\left[X\right]\right) \ \textrm{Var}\left(\log \left[X\right]\right)& = \left( \frac{\partial \log\left[X\right]}{\partial X}\right)^2 \cdot \textrm{Var}\left(X\right) \ \textrm{Var}\left(\log \left[\hat{h}\left(t\right)\right]\right)&= \left( \frac{1}{X}\right)^2 \cdot \textrm{Var}\left(X\right) \ & = \left( \frac{1}{\hat{h}\left(t\right)}\right)^2 \cdot \textrm{Var}\left(\hat{h}\left[t\right]\right)

\end{align}

Consequently the A term above becomes;

\begin{align}

A &= 1.96 \cdot \sqrt{\left( \frac{1}{\hat{h}\left(t\right)} \right)^2 \cdot \textrm{Var}\left(\hat{h}\left[t\right]\right)} \ &= \frac{1.96 \cdot \sqrt{\textrm{Var}\left[\hat{h}\left(t\right)\right]}}{\hat{h}\left(t\right)}

\end{align}

and the 95% confidence intervals,

$$\hat{h}\left(t\right) \cdot \exp \left( \frac{\pm 1.96 \cdot\sqrt{\textrm{Var}\left[\hat{h}\left(t\right)\right]}}{\hat{h}\left(t\right)} \right) $$


Worked example: 1 random variable

This example illustrates the estimation of the 95% confidence intervals of a hazard function based on a single random variable.

Sy et al. [@Sy_2014] recorded the survival of caged adult female Aedes aegypti mosquitoes over a three week period. The functions anovir::nll_basic and bbmle::mle2 were used to estimate patterns of mosquito mortality based on survival functions for the Weibull distribution.

The variables estimated were a1, b1, a2, b2 which defined the location and scale parameters for background mortality and mortality due to infection, respectively.

An initial analysis suggested the scale parameter for background mortality (b1) was close to one, such that, the background rate of mortality could be described by the exponential distribution. In this case the hazard function for background mortality at time t, $\textit{h1}\left(t\right)$ is,

$$\textit{h1} \left(t\right) = \frac{1}{\exp \left(\textit{a1}\right)}$$ where a1 is the location parameter and a random variable to be estimated.

In this case the approximate variance of the hazard function by the delta method is,

\begin{align} \textrm{Var}\left[\textit{h1}\left(t\right)\right] &\approx \left[ \frac{\partial \textit{h1}\left(t\right)}{\partial \textit{a1}}\right]^2 \textrm{Var}\left(\textit{a1}\right) \ \ &= \left[ \left( \frac{-1}{\exp \left[ \mu_\textit{a1} \right]} \right)^2 \right] \sigma^2_\textit{a1} \end{align}

where $\mu_{\textit{a1}}$ and $\sigma^2_{\textit{a1}}$ are the mean and variance of a1, respectively.

A revised model m02 set, b1 = 1.0, and gave the following estimates for parameters; a1, a2 and b2

  sy_times <- c(1,3,4,5,6,7,8,9,10,11,12,13,14,15,17,18,19,20,21,23,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,7,21,22,23,7,14,21,22,23)
  sy_censor <- c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1)
  sy_inf <- c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,1,1,1,1,1)
  sy_fq <- c(5,8,8,4,3,2,4,3,3,1,3,3,4,2,3,4,5,2,2,1,7,4,8,9,4,5,4,2,3,6,10,7,13,5,11,5,9,11,5,5,3,4,2,3,4,205,143,6,7,4,103,66)

  data_sy <- data.frame(sy_times, sy_inf, sy_censor, sy_fq)

  data_sy$fq <- data_sy$sy_fq
    m01_prep_function <- function(a1, b1, a2, b2){
      nll_basic(
        a1, b1, a2, b2,
        data = data_sy,
        time = sy_times,
        censor = sy_censor,
        infected_treatment = sy_inf,
        d1 = 'Weibull', d2 = 'Weibull')
        }
    m02 <- mle2(m01_prep_function,
             start = list(a1 = 4.8, a2 = 3.6, b2 = 0.6),
             fixed = list(b1 = 1.0)
             )
  coef(m02)

and the variance-covariance matrix

  vcov(m02)

The lower and upper bounds of the 95% confidence intervals for the hazard function can be estimated as follows;

(i) based on the delta method

  mu_a1 <- coef(m02)[[1]] ; mu_a1
  var_a1 <- vcov(m02)[1,1] ; var_a1

  h1t <- 1/(exp(mu_a1)) ; h1t
  var_h1t <- (-1/exp(mu_a1))^2 * var_a1 ; var_h1t

  lower_ci <- h1t - 1.96*sqrt(var_h1t)
  upper_ci <- h1t + 1.96*sqrt(var_h1t)
  lower_ci ; h1t ; upper_ci

(ii) based on delta method and a log transformation of the hazard function

  lower_ci2 <- h1t*exp(-1.96*sqrt(var_h1t)/h1t)
  upper_ci2 <- h1t*exp(+1.96*sqrt(var_h1t)/h1t)

  lower_ci2 ; h1t ; upper_ci2

Worked example: 2 random variables

In this example the confidence intervals for virulence are estimated for a Fréchet hazard function based on three random variables.

An analysis of a subset of the data from Blanford et al [@Blanford_2012] found the virulence of three different isolates of the fungal pathogen \textit{Ma} were best described by hazard functions for the Fréchet distribution. Here the 95% confidence intervals for the virulence of strain Ma07 are estimated.

Part I: estimate parameter variables

library(anovir)

data01 <- data_blanford 

data01 <- subset(data01,
    (data01$block == 5) & (
      (data01$treatment == 'cont') |
      (data01$treatment == 'Ma07') 
    ) &
    (data01$day > 0)
    )

# create column 'g' as index of infected treatment
  data01$g <- data01$inf

  m01_prep_function <- function(a1, b1, a2, b2){
    nll_basic(a1, b1, a2, b2, data = data01, time = t, censor = censor,
              infected_treatment = g, d1 = 'Weibull', d2 = 'Fréchet')
    }

  m01 <- mle2(m01_prep_function,
              start = list(a1 = 2, b1 = 0.5, a2 = 3, b2 = 1)
              )

  summary(m01)

Part II: calculate confidence intervals

# view estimates
  coef(m01) ; vcov(m01)

# assign means, variances & covariances 
# means
  a2 <- m01@coef[[3]]
  b2 <- m01@coef[[4]]
# variances
  var_a2 <- m01@vcov[3,3]
  var_b2 <- m01@vcov[4,4]
# covariances
  cov_a2_b2 <- m01@vcov[3,4]

# specify timespan to calculate
  t <- 1:14

# write an 'expression' for the Fréchet hazard function
  # in terms of 'a2', 'b2' 
  hazard_expression <- expression(
    (1 / (b2 * t)) * exp(-((log(t) - a2) / b2) - exp(-((log(t) - a2) / b2))) / (1 - exp(-exp(-((log(t) - a2) / b2))))
  )

# prepare expression for 'stats::deriv` 
  # NB needs names of variables for partial differentiation 
  dh2_dt <- stats::deriv(hazard_expression, c('a2', 'b2'))

  dh2_dt

# evaluate (calculate) hazard function 'h2'
  h2 <- eval(dh2_dt)

# collect estimate of hazard function 'h2' 
# and 'gradients' of partial derivatives
  dh2 <- attr(h2, 'gradient')

# rename columns for clarity
  colnames(dh2) <- c('dh2_da2', 'dh2_db2')

# collect time, estimates of hazard function at time 't'
# and partial derivatives 
  dh2 <- cbind(t, h2, dh2)  

# calculate variance of hazard function
# using delta function  
  var_h2 <- 
    dh2[,'dh2_da2']^2 * var_a2 +
    dh2[,'dh2_db2']^2 * var_b2 +
    2 * dh2[,'dh2_da2'] * dh2[,'dh2_db2'] * cov_a2_b2

# calculate standard deviation of estimate
  sd_h2 <- sqrt(var_h2)

# bind to other estimates
  dh2 <- cbind(dh2, sd_h2)

# calculate lower & upper bounds of 95% confidence interval
  lower_ci <- dh2[,'h2'] * exp(-1.96 * dh2[,'sd_h2'] / dh2[,'h2'])
  upper_ci <- dh2[,'h2'] * exp( 1.96 * dh2[,'sd_h2'] / dh2[,'h2'])

# bind together
  dh2_ma07 <- cbind(dh2, lower_ci, upper_ci)
# plot data
  plot(dh2_ma07[,'t'], dh2_ma07[,'h2'],
    type = 'l', col = 'red', lty = 'solid',
    xlim = c(0, 14), ylim = c(0, 0.6),
    xlab = 'Time(days)', ylab = 'h(t) ±95% c.i.',
    main = 'Virulence: Ma07',
    axes = FALSE
    )
  axis(side = 1, seq(0, 14, by = 7))
  axis(side = 2, seq(0, 0.6, by = 0.1))
  lines(dh2_ma07[,'t'], dh2_ma07[,'lower_ci'],
    type = 'l', col = 'grey', lty = 'solid'
    )
  lines(dh2_ma07[,'t'], dh2_ma07[,'upper_ci'],
    type = 'l', col = 'grey', lty = 'solid'
    )

Part III: compare results with those of conf_ints_virulence

The calculations made in Part II of this example could have been done by anovir::conf_ints_virulence using the output of bbmle::mle2 for the location and scale parameters, their variance and covariance.

# values for a2, b2, var_a2, var_b2, cov_a2b2 were assigned 
# from results of 'bbmle::mle2' above
ls()

# tail of calculations above
tail(dh2_ma07)

# specify mle2object 'm01', Frechet distribution & maximum time
ci_matrix01 <- conf_ints_virulence(
  a2 = a2,
  b2 = b2,
  var_a2 = var_a2,
  var_b2 = var_b2,
  cov_a2b2 = cov_a2_b2,
  d2 = "Frechet", 
  tmax = 14)  

# tail of calculations from 'conf_ints_virulence'
tail(ci_matrix01)

# are the results the same? 
identical(dh2_ma07, ci_matrix01)

Worked example: 3 random variables

In this example the confidence intervals for virulence are estimated for a Weibull hazard function based on three random variables.

Analyses of the data from the study by Lorenz & Koella [@Lorenz_2011; @Lorenz_data_2011] suggested background mortality could be described by the Gumbel distribution and mortality due to infection by the Weibull distribution.

The location parameter describing mortality due to infection a2 was made a linear function of log[dose], requiring two random variables to estimated for its intercept and regression coefficient.

Part I: estimate parameter variables

library(anovir)

# get & rename data
  data01 <- data_lorenz 
  head(data01)

# create new version of function 'nll_basic'
  nll_basic2 <- nll_basic

# check location/definition of parameter function a2 ('pfa2')
  body(nll_basic2)[[4]]

# replace 'a2' making 'pfa2' a linear function of log(dose)
  body(nll_basic2)[[4]] <- substitute(pfa2 <- a2i + a2ii * log(data01$Infectious.dose))

# check new version of 'pfa2'
  body(nll_basic2)[[4]]

# update formals, including names for time, censor, etc..
  formals(nll_basic2) <- alist(a1 = a1, b1 = b1, a2i = a2i, a2ii = a2ii, b2 = b2,
    data = data01, time = t, censor = censored, infected_treatment = g,
    d1 = 'Gumbel', d2 = 'Weibull')

# prepare 'prep_function' for analysis
  m01_prep_function <- function(a1, b1, a2i, a2ii, b2){
    nll_basic2(a1, b1, a2i, a2ii, b2)
  }

# send to 'bbmle::mle2' with starting values for variables
  m01 <- mle2(m01_prep_function,
      start = list(a1 = 23, b1 = 4, a2i = 4, a2ii = -0.1, b2 = 0.2)
      )

# summary of results
  summary(m01)

Part II: calculate confidence intervals

# collect & assign results in mleobject 'm01'
  coef(m01)
  vcov(m01)

# means
           a2i <- m01@coef[[3]]
          a2ii <- m01@coef[[4]]
            b2 <- m01@coef[[5]]
# variances
       var_a2i <- m01@vcov[3,3]
      var_a2ii <- m01@vcov[4,4]
        var_b2 <- m01@vcov[5,5]
# covariances
  cov_a2i_a2ii <- m01@vcov[3,4]
    cov_a2i_b2 <- m01@vcov[3,5]
   cov_a2ii_b2 <- m01@vcov[4,5]

# specify timespan to calculate
  t <- 1:28

# specify dose treatment to calculate
  dose <- 5000

# write an 'expression' for hazard function
  hazard_expression <- expression(
    1 / (b2 * t) * exp(((log(t) - (a2i + a2ii * log(dose)))) / b2)
    )

# prepare expression for 'stats::deriv` 
# has names of variables for partial differentiation 
  dh2_dt <- stats::deriv(hazard_expression, c('a2i', 'a2ii', 'b2'))

# evaluate (calculate) hazard function 'h2'
  h2 <- eval(dh2_dt)

# collect estimate of hazard function 'h2' 
# and 'gradients' of partial derivatives
  dh2 <- attr(h2, 'gradient')

# rename columns for clarity
  colnames(dh2) <- c('dh2_da2i', 'dh2_da2ii', 'dh2_db2')

# collect time, estimates of hazard function at time 't'
# and partial derivatives 
  dh2 <- cbind(t, h2, dh2)

# calculate variance of hazard function
# using delta function  
  var_h2 <- 
    dh2[,'dh2_da2i']^2 * var_a2i +
    dh2[,'dh2_da2ii']^2 * var_a2ii +
    dh2[,'dh2_db2']^2 * var_b2 +
    2 * dh2[,'dh2_da2i'] * dh2[,'dh2_da2ii'] * cov_a2i_a2ii +
    2 * dh2[,'dh2_da2i'] * dh2[,'dh2_db2'] * cov_a2i_b2 +
    2 * dh2[,'dh2_da2ii'] * dh2[,'dh2_db2'] * cov_a2ii_b2 

# calculate standard deviation of estimate
  sd_h2 <- sqrt(var_h2)

# bind to other estimates
  dh2 <- cbind(dh2, sd_h2)

# calculate lower & upper bounds of 95% confidence interval
  lower_ci <- dh2[,'h2'] * exp(-1.96 * dh2[,'sd_h2'] / dh2[,'h2'])
  upper_ci <- dh2[,'h2'] * exp( 1.96 * dh2[,'sd_h2'] / dh2[,'h2'])

# bind together
  dh2_5000 <- cbind(dh2, lower_ci, upper_ci)

Repeat calculations for dose = 160000; not shown.

Plot results;

# specify dose treatment to calculate
  dose <- 160000

# write an 'expression' for hazard function
  hazard_expression <- expression(
    1 / (b2 * t) * exp(((log(t) - (a2i + a2ii * log(dose)))) / b2)
    )

# prepare expression for 'stats::deriv` 
# has names of variables for partial differentiation 
  dh2_dt <- stats::deriv(hazard_expression, c('a2i', 'a2ii', 'b2'))

# evaluate (calculate) hazard function 'h2'
  h2 <- eval(dh2_dt)

# collect estimate of hazard function 'h2' 
# and 'gradients' of partial derivatives
  dh2 <- attr(h2, 'gradient')

# rename columns for clarity
  colnames(dh2) <- c('dh2_da2i', 'dh2_da2ii', 'dh2_db2')

# collect time, estimates of hazard function at time 't'
# and partial derivatives 
  dh2 <- cbind(t, h2, dh2)

# calculate variance of hazard function
# using delta function  
  var_h2 <- 
    var_a2i * dh2[,'dh2_da2i']^2 +
    var_a2ii * dh2[,'dh2_da2ii']^2 +
    var_b2 * dh2[,'dh2_db2']^2 +
    2 * cov_a2i_a2ii * dh2[,'dh2_da2i'] * dh2[,'dh2_da2ii'] +
    2 * cov_a2i_b2 * dh2[,'dh2_da2i'] * dh2[,'dh2_db2'] +
    2 * cov_a2ii_b2 * dh2[,'dh2_da2ii'] * dh2[,'dh2_db2']

# calculate standard deviation of estimate
  sd_h2 <- sqrt(var_h2)

# bind to other estimates
  dh2 <- cbind(dh2, sd_h2)

# calculate lower & upper bounds of 95% confidence interval
  lower_ci <- dh2[,'h2'] * exp(-1.96 * dh2[,'sd_h2'] / dh2[,'h2'])
  upper_ci <- dh2[,'h2'] * exp( 1.96 * dh2[,'sd_h2'] / dh2[,'h2'])

# bind together
  dh2_160000 <- cbind(dh2, lower_ci, upper_ci)
# plot data
  plot(dh2_5000[,'t'], dh2_5000[,'h2'],
    type = 'l', col = 'red', lty = 'solid',
    xlim = c(0, 28), ylim = c(0, 1.2),
    xlab = 'Time(days)', ylab = 'h(t) ±95% c.i.',
    main = 'Virulence: dose 5000',
    axes = FALSE
    )
  axis(side = 1, seq(0, 28, by = 7))
  axis(side = 2, seq(0, 1.2, by = 0.2))
  lines(dh2_5000[,'t'], dh2_5000[,'lower_ci'],
    type = 'l', col = 'grey', lty = 'solid'
    )
  lines(dh2_5000[,'t'], dh2_5000[,'upper_ci'],
    type = 'l', col = 'grey', lty = 'solid'
    )

# plot data
  plot(dh2_160000[,'t'], dh2_160000[,'h2'],
    type = 'l', col = 'red', lty = 'solid',
    xlim = c(0, 28), ylim = c(0, 1.2),
    xlab = 'Time(days)', ylab = 'h(t) ±95% c.i.',
    main = 'Virulence: dose 160000',
    axes = FALSE
    )
  axis(side = 1, seq(0, 28, by = 7))
  axis(side = 2, seq(0, 1.2, by = 0.2))
  lines(dh2_160000[,'t'], dh2_160000[,'lower_ci'],
    type = 'l', col = 'grey', lty = 'solid'
    )
  lines(dh2_160000[,'t'], dh2_160000[,'upper_ci'],
    type = 'l', col = 'grey', lty = 'solid'
    )

Confidence intervals back-calculated from nll_basic_logscale

The nll_basic_logscale function assumes the values of the input variables are on a log-scale. Within the function these values are exponentiated before being sent to the likelihood function. The use of conf_ints_virulence will not provide the correct confidence intervals with the output from nll_basic_logscale.

The following code returns the same confidence intervals as that estimated for the same data without a log-transformation.

Part I: estimate parameter variables

library(anovir)

  data01 <- subset(data_blanford,
    (data_blanford$block == 3) &
      ((data_blanford$treatment == 'cont') |
      (data_blanford$treatment == 'Bb06')) &
    (data_blanford$day > 0)
    )

    l01_prep_function_log <- function(a1 = a1, b1 = b1, a2 = a2, b2 = b2){
      nll_basic_logscale(
        a1 = a1, b1 = b1, a2 = a2, b2 = b2,
        data = data01,
        time = t,
        censor = censor,
        infected_treatment = inf,
        d1 = 'Weibull', d2 = 'Weibull')
        }

    l01 <- mle2(l01_prep_function_log,
             start = list(a1 = 1, b1 = 1, a2 = 1, b2 = 1)
             )

    summary(l01)

Part II: calculate confidence intervals

# collect and assign results
  coef(l01)
  vcov(l01)

        a2 <- coef(l01)[[3]]
        b2 <- coef(l01)[[4]]

    var_a2 <- vcov(l01)[3,3]
    var_b2 <- vcov(l01)[4,4]

  cov_a2_b2 <- vcov(l01)[3,4]

# set timescale for calculations
  t <- 1:15

# NB need to exponentiate the terms with 'a2' and 'b2'
# in hazard expression
  hazard_expression <- expression(
    (1/(exp(b2) * t)) * exp((log(t) - exp(a2)) / exp(b2))
  )

# remaining steps are the same as in Example 2 above
  dh2_dt <- stats::deriv(hazard_expression, c('a2', 'b2'))
  h2 <- eval(dh2_dt)
  dh2 <- attr(h2, 'gradient')
  colnames(dh2) <- c('dh2_da2', 'dh2_db2')
  dh2 <- cbind(t, h2, dh2) 
  var_h2 <- 
    dh2[,'dh2_da2']^2 * var_a2 +
    dh2[,'dh2_db2']^2 * var_b2 +
    2 * dh2[,'dh2_da2'] * dh2[,'dh2_db2'] * cov_a2_b2
  sd_h2 <- sqrt(var_h2)
  dh2 <- cbind(dh2, sd_h2)
  lower_ci <- dh2[,'h2'] * exp(-1.96 * dh2[,'sd_h2'] / dh2[,'h2'])
  upper_ci <- dh2[,'h2'] * exp( 1.96 * dh2[,'sd_h2'] / dh2[,'h2'])

  dh2_backlog <- cbind(dh2, lower_ci, upper_ci)
  dh2_backlog <- round(dh2_backlog,4)
  dh2_backlog

  # the confidence intervals are the same as those
  # estimated on the help page of
  # `conf_ints_virulence` for variables that are 
  # not assumed to be on a logscale

References



Try the anovir package in your browser

Any scripts or data that you put into this service are public.

anovir documentation built on Oct. 24, 2020, 9:08 a.m.