Using `cyclomort`: A tool for estimating seasonal mortality patterns

require(cyclomort)
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = ">",
  fig.width = 6,
  fig.height = 4
)

The cyclomort package presents a selection of tools for fitting, exploring, and visualizing uni- or multi-modal periodic hazard functions to right-censored mortality data. Its development was motivated by the fairly common observation in wildlife populations that the risk of mortality is higher in certain times of year than others. The central function, fit_cyclomort, returns estimates of the timing, duration, and intensity of one or more "mortality seasons". Other functions summarize, visualize, and perform some statistical tests comparing models. The underlying model and its implementation are described in detail in Gurarie et al. (in review), and users are encouraged to read the paper for a deeper understanding of the likelihood based estimation of the parameters. This vignette illustrates the basic use of the most important functions.

To load the package:

library(cyclomort)

Underlying wrapped Cauchy hazard model

The underlying hazard model is based on a mixture of a modified wrapped Cauchy-like functions. Its simplest iteration is wc(), parameterized in terms of a mean mu, a concentration parameter rho which ranges from 0 to 1, and a periodicity tau. Its mean value is 1. In the examples below, there is a single mortality season that peaks on day 100 on a period of 365 days (i.e. April 10 in a calendar year):

oldpar <- par(no.readonly =TRUE); par(bty = "l", mar = c(4,4,2,2))
curve(wc(x, mu = 100, rho = .7, tau = 365), xlim = c(0,365), n = 1e4, ylab = "hazard", xlab = "time")
curve(wc(x, mu = 100, rho = .5, tau = 365), add = TRUE, col = 2)
curve(wc(x, mu = 100, rho = .3, tau = 365), add = TRUE, col = 3)
legend("topright", col = 1:3, legend = c(.7,.5,.3), title = "rho", lty = 1)
par(oldpar)

The mixture version mwc() takes vectors of the means and concentration parameters, as well as a vector mean hazard values gammas. The sum of gammas is the overall mean hazard. Several examples (on a periodicity of 1, with an overall mean hazard 1) are illustrated below:

oldpar <- par(no.readonly =TRUE); par(bty = "l", mar = c(4,4,2,2), xpd = NA)
mus <-  c(.3,.5,.9)
rhos <-  c(.5,.9,.7)
gammas <- c(.6,.1,.3)
curve(mwc(x, mus = mus[1], rhos = rhos[1], gammas = 1, tau = 1), 
      xlim = c(0,1), ylab = "hazard", xlab = "time")
curve(mwc(x, mus = mus[1:2], rhos = rhos[1:2], gammas = gammas[1:2]/sum(gammas[1:2]), tau = 1), 
      add = TRUE, col = 2)
curve(mwc(x, mus = mus, rhos = rhos, gammas = gammas, tau = 1), 
      add = TRUE, col = 3)
par(oldpar)

Note that in the three season model above, the second peak is quite pronounced (i.e. has the highest instantaneous hazard), but it accounts for only 1/10 of the total hazard since it has such a short duration.

Additionally there are functions for the integrals of these functions : iwc() and imwc() respectively. It is important to remember that these functions are not probability distributions (or, respectively, cumulative probability distributions) but positive recurring functions with an infinite support, the long-term mean of which is the sum of gamma vector.

This set of functions are mainly internal and used for fitting, though they can be useful for visualizing the shape of an underlying hazard. In subsequent analyses and applications, the parameter rho is replaced with a "season duration" parameter, and an additional parameter of the mean overall hazard. In turn, gammas are converted to "weight" parameters for each component, with all weights from each component summing to 1.

Simulating periodic mortality processes

The simulate_cycloSurv() function generates time-to-event data from an underlying periodic hazard process. Note that this function is parameterized via the (spelled-out) parameters: period, meanhazard, peaks, durations and weights. By default, this function plots the hazard, cumulative mortality, survival curve, and a histogram of simulated mortalities:

set.seed(1976)
T.morts.sim <- simulate_cycloSurv(n = 300, period = 365, meanhazard = 1/365, 
                             peaks = c(100, 250), durations = c(25, 40), 
                             weights = c(0.6, 0.4), plotme = TRUE,
                             max.periods = 5)

The function returns an object of the cyclomort specific class cycloSurv, which is a superclass of the Surv data type used widely in survival analysis in R:

T.morts.sim[1:10]
class(T.morts.sim)

Fitting a periodic hazard process

The fit_cyclomort() function estimates the parameters of the periodic hazard process. Note that we have to specify the numbers of seasons to fit.

T.morts.fit <- fit_cyclomort(T.morts.sim, n.seasons = 2)

The returned object is of class cmfit, which has summary, print and plot methods:

print(T.morts.fit)

From this summary, we see that the true values are within the 95% confidence intervals of the estimates.

The default plot:

plot(T.morts.fit, breaks = 30)

By default, the function plots a histogram of the data as well. It can be useful to plot only the fit and the confidence interval, for example to compare the 2 season fit with a 1 season and null fit, while suppressing the histogram, for example:

oldpar <- par(no.readonly =TRUE); par(bty = "l", mar = c(4,4,2,2))
T.morts.1season.fit <- fit_cyclomort(T.morts.sim, n.seasons = 1)
T.morts.null.fit <- fit_cyclomort(T.morts.sim, n.seasons = 0)
plot(T.morts.fit, histogram = FALSE, monthlabs = TRUE)
plot(T.morts.1season.fit, histogram = FALSE, add = TRUE, hazcolor = "red")
plot(T.morts.null.fit, histogram = FALSE, add = TRUE, hazcolor = "blue")
par(oldpar)

The plot of the fitted objects includes confidence intervals around the predictions. These are obtained numerically by generating a hazard curve many (by default: 5000) times using parameter value sampled from the MLE estimates using the Hessian to generate a variance covariance matrix across the parameters. This procedure is performed with the predict method (predict.cmfit function), which returns a set of predictions for a given fit at a time or set of times with a specific parameter sets.

Users can also call predict.cmfit to estimate the expected time to death for any individual at any given time. Note, this procedure takes a very (in fact, embarrassingly) long time even with relatively few samples (e.g. 100). We are working on this!

timetoeventprediction <- predict(T.morts.fit, t = 1:365, type = "timetoevent", CI = TRUE, nreps = 100)
data(timetoeventprediction)

The figure below illustrates the extent to which the expected time to an event depends on when an individual "enters" the process.

oldpar <- par(no.readonly =TRUE); par(bty = "l", mar = c(4,4,2,2))
with(timetoeventprediction, {
  plot(t, fit, type = "l", ylab = "Time to event", ylim = range(CI), lwd = 2)
  lines(t, CI[1,])
  lines(t, CI[2,])
})
par(oldpar)

Identifying the number of mortality seasons within a system using select_seasons

An important aspect of survival modelling is the ability to definitively characterize specific traits of a data set or population. Identifying the probable number of mortality seasons in a system is a powerful and useful claim that one can make using the select_seasons function. This function calls fit_cyclomort for a number of different seasons (ranging from the null model of 0 seasons to the max.season parameter), identifying the best model using Akaike Information Criterion (AIC).

The select_seasons function returns a cmfitlist object which is equipped with unique print and summary functions. While cmfitlist objects are simply lists of cmfit objects (each element of the list being the cmfit object representing a different seasonal assumption), the unique summary.cmfitlist function allows for easier comparison of the different fits.

T.select <- select_seasons(T.morts.sim, max.season = 4)

As expected, the select_seasons function identifies the two season model as the one with the lowest AIC value. Note, the likelihood maximization algorithm had some difficulty obtaining the Hessian matrix for the overly complex 4 season model, but (in our experience) the inference for model selection is still generally stable.

Simple factor analyses

The factorfit_cyclomort function allows users to test whether or not a categorical variable has an impact on seasonal mortality patterns, using R's formula notation. Input data for this function can be structured as a data frame with a categorical variable and a Surv (or cycloSurv) object in columns. There is a simulated dataset in the package called seasonalsex that illustrates a hypothetical example where factorial analysis may be useful:

data(seasonalsex)
str(seasonalsex)

The implementation is straightforward:

Survival.factorfit <- factorfit_cyclomort(event ~ sex, data = seasonalsex, n.seasons = 1)
summary(Survival.factorfit)

The analyses reports a p-value based on a likelihood ratio test of a model that includes the factor against a null model that does not, as well as reporting AIC values. A plotting method provides a visual comparison of the fits.

plot(Survival.factorfit, ymax = 1.2)

Data

Aside from the simulated sex-difference process, the package includes two anonymized and randomized caribou mortality data sets. The code below replicates the analyses presented in Gurarie et al. (in review).

Northwest Territory woodland caribou

The first is from woodland caribou (Rangifer tarandus caribou) collected throughout the Northwest Territories in Canada, loaded and plotted as follows:

data(nwt_morts)
require(ggplot2); require(magrittr); require(plyr)
ggplot(nwt_morts %>% arrange(start) %>% mutate(id = factor(id, levels = id)),
aes(x = start, y = id, col = status)) +
  geom_errorbarh(aes(xmin = start, xmax = end))

Note these data have been randomized by year, such that the seasonal patterns are retained.

These data need to be converted to a cycloSurv data type, which inherits from the Surv class used in survival and other survival modeling packages:

nwt_surv = with(nwt_morts,
                create_cycloSurv(start, end, event = status == "Mort", 
                                 period = 365, timeunit = "days"))

Note, it is important to specify the length of the period and time-unit.

Once these data are converted to the right class, they can be analyzed as in the examples above. For example, the Northwest Territories woodland caribou show three distinct seasons of higher mortality:

nwt_fit <- fit_cyclomort(nwt_surv, n.seasons = 3)
plot(nwt_fit, breaks = 40)
summary(nwt_fit)

Western Arctic herd caribou

The second data set is from the western Arctic herd (WAH) of migratory tundra caribou in western Alaska, U.S.:

data(wah_morts)
ggplot(wah_morts %>% arrange(start),
aes(x = start, y = id, col = fate)) + 
  geom_errorbarh(aes(xmin = start, xmax = end))

The western Arctic herd, with two pronounced seasons of mortality, showed an increase in mortality in the 2017-2018 winter season, which we analyze by converting to a cycloSurv objects and censoring and trimming the data in September 2017 using two convenient helper functions, censor_cycloSurv and trim_cycloSurv:

wah <- with(wah_morts, create_cycloSurv(start = start, end = end, 
                                        event = fate == "dead", period = 365))

cutoff <- "2016-09-01"
wah_pre = censor_cycloSurv(wah, censor.time = cutoff)
wah_post = trim_cycloSurv(wah, trim.time = cutoff)

wah_fit_pre <- fit_cyclomort(wah_pre, n.seasons = 1)
wah_fit_post <- fit_cyclomort(wah_post, n.seasons = 1)
summary(wah_fit_pre)
summary(wah_fit_post)
plot(wah_fit_pre, hist= FALSE, ymax = 0.003, monthlabs = TRUE)
plot(wah_fit_post, hist= FALSE, add = TRUE, hazcolor = "red")
par(oldpar)

Note that this result differs somewhat from that in Gurarie et al. (2020) since the data provided in the package are a subset of the complete data set, and too sparse to fit a more appropriate two season model.

References

E. Gurarie, P. Thompson, A. Kelly, N. Larter, W. Fagan and K. Joly (2020) For Everything There is a Season: Estimating periodic hazard functions with the cyclomort R package. Methods in Ecology and Evolution. 11(1): 129-138. https://doi.org/10.1111/2041-210X.13305.



Try the cyclomort package in your browser

Any scripts or data that you put into this service are public.

cyclomort documentation built on Aug. 20, 2020, 5:06 p.m.