library(knitr)

# knitr options
knitr::opts_chunk$set(
  collapse = TRUE,
  warning = FALSE,
  message = FALSE,
  echo = TRUE,
  comment = "#>"
)
# load necessary libraries
library(telemetyr)
library(dplyr)
library(ggplot2)

Introduction

We would like to estimate survival between each reach (the boundaries of which are defined by RT sites). To do so, we will use a Cormack Jolly-Seber (CJS) model. CJS models were originally developed to estimate survival between sampling occasions (survival through time). They estimate both the probability of survival from one sampling occasion to the next, as well as the probability of detecting an individual during a sampling occasion, assuming it is alive. The final survival and detection parameters are confounded in CJS models, meaning they cannot be estimated separately (only jointly).

Ecologists who work on species with directed migrations (movement in a single direction), such as salmon, have adapted CJS models to estimate survival through space rather than time. Sampling occasions have been replaced with detection sites, fixed in space, and now a CJS model can estimate survival between two detection sites, while accounting for imperfect detection at those sites. The data and model form are identical to CJS models estimating survival through time, it is merely the interpretation of the results that have changed.

There are a number of software packages available to fit CJS models with. MARK is a stand-alone software that encompasses many possible ways to analyze data from marked individuals. RMark is a package to connect R with MARK. Within R, there are packages such as marked, Rcapture and FSA can be used to fit CJS and Jolly-Seber (JS) models. However, all of these packages may not perform as expected when confronted with data like telemetry data, especially the sort where detections at the last few sites may be zero because the animal did not survive to that point, the animal did survive but went undetected OR survived but the tag stopped transmitting before it reached that point, and so went undetected.

We have found a Bayesian framework to provide the most straightforward method of dealing with issues like these, and will allow the user to build more complicated models on top of the basic one presented here. That is not to say other approaches are wrong, but we will present an example using the Bayesian framework here.

Data

To fit a CJS model, we need capture histories in a wide format, meaning one row for each tag, and a column for each detection site, with a 1 or a 0 depending on whether that tag was detected (or not) at that site. There may be tags that were never detected after release, and if those tags were turned on upon release (i.e. they were batch 1 tags), they should be included in our dataset. Tags that turned on after a period of time (i.e. batch 2 or 3 tags) can be included, if they were detected at least once upon turning on. This is equivalent to adding new marked animals to a CJS model at a time period > 1.

For this vignette, we have used an example dataset of fish released at the lower Lemhi trap (site code LLRTP), and possibly detected at five downstream sites, r paste(site_metadata$site_code[-c(1, nrow(site_metadata))], sep = ', ') and r site_metadata$site_code[nrow(site_metadata)]. This example dataset is called ch_wide and is included in the telemetyr package. The frequency of capture histories in this dataset are shown in the table below. The associated metadata for these tags, including their release site, type of duty cycle, etc. is also included in the telemetyr package, and is called tag_releases.

ch_wide %>%
  group_by(LLRTP, LH, CA, TR, RR, NF, cap_hist) %>%
  summarise(freq = n()) %>%
  kable()

Cormack Jolly-Seber Model

Assumptions

The assumptions behind a CJS model include:

  1. Every marked animal present in the population at sampling period $i$ has the same probability $p_i$ of being captured or resighted.
  2. Every marked animal present in the population at sampling period $i$ has the same probability $\phi_i$ of survival until sampling period $i+1$.
  3. Marks are neither lost nor overlooked and are recorded correctly.
  4. Sampling periods are instantaneous (in reality they are very short periods) and recaptured animals are released immediately.
  5. All emigration from the sampled area is permanent.
  6. The fate of each animal with respect to capture and survival probability is independent of the fate of any other animal.

Shortcut

We have included a wrapper function called fit_bayes_cjs() in the telemetyr package that accomplishes several steps at once (and calls other functions along the way):

  1. Write the JAGS model (write_bayes_cjs)
  2. Prepare the data (prep_jags_cjs)
  3. Run the Markov chain Monte Carlo, MCMC, algorithm (run_jags_cjs)
  4. Summarise posterior samples (summarise_jags_cjs)

To skip over the details, and move right to the results, you can run

param_summ = fit_bayes_cjs(file_path = "CJS_model.txt",
                           cap_hist_wide = ch_wide,
                           tag_meta = tag_releases)

From here, you can skip right to the [Results] section.

Details

To step through fitting this Bayesian CJS model, first we specify the JAGS model. For this example, we will fit a model with different detection probabilities for each site, and different survival probabilities between each site. We will also calculate the cumulative survival up to each site. We write this model as a text file, as shown below. The telemetyr package has a function to write this model, called write_bayes_cjs, where the user must specify the file path to save this text file.

jags_model = function() {
  # PRIORS
  phi[1] <- 1
  p[1] <- 1
  for(j in 2:J) {
    phi[j] ~ dbeta(1,1) # survival probability between arrays
    p[j] ~ dbeta(1,1)   # detection probability at each array
  }

  # LIKELIHOOD - Here, p and phi are global
  for (i in 1:N) {
    # j = 1 is the release occasion - known alive; i.e., the mark event
    for (j in 2:J) {
      # survival process: must have been alive in j-1 to have non-zero pr(alive at j)
      z[i,j] ~ dbern(phi[j] * z[i,j-1]) # fish i in period j is a bernoulli trial

      # detection process: must have been alive in j to observe in j
      y[i,j] ~ dbern(p[j] * z[i,j]) # another bernoulli trial
    }
  }

  # DERIVED QUANTITIES
  # survivorship is probability of surviving from release to a detection occasion
  survship[1] <- 1 # the mark event; everybody survived to this point
  for (j in 2:J) { # the rest of the events
    survship[j] <- survship[j-1] * phi[j]
  }
}
write_bayes_cjs(file_path = 'CJS_model.txt')

Next, we must prepare our data for this model, which is done using the function prep_jags_cjs(), which prepares a named list of data to feed to JAGS.

jags_data = prep_jags_cjs(ch_wide,
                          tag_releases)

The model requires several pieces of data:

In this example, to determine $N$, we must know how many batch 1 tags were released (we'll use all of those), as well as how many batch 2 or 3 tags were released and eventually detected at least once. $J$ is set at r jags_data$J sites. Once we know which tags will be used in the model, we can construct $y$ by filtering ch_wide to only include those tags. In this example, that leaves us with r nrow(jags_data$y) tags to use.

To construct $z$, the matrix of when each tag was known to be alive, the first and last detection of each fish is extracted, and the sites in between are filled in with 1's, since the fish must have been alive there in order to reach the last site it was detected at. Elements in the matrix after the last detection point for each fish are marked with NA's, since it is unknown whether fish did not survive to those sites or did survive but went undetected there. JAGS will estimate those unknown states as part of the MCMC algorithm. $z$ can be constructed using the function known_alive() in the telemetyr package. The vector $f$ is the first column in each row of $y$ where a 1 appears, and can be constructed using the first_alive() function.

The next step is to run the MCMC algorithm. We use the rjags package to connect R to JAGS and extract samples from the posteriors of each parameter, and do this with the function run_jags_cjs().

cjs_post = run_jags_cjs(file_path = 'CJS_model.txt',
                        jags_data = jags_data)

There are several default inputs to run_jags_cjs(), including:

Run with these defaults, this provides in r prettyNum(2500 / 5 * 4, big.mark = ',') samples from the posterior for each detection probability (p), survival (phi) and cumulative survival (survship) parameter. We have found this to be sufficient for most situations, but particular datasets may need higher iterations and/or more thinning. The output of this function is an mcmc.list object.

Finally, we summarise the posterior samples with the function summarise_jags_cjs. This uses the package postpack to create summaries of each parameter, and there are several inputs the user can adjust depending on what they would like to extract. If you use the wrapper fit_bayes_cjs() function, the site names are added to this summary dataframe, but a user could do the same by hand, using code like that listed below.

param_summ = summarise_jags_cjs(cjs_post)

param_summ %<>%
  left_join(tibble(site = colnames(jags_data$y)) %>%
                mutate(site = factor(site, levels = site),
                       site_num = as.integer(site))) %>%
    select(param_grp, site_num,
           site,
           param,
           everything())

Results

We can extract summary statistics from the posteriors, and construct plots of detection probablities, survival probabilities and cumulative survival probabilities.

surv_p = param_summ %>%
  filter(param_grp == 'survship') %>%
  filter(site_num < max(site_num)) %>%
  ggplot(aes(x = site,
             y = mean)) +
  geom_errorbar(aes(ymin = `2.5%`,
                    ymax = `97.5%`),
                width = 0) +
  geom_point() +
  theme_classic() +
  labs(x = 'Site',
       y = 'cumulative Survival')

phi_p = param_summ %>%
  filter(param_grp == 'phi') %>%
  filter(site_num < max(site_num)) %>%
  ggplot(aes(x = site,
             y = mean)) +
  geom_errorbar(aes(ymin = `2.5%`,
                    ymax = `97.5%`),
                width = 0) +
  geom_point() +
  theme_classic() +
  labs(x = 'Site',
       y = 'Survival From Previous Site')

det_p = param_summ %>%
  filter(param_grp == 'p') %>%
  filter(site_num < max(site_num)) %>%
  ggplot(aes(x = site,
             y = mean)) +
  geom_errorbar(aes(ymin = `2.5%`,
                    ymax = `97.5%`),
                width = 0) +
  geom_point() +
  theme_classic() +
  labs(x = 'Site',
       y = 'Detection Probability')
det_p
phi_p
surv_p

Diagnostics

We can also look at some diagnostics to assess model convergence. These plots use the postpack package.

postpack::diag_plots(cjs_post, "phi",
                     layout = "4x2")
postpack::diag_plots(cjs_post, "survship",
                     layout = "4x2")
postpack::diag_plots(cjs_post, "^p[",
                     layout = "4x2")

Assumptions and Caveats

Going back to the [assumptions in a CJS model][Assumptions], some of these are worth considering when dealing with multiple batches of tags that turn on/off at different times. In the context of assumption 1, "marked animals" include fish whose tags are currently operating. Telemtry tags that turn off before the fish has died could be considered lost, and if the tags are separated from the fish for some reason that would also be considered a lost mark. How prevalent that situation may be could inform how willing a user may be to make the assumption that all marks are not lost.

There may be ways to incorporate other types of marks (e.g. PIT tags) that could be detected further downstream, but how the detection probablities are estimated with tags that may turn off (and therefore allow fish to pass a telemetry site alive but undetected) should be investigated.

Caveats

Some of the caveats to keep in mind with CJS models include:

END CJS VIGNETTE

Jolly good work! That deserves a gin and tonic.



mackerman44/telemetyr documentation built on Feb. 15, 2025, 1:08 a.m.