Predict fish phenology: basic"
In hatchR: Predict Fish Hatch and Emergence Timing

# rmd style
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  warning = FALSE,
  message = FALSE,
  fig.align = "center",
  fig.width = 6
)
options(tibble.print_min = 5, tibble.print_max = 5)

# load packages
library(hatchR)
library(tibble)
library(ggplot2)
library(lubridate)

Overview

This vignette demonstrates a complete and simple hatchR workflow. We will use data installed with the package from Woody Island in Lake Iliamna, Alaska [@sparks2019], to predict the hatch and emergence timing of sockeye salmon (Oncorhynchus nerka).

If you have not already done so, see the vignettes on data checks and model parameterization.

First, we load needed packages:

library(hatchR)    # for phenology modeling
library(ggplot2)   # for additional plotting options
library(lubridate) # for date manipulation
library(tibble)    # for tidy tibbles and data manipulation
library(dplyr)     # for data manipulation

Data checks

First, let's check the data to make sure it's in the right format. We'll print the first few rows and summary() to get a sense of the data:

woody_island
summary(woody_island)

You can see that each day already is summarized as a mean temperature. You can also see that the data spans multiple years. We'll use plot_check_temp() to make visually check the data:

p <- plot_check_temp(
  data = woody_island,
  dates = date,
  temperature = temp_c
  )
p

There appear to be outliers in the data, specifically in the very beginning and end of the period of record, and right in the middle. The loggers look to have been recording observations outside of the water and then in the middle where the water level may have dropped and exposed it to the air.

However, spawning in this system typically peaks around August 18 and hatching and emergence are done before the start of the following spawning season, so we can predict phenology within a subset of a year. Because plot_check_temp() is a ggplot2 object, we can add additional plotting elements to the graph to represent the approximate phenology window:

p +
  geom_rect(
    aes(
      xmin = ymd("1990-08-18"),
      xmax = ymd("1991-04-01"),
      ymin = -10, 
      ymax = 25
    ), 
    fill = "gray",
    alpha = 0.01
  )

If we just apply the model within the gray polygon we've drawn, the model should work just fine.

Model select

Now that our temperature data is usable, we can select our models. We'll predict both hatch and emergence, so we will obtain a model expression for each using model_select(). The only argument in model_select() that must be changed in this case is development_type, which can be either "hatch" or "emerge".

sockeye_hatch_mod <- model_select(
  author = "Beacham and Murray 1990",
  species = "sockeye",
  model = 2,
  development_type = "hatch"
)

sockeye_emerge_mod <- model_select(
  author = "Beacham and Murray 1990",
  species = "sockeye",
  model = 2,
  development_type = "emerge"
)

Let's look at the model specifications for the two different parameterizations we've selected:

sockeye_hatch_mod
sockeye_emerge_mod

You can see they are parameterized slightly different to account for the differences between hatch and emergence timing.

Predict phenology

We can now use our model expressions to predict when sockeye would hatch and emerge at Woody Island in 1990. First we predict hatch timing using predict_phenology():

WI_hatch <- predict_phenology(
  data = woody_island,
  dates = date,
  temperature = temp_c,
  spawn.date = "1990-08-18",
  model = sockeye_hatch_mod
)

Note the warning message that appears when you run this code for woody_island. It reveals that while the fish developed, negative temperature values resulted in NaNs after development. More on this in the [Negative Temperature]{#neg_temp} section.

Next, look inside the returned object to see days to hatch and development period (more on this in the Understanding your results section)

WI_hatch$days_to_develop
WI_hatch$dev.period

We can also do the same with emergence:

WI_emerge <- predict_phenology(
  data = woody_island,
  dates = date,
  temperature = temp_c,
  spawn.date = "1990-08-18",
  model = sockeye_emerge_mod # notice we're using emerge model expression here
)

WI_emerge$days_to_develop
WI_emerge$dev.period

Understanding your results {#results}

The output from predict_phenology() includes a lot of information. Here is a summary of the elements in the output list.

summary(WI_hatch)

We can access each element using the $ operator:

WI_hatch$days_to_develop outputs the predicted days to hatch or emerge.
WI_hatch$dev.period is a 1x2 dataframe with the dates corresponding to when your fish's parent spawned (which you input with predict_phenology(spawn.date = ...)) and the date when the fish is predicted to hatch or emerge.
WI_hatch$ef.tibble is a n x 5 tibble (n = number of days to hatch or emerge) and the columns are a row index, the date, each day's temperature, effective value, and the cumulative sum of the effective values to that date. The ef.tibble object is meant to serve as the basis for users to make custom figures for their data beyond the functionality we discuss below.
WI_hatch$model.specs is a tibble showing the model specifications used to predict the phenology.

Visualizing phenology

hatchR has a built in function, plot_phenology(), that allows users to visualize their phenology results. The plot visualizes three specific components:

the temperature regime over which you are predicting
the cumulative sum of effective values
the effective value for each day in your prediction span

The function allows you to output various figures based on your interests, but defaults to a figure with all information and lots of labeling.

In the output of plot_phenology() the cumulative effective values are scaled by the warmest temperature in ef.tibble and the daily effective values are scaled by multiplying by 100 so everything is visibly congruent in the figure.

Let's look at the basic call, which gives you all the information:

plot_phenology(WI_hatch)

You can turn of labeling or plot specific values using the function arguments style and labels, for example (plots not rendered):

 # shows a plot with just the ef cumulative sum values
plot_phenology(WI_hatch, style = "ef_cumsum")
# shows a plot with just the ef daily values
plot_phenology(WI_hatch, style = "ef_daily") 
# turns off the labeling for a cleaner figure
plot_phenology(WI_hatch, labels = FALSE)

A note about negative temperatures {#negtemps}

Occasionally, temperature data sets will have a few negative values or values very close to 0. Negative numbers below a certain threshold will output "not a number" (NaN) effective values because they are undefined in the model expression and will break the function. Similarly, even negative values above that threshold will produce very small effective values. Because these values are so small, we allow the model to accumulate them even though development below 0 is biologically unlikely. We assume your data set has been checked for these values and doesn't include long periods of freezing, however the model allows for the occasional dip below freezing because the effect is so negligible toward development over incubation.

A toy example of this phenomenon is shown below.

# vector of temps from -5 to 15 by 0.5
x <- seq(from = -5, to = 15, by = 0.5)
x

# get effective values for those temperatures
# You can see the NaN warning that shows up in our past applications
demo_ef_vals <- eval(parse(text = sockeye_hatch_mod$expression))
demo_ef_vals

# bring together as a tibble
demo <- tibble(x, demo_ef_vals)
demo

# plot (note NaNs are removed from figure)
# rectangle added to highlight the approximate temperatures of interest
demo |>
  ggplot(aes(x = x, y = demo_ef_vals)) +
  geom_rect(
    aes(
      ymin = 0, ymax = 0.005,
      xmin = -5, xmax = 2
    ),
    fill = "dodgerblue", alpha = 0.25
  ) +
  geom_point() +
  geom_line() +
  labs(x = "Temperature (C)", y = "Effective Value") +
  theme_classic()