# rmd style knitr::opts_chunk$set( collapse = TRUE, comment = "#>", warning = FALSE, message = FALSE, fig.align = "center", fig.width = 6 ) options(tibble.print_min = 5, tibble.print_max = 5) # load packages library(hatchR) library(tibble) library(ggplot2) library(lubridate)
This vignette demonstrates a complete and simple hatchR workflow. We will use data installed with the package from Woody Island in Lake Iliamna, Alaska [@sparks2019], to predict the hatch and emergence timing of sockeye salmon (Oncorhynchus nerka).
If you have not already done so, see the vignettes on data checks and model parameterization.
First, we load needed packages:
library(hatchR) # for phenology modeling library(ggplot2) # for additional plotting options library(lubridate) # for date manipulation library(tibble) # for tidy tibbles and data manipulation library(dplyr) # for data manipulation
First, let's check the data to make sure it's in the right format. We'll print the first few rows and summary()
to get a sense of the data:
woody_island
summary(woody_island)
You can see that each day already is summarized as a mean temperature. You can also see that the data spans multiple years. We'll use plot_check_temp()
to make visually check the data:
p <- plot_check_temp( data = woody_island, dates = date, temperature = temp_c ) p
There appear to be outliers in the data, specifically in the very beginning and end of the period of record, and right in the middle. The loggers look to have been recording observations outside of the water and then in the middle where the water level may have dropped and exposed it to the air.
However, spawning in this system typically peaks around August 18 and hatching and emergence are done before the start of the following spawning season, so we can predict phenology within a subset of a year. Because plot_check_temp()
is a ggplot2 object, we can add additional plotting elements to the graph to represent the approximate phenology window:
p + geom_rect( aes( xmin = ymd("1990-08-18"), xmax = ymd("1991-04-01"), ymin = -10, ymax = 25 ), fill = "gray", alpha = 0.01 )
If we just apply the model within the gray polygon we've drawn, the model should work just fine.
Now that our temperature data is usable, we can select our models. We'll predict both hatch and emergence, so we will obtain a model expression for each using model_select()
. The only argument in model_select()
that must be changed in this case is development_type
, which can be either "hatch" or "emerge".
sockeye_hatch_mod <- model_select( author = "Beacham and Murray 1990", species = "sockeye", model = 2, development_type = "hatch" ) sockeye_emerge_mod <- model_select( author = "Beacham and Murray 1990", species = "sockeye", model = 2, development_type = "emerge" )
Let's look at the model specifications for the two different parameterizations we've selected:
sockeye_hatch_mod sockeye_emerge_mod
You can see they are parameterized slightly different to account for the differences between hatch and emergence timing.
We can now use our model expressions to predict when sockeye would hatch and emerge at Woody Island in 1990. First we predict hatch timing using predict_phenology()
:
WI_hatch <- predict_phenology( data = woody_island, dates = date, temperature = temp_c, spawn.date = "1990-08-18", model = sockeye_hatch_mod )
Note the warning message that appears when you run this code for
woody_island
. It reveals that while the fish developed, negative temperature values resulted inNaN
s after development. More on this in the [Negative Temperature]{#neg_temp} section.
Next, look inside the returned object to see days to hatch and development period (more on this in the Understanding your results section)
WI_hatch$days_to_develop WI_hatch$dev.period
We can also do the same with emergence:
WI_emerge <- predict_phenology( data = woody_island, dates = date, temperature = temp_c, spawn.date = "1990-08-18", model = sockeye_emerge_mod # notice we're using emerge model expression here ) WI_emerge$days_to_develop WI_emerge$dev.period
The output from predict_phenology()
includes a lot of information. Here is a summary of the elements in the output list.
summary(WI_hatch)
We can access each element using the $
operator:
WI_hatch$days_to_develop
outputs the predicted days to hatch or emerge.
WI_hatch$dev.period
is a 1x2 dataframe with the dates corresponding to when your fish's parent spawned (which you input with predict_phenology(spawn.date = ...)
) and the date when the fish is predicted to hatch or emerge.
WI_hatch$ef.tibble
is a n x 5 tibble (n = number of days to hatch or emerge) and the columns are a row index, the date, each day's temperature, effective value, and the cumulative sum of the effective values to that date. The ef.tibble
object is meant to serve as the basis for users to make custom figures for their data beyond the functionality we discuss below.
WI_hatch$model.specs
is a tibble showing the model specifications used to predict the phenology.
hatchR has a built in function, plot_phenology()
, that allows users to visualize their phenology results. The plot visualizes three specific components:
The function allows you to output various figures based on your interests, but defaults to a figure with all information and lots of labeling.
In the output of plot_phenology()
the cumulative effective values are scaled by the warmest temperature in ef.tibble
and the daily effective values are scaled by multiplying by 100 so everything is visibly congruent in the figure.
Let's look at the basic call, which gives you all the information:
plot_phenology(WI_hatch)
You can turn of labeling or plot specific values using the function arguments style
and labels
, for example (plots not rendered):
# shows a plot with just the ef cumulative sum values plot_phenology(WI_hatch, style = "ef_cumsum") # shows a plot with just the ef daily values plot_phenology(WI_hatch, style = "ef_daily") # turns off the labeling for a cleaner figure plot_phenology(WI_hatch, labels = FALSE)
Occasionally, temperature data sets will have a few negative values or values very close to 0. Negative numbers below a certain threshold will output "not a number" (NaN
) effective values because they are undefined in the model expression and will break the function. Similarly, even negative values above that threshold will produce very small effective values. Because these values are so small, we allow the model to accumulate them even though development below 0 is biologically unlikely. We assume your data set has been checked for these values and doesn't include long periods of freezing, however the model allows for the occasional dip below freezing because the effect is so negligible toward development over incubation.
A toy example of this phenomenon is shown below.
# vector of temps from -5 to 15 by 0.5 x <- seq(from = -5, to = 15, by = 0.5) x # get effective values for those temperatures # You can see the NaN warning that shows up in our past applications demo_ef_vals <- eval(parse(text = sockeye_hatch_mod$expression)) demo_ef_vals # bring together as a tibble demo <- tibble(x, demo_ef_vals) demo # plot (note NaNs are removed from figure) # rectangle added to highlight the approximate temperatures of interest demo |> ggplot(aes(x = x, y = demo_ef_vals)) + geom_rect( aes( ymin = 0, ymax = 0.005, xmin = -5, xmax = 2 ), fill = "dodgerblue", alpha = 0.25 ) + geom_point() + geom_line() + labs(x = "Temperature (C)", y = "Effective Value") + theme_classic()
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.