In pbs-assess/hookCompetition: An R package for estimating relative abundance indices from longline data using a censored likelihood approach

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
knitr::opts_chunk$set(fig.width=12, fig.height=8)

library(hookCompetition)
library(gfiphc)
library(sf)
library(inlabru)
library(ggpolypath)
library(tidyverse)

colsc <- function(...) {
  ggplot2::scale_fill_gradientn(colours = rev(RColorBrewer::brewer.pal(11,"RdYlBu")),
                       limits = range(..., na.rm=TRUE))
}

How to use the hookCompetition package

This markdown document will outline the steps needed to use the hookCompetition package.

Step 1 - Setting parameters for generating the Rmarkdown document

# Set the number of cores to use to generate the vignette
n_cores <- 1
# Set the number of samples to use for bootstrapping for Monte Carlo approximations to
# posterior quantities. Should be set much higher for real applications!
n_samp <- 100

Step 2 - Loading the data.

The function read_Data_hookcomp() reads in longline catch count data collected by the IPHC. The arguments to the function are:

species_vec is a character vector of the one or more species desired for modelling (e.g. 'yelloweye rockfish')
at_PBS is a logical flag determining if the user has access to the gf data. If FALSE, yelloweye rockfish must be the specified species.
simple_species_vec is a character vector containing the desired shortened names of the species (e.g. 'yelloweye')
data_wd is a character vector stating the working directory where the rds files will be saved
min_dist is a numeric stating the minimum distance (km) required for a temporary IPHC station to be from a permanent IPHC station to be included. If 0, discard all temporary stations from analysis.
min_dist_to_boundary is a numeric stating the minimum distance from a survey boundary (km) required for an IPHC station to be included as data. If 0, then only include stations strictly within survey_boundaries
years_all_vec is a numeric vector stating the years all hooks were enumerated
years_20_vec is a numeric vector stating the years only 20 hooks were enumerated. Only data from years specified in the above two arguments are downloaded and included for later analysis.
survey_boundaries survey_boundaries is a sf polygons object containing user-specified survey boundaries for indices to be computed in. If NULL, then QCS, WCHG, WCVI, HS is used. The data.frame must contain a factor variable Region with levels equal to the names of the regions
preserve_inter_regional_differences Logical. If TRUE the relative magnitudes in catch rates between the subregions are preserved. If FALSE the catch rates are normalized separately within each subregion. Use the latter if differences in absolute abundances between regions are not desired and only temporal trends within each subregion are desired.

data <- hookCompetition::read_Data_hookcomp(
  species_vec = c("yelloweye rockfish",
            "arrowtooth flounder",
            "lingcod",
            "north pacific spiny dogfish",
            "pacific cod",
            "pacific halibut",
            "redbanded rockfish",
            "sablefish",
            "walleye pollock"),
  simple_species_vec = c("yelloweye",
            "arrowtooth",
            "lingcod",
            "dogfish",
            "cod",
            "halibut",
            "redbanded",
            "sablefish",
            "walleye"),
  at_PBS = F,
  data_wd = '../',
  min_dist = 50,
  min_dist_to_boundary = 0,
  years_all_vec = c(1995,1996,2003:2012, 2014:2019),
  years_20_vec = c(1997:2002, 2013, 2020, 2021),
  survey_boundaries=NULL
)

The function read_Data_hookcomp() returns a list containing two entries: - Reduced_data_sp is a sf points containing all the set-level catch counts of the species specified (variable name N_it_yelloweye). - survey_boundaries are the survey boundaries projected into the same CRS as the Reduced_data_sp object.

To put the objects in the list into the current working environment, run the following:

list2env(data, globalenv())
rm(data)

Note: to use the package on custom data, ensure the following variable names are used inside the sf points object containing the catch counts:

N_it_species (where species is the name of the species used (e.g. yelloweye)). This contains the catch counts of the species
effSkateIPHC contains a measure of effort (e.g. number of hooks used etc.,)
prop_removed contains the proportion of baits removed from the fishing event
year the year of the fishing event
region_INLA the region where the fishing event took place. An index (numeric variable) matching the corresponding row in survey_boundaries
station containing the index of the fishing station where the fishing event took place. Useful if the same locations are visited each year (for multiple years)
twentyhooks a bindary variable indicating years where a different number of hooks were used. Useful if 2 fishing protocols are used and differences want to be controlled for. Set to 0 if not relevant.

Step 3 - Fit the indices without the spatio-temporal random effects.

We derive bootstrap indices for a set of test species using the function bootstrap_index_fun_run. We derive an overall index and a subregion index (with subregions defined by the survey boundaries).

The function bootstrap_index_fun_run takes as arguments:

data a sf points object containing the IPHC data (typically the Reduced_data_sp object from read_Data_hookcomp)
species a character of the species (typically matching simple_species_vec from read_Data_hookcomp() linked to the variables N_it in the dataframe)
survey_boundaries a sf polygons object containing the survey boundary definitions (typically outputted from read_Data_hookcomp())
R specifies the number of bootstrap samples desired
ncpus specifies the number of cores to use in parallel
type specifies the type of bootstrap intervals to use (default percentile perc)
return when TRUE it returns the indices
plot when TRUE it returns the plot (can be memory costly when using large numbers of species).

bootstrap_ind <-
bootstrap_index_fun_run(species = 'yelloweye',
                          data = Reduced_data_sp,
                          survey_boundaries = survey_boundaries,
                          R=n_samp,
                          return=T,
                          ncpus=n_cores,
                          type='perc',
                          plot=T
                          )

The function bootstrap_index_fun_run returns a list containing 3 objects:

species_boot A data.frame containing the relative abundance indices by region
ind_plot A ggplot2 object containing the relatie abundance plot
preserve_inter_regional_differences A logical stating if the relative abundance indices can be compared between regions (if TRUE).

We derive hook-competition-adjusted bootstrap indices using the same function bootstrap_index_fun_run(), but adding the argument ICR_adjust=TRUE. We derive an overall index and a subregion index (with subregions defined by the survey boundaries).

bootstrap_ind_compfactor <-
bootstrap_index_fun_run(species = 'yelloweye',
                          data = Reduced_data_sp,
                          ICR_adjust=T,
                          survey_boundaries = survey_boundaries,
                          R=n_samp,
                          return=T,
                          ncpus=n_cores,
                          type='perc',
                          plot=T
                          )

Finally, we compute Poisson-lognormal model-based indices of relative abundance. These include the CPUE-based (unadjusted), hook-competition-adjusted (ICR-based), and censored likelihood-based indices.

To compute all three indices we use the function censored_index_fun. This function takes many arguments, many of which are optional. The non-optional arguments are:

data a sf points object containing the IPHC data
species a character name of the species (linked to the variables N_it in the dataframe)
survey_boundaries a sf polygons object containing the survey boundary definitions
M the number of independent Monte Carlo samples from the posterior used to compute indices
ICR_adjust A logical determining if the ICR-based scale-factor adjustment should be used
cprop the minimum proportion of baits needed to induce censorship. If cprop > 1, then the data are not censored.
use_upper_bound a logical stating if right-censored or interval-censored is desired
upper_bound_quantile a proportion stating which quantiles of observation to remove censorship from. If greater than 1, do not remove censorship from any observation
allyears logical determining if upper bound quantile is computed uniquely for each year (if False) or over all years (if TRUE)
preserve_inter_regional_differences Logical
return logical stating whether or not to return indices as a data.frame
plot logical

The optional arguments are:

nthreads how many cores do you want to run in parallel when fitting the model. Default 1
station_effects logical stating if IID station-station random effects wanted. Default TRUE
prior_event a prior distribution for the event-event IID random effect standard deviation (see HT.prior for default half t_7(0,1) distribution)
prior_station a prior distribution for the station-station IID random effect standard deviation (see HT.prior for default half t_7(0,1) distribution)
init_vals initial values for fitting model. If NULL (the default) let INLA choose them.
n_knots the number of knots used for the temporal 'spline'. Default is 8. More knots means a 'wigglier' temporal effect is allowed.
seed the seed used for Monte Carlo sampling. Note that 0 (the default) means the result is non-reproducible, but the code can be significantly faster.
verbose logical. If TRUE, show INLA diagnostics as the model fits. Can help to diagnose issues. Default FALSE
n_trajectories integer specifying how many Monte Carlo sampled relative abundance indices to plot on a spaghetti plot. This can help with interpreting uncertainty. Default value 10.
keep a logical stating if you want to return the INLA model object. It can use up memory quickly, but can be useful for diagnosing convergence issues and inspecting model fit. Default FALSE

We now fit all three indices

CPUE_ind <-
censored_index_fun(
  data=Reduced_data_sp, survey_boundaries=survey_boundaries, species='yelloweye', M=n_samp, return=T, ICR_adjust=F, cprop=1.1, nthreads=n_cores, keep=F, use_upper_bound=FALSE, upper_bound_quantile=1.1, plot=T, allyears=F, station_effects=T, prior_event=HT.prior(), prior_station=HT.prior(), init_vals = c(0.228, -2.346, 2.407, -0.586), n_knots=8, seed=0, verbose=F, n_trajectories=10,
  preserve_inter_regional_differences = F
)

ICR_ind <-
censored_index_fun(
  data=Reduced_data_sp, survey_boundaries=survey_boundaries, species='yelloweye', M=n_samp, return=T, ICR_adjust=T, cprop=1.1, nthreads=n_cores, keep=F, use_upper_bound=FALSE, upper_bound_quantile=1.1, plot=T, allyears=F, station_effects=T, prior_event=HT.prior(), prior_station=HT.prior(), init_vals = NULL, n_knots=8, seed=0, verbose=F, n_trajectories=10,
  preserve_inter_regional_differences = F
)

# The following model does not converge
# censored_ind <-
# censored_index_fun(
#   data=Reduced_data_sp, survey_boundaries=survey_boundaries, species='yelloweye', M=n_samp, return=T, ICR_adjust=F, cprop=0.95, nthreads=n_cores, keep=F, use_upper_bound=FALSE, upper_bound_quantile=1.1, plot=T, allyears=F, station_effects=T, prior_event=HT.prior(), prior_station=HT.prior(), init_vals = NULL, n_knots=8, seed=0, verbose=F, n_trajectories=10
#     )

# Use upper bound
# Model now converges
censored_ind <-
censored_index_fun(
  data=Reduced_data_sp, survey_boundaries=survey_boundaries, species='yelloweye', M=n_samp, return=T, ICR_adjust=F, cprop=0.95, nthreads=n_cores, keep=F, use_upper_bound=TRUE, upper_bound_quantile=0.95, plot=T, allyears=F, station_effects=T, prior_event=HT.prior(), prior_station=HT.prior(), init_vals = c(0.228, -2.346, 2.407, -0.586), n_knots=8, seed=0, verbose=F, n_trajectories=10,
  preserve_inter_regional_differences = F
    )

# We don't run the models in the vignette as INLA crashes GitHub Actions
# Instead, we load pre-made files here
list2env(hookCompetition::vignettes_precompiled_data, globalenv())

ggplot2::ggplot(CPUE_inddf, ggplot2::aes(x=.data$year ,y=.data$mean, ymin=.data$q0.025, ymax=.data$q0.975)) +
        ggplot2::geom_point() + ggplot2::geom_errorbar() + ggplot2::ylab('Catch rate index') + ggplot2::facet_grid(~.data$region) +
        ggplot2::ggtitle(paste0('Overdispersed ','Poisson Index ','yelloweye'))

ggplot2::ggplot(ICR_inddf, ggplot2::aes(x=.data$year ,y=.data$mean, ymin=.data$q0.025, ymax=.data$q0.975)) +
        ggplot2::geom_point() + ggplot2::geom_errorbar() + ggplot2::ylab('Catch rate index') + ggplot2::facet_grid(~.data$region) +
        ggplot2::ggtitle(paste0('Overdispersed ','ICR-Adjusted' ,' Poisson Index ','yelloweye'))

ggplot2::ggplot(censored_inddf, ggplot2::aes(x=.data$year ,y=.data$mean, ymin=.data$q0.025, ymax=.data$q0.975)) +
        ggplot2::geom_point() + ggplot2::geom_errorbar() + ggplot2::ylab('Catch rate index') + ggplot2::facet_grid(~.data$region) +
        ggplot2::ggtitle(paste0('Overdispersed ','Censored' ,' Poisson Index ','yelloweye'), subtitle = paste0('censorship proportion ',0.95, ', censorship from data in upper ',max(0,(1-0.95)*100), '% of values removed'))

ggplot2::ggplot(CPUE_trajectory, ggplot2::aes(x=.data$year ,y=.data$mean, colour=.data$MC_ind, group=.data$MC_ind)) +
          ggplot2::geom_line() + ggplot2::ylab('Catch rate index') + ggplot2::facet_grid(~.data$region) +
          ggplot2::ggtitle(paste0('Overdispersed ',' Poisson Index ','yelloweye')) +
          viridis::scale_color_viridis()

ggplot2::ggplot(ICR_trajectory, ggplot2::aes(x=.data$year ,y=.data$mean, colour=.data$MC_ind, group=.data$MC_ind)) +
          ggplot2::geom_line() + ggplot2::ylab('Catch rate index') + ggplot2::facet_grid(~.data$region) +
          ggplot2::ggtitle(paste0('Overdispersed ','ICR-Adjusted', ' Poisson Index ','yelloweye')) +
          viridis::scale_color_viridis()

ggplot2::ggplot(censored_trajectory, ggplot2::aes(x=.data$year ,y=.data$mean, colour=.data$MC_ind, group=.data$MC_ind)) +
          ggplot2::geom_line() + ggplot2::ylab('Catch rate index') + ggplot2::facet_grid(~.data$region) +
          ggplot2::ggtitle(paste0('Overdispersed ','Censored' ,' Poisson Index ','yelloweye'),subtitle = paste0('censorship proportion ',0.95, ', censorship from data in upper ',max(0,(1-0.95)*100), '% of values removed')) +
          viridis::scale_color_viridis()

The function censored_index_fun() returns a list containing up to 8 entries:

pred_overdisp a data.frame object containing the annual relative abundance indices by region
trajectory_samples a data.frame object containing n_trajectories sampled trajectories (time series).
posterior_mean a matrix containing posterior samples of $\hat{\mu}_{i,k}(s, t)$ (i.e. posterior samples of the mean of the counts - useful for posterior predictive checking using the Bayesplot package)
posterior_sample a matrix containing posterior predictive samples of $\hat{N}_{i,k}(s, t)$ (i.e. posterior sampled counts from the posterior predictive distribution - useful for posterior predictive checking using the Bayesplot package)
index_plot contains the ggplot2 object of the sampled indices
trajectory_plot contains the ggplot2 object of the sampled trajectories
preserve_inter_regional_differences a logical stating if the indices are comparable across subregions (if TRUE)
mod the returned INLA model fit.

Note that the censored Poisson indices can also be fit using the sdmTMB package using the function censored_index_fun_sdmTMB. The arguments to the function are similar, with the exclusion of priors and n_knots due to penalized splines or first-order random walk structures being placed on the temporal effects. If the relative abundance trend is assumed to be smooth - perhaps for modelling a long-living species with small home range, then set the argument smoothing_spline=TRUE. If set to FALSE, then a random walk temporal structure is assumed. This allows for larger changes to occur each year (e.g. Pacific Hake summer abundance which changes dramatically from year-year due to summer migration).

Note that the lack of priors penalizing the likelihood on the IID random effects means that we must remove 1996 and 1997 from consideration within the censored Poisson model to make it converge.

We now fit all three indices within sdmTMB

CPUE_ind_sdmTMB <-
censored_index_fun_sdmTMB(
  data=Reduced_data_sp %>% filter(year>=1998), survey_boundaries=survey_boundaries, species='yelloweye', M=n_samp, return=T, ICR_adjust=F, cprop=1.1, keep=F, use_upper_bound=FALSE, upper_bound_quantile=1.1, plot=T, allyears=F, station_effects=T, seed=0, verbose=F, n_trajectories=10,
  preserve_inter_regional_differences = F, time_effect = 'spline'
)

ICR_ind_sdmTMB <-
censored_index_fun_sdmTMB(
  data=Reduced_data_sp %>% filter(year>=1998), survey_boundaries=survey_boundaries, species='yelloweye', M=n_samp, return=T, ICR_adjust=T, cprop=1.1, keep=F, use_upper_bound=FALSE, upper_bound_quantile=1.1, plot=T, allyears=F, station_effects=T, seed=0, verbose=F, n_trajectories=10,
  preserve_inter_regional_differences = F, time_effect = 'spline'
)

censored_ind_sdmTMB <-
censored_index_fun_sdmTMB(
  data=Reduced_data_sp %>% filter(year>=1998), survey_boundaries=survey_boundaries, species='yelloweye', M=n_samp, return=T, ICR_adjust=F, cprop=0.95, keep=F, use_upper_bound=TRUE, upper_bound_quantile=0.95, plot=T, allyears=F, station_effects=T, seed=0, verbose=F, n_trajectories=10,
  preserve_inter_regional_differences = F, time_effect = 'spline'
    )

Bonus

The package also allows for spatio-temporal indices to be fit

Step 5 load the R objects required for fitting spatio-temporal models. These include: a delauney triangulation mesh for computing the GMRF approximation to the SPDE (cite Lindgren) within r-INLA (cite Rue); an INLA spde object with user-specified pc priors specified on the range and standard deviation parameters. Note that the spde object is a barrier model (cite Bakka) which accounts for land as a barrier.

The function make_spatial_objects() achieves this. It takes required arguments:

data a sf points object containing the IPHC data
survey_boundaries a sf polygons object containing the survey boundary definitions
prior_range Choose a value in kilometers such that the species density at two locations separated by distances greater than this will be `approximately' independent (typically a lower bound is chosen).
prior_range_prob What is you prior probability that the spatial range will be smaller than this value (typically a small probability is chosen to reduce the chances of over-fitting - see the Details section of ?INLA::inla.spde2.pcmatern for details)
prior_sigma Choose a value for the marginal standard deviation of the random field. Typically an upper bound is used. Remember our linear predictor is on the log scale, so think about constraining exp(+- 2*SD) in terms of multiplicative regional differences.
prior_sigma_prob What is your prior probability that the true marginal standard deviation exceeds this? (see the Details section of ?INLA::inla.spde2.pcmatern for details)

It takes optional arguments:

mesh_coast a simplified (smoothed) sf polygons object over which the triangulation mesh will be created. If NULL (default), a simplified shapefile of BC Coastline will be returned.
hires_coast a high-resolution sf polygons object used for plotting. If NULL (default), a default BC coastline object will be returned.
nx and ny How many columns of pixels to place over the survey_boundaries for plotting? Default 150, higher means smoother plots, but longer run times.

spatial_obj <- make_spatial_objects(
  data=Reduced_data_sp,
  survey_boundaries = survey_boundaries,
  mesh_coast = NULL, hires_coast = NULL,
  prior_range = 20, prior_range_prob = 0.01,
  prior_sigma = 2, prior_sigma_prob = 0.01
)

# load it into the global envirionment
list2env(spatial_obj, globalenv())
rm(spatial_obj)

ggplot2::ggplot() + inlabru::gg(mesh) + inlabru::gg(as_Spatial(Reduced_data_sp))

The function make_spatial_objects() returns a list containing X entries:

spde_mod an inla.spde2.pcmatern object used for setting up the GMRF-SPDE model
mesh an inla.mesh.2d object containing the triangulation mesh used for the random fields
triangle_ind currently NULL
Coast_hires sf polygons containing the high resolution coastline used for plotting
Coast_simp sf polygons containing the low resolution coastline used for building mesh (e.g. no islands or other land barriers)
predict_pixels sf points (pixels) object containing a regular grid of locations across survey_boundaries used for plotting and estimating the integrated intensity values for computing indices.

Step 6 Fit the spatio-temporal indices

The following code would be run to make spatio-temporal indices of yelloweye abundance. We do not run the code to save time whilst generating this vignette. Instead, we load pre-compiled objects. Note, we removed map predictions from all years except 2000 to save space.

The function spatiotemp_censored_index_fun() is used to fit the spatio-temporal indices. It takes required arguments (the new ones are described):

data a sf points object containing the IPHC data
species a character name of the species (linked to the variables N_it in the dataframe)
survey_boundaries a sf polygons object containing the survey boundary definitions
M the number of independent Monte Carlo samples from the posterior used to compute indices
ICR_adjust A logical determining if the ICR-based scale-factor adjustment should be used
cprop the minimum proportion of baits needed to induce censorship. If cprop > 1, then the data are not censored.
use_upper_bound a logical stating if right-censored or interval-censored is desired
upper_bound_quantile a proportion stating which quantiles of observation to remove censorship from. If greater than 1, do not remove censorship from any observation
allyears logical determining if upper bound quantile is computed uniquely for each year (if False) or over all years (if TRUE)
preserve_inter_regional_differences Logical
return logical stating whether or not to return indices as a data.frame
plot logical
mesh An INLA mesh object (e.g. mesh created in make_spatial_objects())
spde An INLA spde object (e.g. spde_mod created in make_spatial_objects())
pixels A sf points (pixels) object used for plotting the maps of relative abundance

The optional arguments are:

nthreads how many cores do you want to run in parallel when fitting the model. Default 1
station_effects logical stating if IID station-station random effects wanted. Default TRUE
prior_event a prior distribution for the event-event IID random effect standard deviation (see HT.prior for default half t_7(0,1) distribution)
prior_station a prior distribution for the station-station IID random effect standard deviation (see HT.prior for default half t_7(0,1) distribution)
init_vals initial values for fitting model. If NULL (the default) let INLA choose them.
n_knots the number of knots used for the temporal 'spline'. Default is 8. More knots means a 'wigglier' temporal effect is allowed.
seed the seed used for Monte Carlo sampling. Note that 0 (the default) means the result is non-reproducible, but the code can be significantly faster.
verbose logical. If TRUE, show INLA diagnostics as the model fits. Can help to diagnose issues. Default FALSE
n_trajectories integer specifying how many Monte Carlo sampled relative abundance indices to plot on a spaghetti plot. This can help with interpreting uncertainty. Default value 10.
keep a logical stating if you want to return the INLA model object. It can use up memory quickly, but can be useful for diagnosing convergence issues and inspecting model fit. Default FALSE
covs Currently not implemented. Please let me know if (spatio-temporal) environmental covariate functionality is wanted in a future version.
spatiotemporal Logical. Do you want to identify temporal changes in relative abundance uniquely across space? Almost always yes (default TRUE).

Note that these models can take a while to run (5-30 minutes). We advise you to set verbose=T so that you can assure yourself that something useful is happening behind the scenes :)

spatio_temporal_CPUE_ind <-
  spatiotemp_censored_index_fun(data=Reduced_data_sp, survey_boundaries=survey_boundaries, species='yelloweye', M=n_samp, return=T, ICR_adjust=F, cprop=1.1, nthreads=n_cores, keep=F, use_upper_bound=FALSE, upper_bound_quantile=1.1, plot=T, allyears=F, station_effects=T, prior_event=HT.prior(), prior_station=HT.prior(), init_vals = c(0.385, -2.083, 2.711, -0.861, 4.000, 0.200, 3.400, 0.300, 2.938), n_knots=8, seed=0, mesh=mesh, spde=spde_mod, pixels=predict_pixels, covs=NULL, spatiotemporal=T, verbose = F, n_trajectories = 10,
  preserve_inter_regional_differences = F)


spatio_temporal_ICR_ind <-
  spatiotemp_censored_index_fun(data=Reduced_data_sp, survey_boundaries=survey_boundaries, species='yelloweye', M=n_samp, return=T, ICR_adjust=T, cprop=1.1, nthreads=n_cores, keep=F, use_upper_bound=FALSE, upper_bound_quantile=1.1, plot=T, allyears=F, station_effects=T, prior_event=HT.prior(), prior_station=HT.prior(), init_vals = NULL, n_knots=8, seed=0, mesh=mesh, spde=spde_mod, pixels=predict_pixels, covs=NULL, spatiotemporal=T, verbose = F, n_trajectories = 10,
  preserve_inter_regional_differences = F)

spatio_temporal_censored_ind <-
  spatiotemp_censored_index_fun(data=Reduced_data_sp, survey_boundaries=survey_boundaries, species='yelloweye', M=n_samp, return=T, ICR_adjust=F, cprop=0.95, nthreads=n_cores, keep=F, use_upper_bound=TRUE, upper_bound_quantile=0.95, plot=T, allyears=F, station_effects=T, prior_event=HT.prior(), prior_station=HT.prior(), init_vals = NULL, n_knots=8, seed=0, mesh=mesh, spde=spde_mod, pixels=predict_pixels, covs=NULL, spatiotemporal=T, verbose = F, n_trajectories = 10,
  preserve_inter_regional_differences = F)

The function spatiotemp_censored_index_fun() returns a list containing up to 8 entries:

pred_spatiotemp a data.frame object containing the annual relative abundance indices by region
trajectory_samples a data.frame object containing n_trajectories sampled trajectories (time series).
posterior_mean a matrix containing posterior samples of $\hat{\mu}_{i,k}(s, t)$ (i.e. posterior samples of the mean of the counts - useful for posterior predictive checking using the Bayesplot package)
posterior_sample a matrix containing posterior predictive samples of $\hat{N}_{i,k}(s, t)$ (i.e. posterior sampled counts from the posterior predictive distribution - useful for posterior predictive checking using the Bayesplot package)
index_plot contains the ggplot2 object of the sampled indices
trajectory_plot contains the ggplot2 object of the sampled trajectories
preserve_inter_regional_differences a logical stating if the indices are comparable across subregions (if TRUE)
mod the returned INLA model fit.

Now we plot maps of yelloweye density from the three spatio-temporal models in the year 2000 using the function spatio_temporal_plot_fun() which takes arguments:

pred_df_plot the sf points (pixels) output named pred_df_plot from the function spatiotemp_censored_index_fun()
year integer specifying the desired year for plotting
variable character specifying which variable to plot (e.g. 'mean', 'sd', etc.,)
hires_COAST sf polygons object with the high resolution coastline
plot_figure Logical - Do you want to plot the figure?
return_figure Logical - Do you want to return the figure?

# plot the posterior mean yelloweye density on a map in year 2000
spatio_temporal_plot_fun(
  spatio_temporal_CPUE_plotdf, year=2000, variable='mean',
  hires_COAST=Coast_hires, plot_figure=T, return_figure=F
)

# plot the posterior mean yelloweye density on a map in year 2000
spatio_temporal_plot_fun(
  spatio_temporal_ICR_plotdf, year=2000, variable='mean',
  hires_COAST=Coast_hires, plot_figure=T, return_figure=F
)

# plot the posterior mean yelloweye density on a map in year 2000
spatio_temporal_plot_fun(
  spatio_temporal_censored_plotdf, year=2000, variable='mean',
  hires_COAST=Coast_hires, plot_figure=T, return_figure=F
)

# We don't run the models in the vignette as INLA crashes GitHub Actions
# Instead, we load pre-made files here
ggplot2::ggplot(spatio_temporal_CPUE_inddf, ggplot2::aes(x=year ,y=mean, ymin=q0.025, ymax=q0.975)) +
        ggplot2::geom_point() + ggplot2::geom_errorbar() + ggplot2::ylab('Catch rate index') + ggplot2::facet_grid(~region) +
        ggplot2::ggtitle(paste0('Spatiotemporal Overdispersed Poisson Index ','Yelloweye'))

ggplot2::ggplot(spatio_temporal_ICR_inddf, ggplot2::aes(x=year ,y=mean, ymin=q0.025, ymax=q0.975)) +
        ggplot2::geom_point() + ggplot2::geom_errorbar() + ggplot2::ylab('Catch rate index') + ggplot2::facet_grid(~region) +
                ggplot2::ggtitle(paste0('Spatiotemporal Overdispersed ICR-based Poisson Index ','Yelloweye'))

ggplot2::ggplot(spatio_temporal_censored_inddf, ggplot2::aes(x=year ,y=mean, ymin=q0.025, ymax=q0.975)) +
        ggplot2::geom_point() + ggplot2::geom_errorbar() + ggplot2::ylab('Catch rate index') + ggplot2::facet_grid(~region) +
        ggplot2::ggtitle(paste0('Spatiotemporal Overdispersed Censored Poisson Index yelloweye'), subtitle = 'censorship proportion 0.9, censorship from data in upper 10% of values removed')

ggplot2::ggplot(spatio_temporal_CPUE_trajectory, ggplot2::aes(x=year ,y=mean, colour=MC_ind, group=MC_ind)) +
          ggplot2::geom_line() + ggplot2::ylab('Catch rate index') + ggplot2::facet_grid(~region) +
          ggplot2::ggtitle(paste0('Overdispersed',' Poisson Index ','yelloweye')) +
          viridis::scale_color_viridis()

ggplot2::ggplot(spatio_temporal_ICR_trajectory, ggplot2::aes(x=year ,y=mean, colour=MC_ind, group=MC_ind)) +
          ggplot2::geom_line() + ggplot2::ylab('Catch rate index') + ggplot2::facet_grid(~region) +
          ggplot2::ggtitle(paste0('Overdispersed ICR-based',' Poisson Index ','yelloweye')) +
          viridis::scale_color_viridis()

ggplot2::ggplot(spatio_temporal_censored_trajectory, ggplot2::aes(x=year ,y=mean, colour=MC_ind, group=MC_ind)) +
          ggplot2::geom_line() + ggplot2::ylab('Catch rate index') + ggplot2::facet_grid(~region) +
          ggplot2::ggtitle(paste0('Overdispersed censored',' Poisson Index ','yelloweye'),
                  subtitle = 'Censorship defined as 95% or more baits removed \nCensorship removed from upper 10% of values each year') +          viridis::scale_color_viridis()

# Plot the pre-made files
ggplot2::ggplot() + inlabru::gg(spatio_temporal_CPUE_plotdf[which(spatio_temporal_CPUE_plotdf$year==2000), c('mean')]) + inlabru::gg(sf::as_Spatial(Coast_hires)) + colsc(as.matrix(spatio_temporal_CPUE_plotdf[which(spatio_temporal_CPUE_plotdf$year==2000), c('mean')])) + ggplot2::ggtitle('CPUE-based yelloweye abundance in year 2000')

ggplot2::ggplot() + inlabru::gg(spatio_temporal_ICR_plotdf[which(spatio_temporal_ICR_plotdf$year==2000), c('mean')]) + inlabru::gg(sf::as_Spatial(Coast_hires)) + colsc(as.matrix(spatio_temporal_ICR_plotdf[which(spatio_temporal_ICR_plotdf$year==2000), c('mean')])) + ggplot2::ggtitle('ICR-based yelloweye abundance in year 2000')

ggplot2::ggplot() + inlabru::gg(spatio_temporal_censored_plotdf[which(spatio_temporal_censored_plotdf$year==2000), c('mean')]) + inlabru::gg(sf::as_Spatial(Coast_hires)) + colsc(as.matrix(spatio_temporal_censored_plotdf[which(spatio_temporal_censored_plotdf$year==2000), c('mean')])) + ggplot2::ggtitle('Censored Poisson yelloweye abundance in year 2000',subtitle = 'Censorship defined as 95% or more baits removed \nCensorship removed from upper 10% of values each year')

Note that we can also fit spatiotemporal models in sdmTMB using spatio_temporal_censored_ind_sdmTMB(). It takes the same arguments with the exception of priors.

spatio_temporal_CPUE_ind_sdmTMB <-
  spatiotemp_censored_index_fun_sdmTMB(data=Reduced_data_sp %>% filter(year>=1998), survey_boundaries=survey_boundaries, species='yelloweye', M=n_samp, return=T, ICR_adjust=F, cprop=1.1, keep=T, use_upper_bound=FALSE, upper_bound_quantile=1.1, plot=T, allyears=F, station_effects=T, seed=0, mesh=mesh, pixels=predict_pixels, covs=NULL, spatiotemporal='AR1', verbose = T, n_trajectories = 10, preserve_inter_regional_differences = F, grf_priors = sdmTMBpriors(
  matern_s = pc_matern(range_gt = 20, sigma_lt = 2),
  matern_st = pc_matern(range_gt = 20, sigma_lt = 2)),
  time_effect = 'unstructured')


spatio_temporal_ICR_ind_sdmTMB <-
  spatiotemp_censored_index_fun_sdmTMB(data=Reduced_data_sp %>% filter(year>=1998), survey_boundaries=survey_boundaries, species='yelloweye', M=n_samp, return=T, ICR_adjust=T, cprop=1.1, keep=F, use_upper_bound=FALSE, upper_bound_quantile=1.1, plot=T, allyears=F, station_effects=T, seed=0, mesh=mesh, pixels=predict_pixels, covs=NULL, spatiotemporal='AR1', verbose = T, n_trajectories = 10,
  preserve_inter_regional_differences = F, grf_priors = sdmTMBpriors(
  matern_s = pc_matern(range_gt = 20, sigma_lt = 2),
  matern_st = pc_matern(range_gt = 20, sigma_lt = 2)),
  prev_fit = spatio_temporal_CPUE_ind_sdmTMB$mod,
  time_effect = 'unstructured')

spatio_temporal_censored_ind_sdmTMB <-
  spatiotemp_censored_index_fun_sdmTMB(data=Reduced_data_sp %>% filter(year>=1998), survey_boundaries=survey_boundaries, species='yelloweye', M=n_samp, return=T, ICR_adjust=F, cprop=0.95, keep=F, use_upper_bound=TRUE, upper_bound_quantile=0.95, plot=T, allyears=F, station_effects=T, seed=0, mesh=mesh, pixels=predict_pixels, covs=NULL, spatiotemporal='AR1', verbose = T, n_trajectories = 10,
  preserve_inter_regional_differences = F, grf_priors = sdmTMBpriors(
  matern_s = pc_matern(range_gt = 20, sigma_lt = 2),
  matern_st = pc_matern(range_gt = 20, sigma_lt = 2)),
  prev_fit = spatio_temporal_CPUE_ind_sdmTMB$mod,
  time_effect = 'unstructured')

# Remove the upper bound and upper quantile
spatio_temporal_censored_ind_sdmTMB2 <-
    spatiotemp_censored_index_fun_sdmTMB(data=Reduced_data_sp %>% filter(year>=1998), survey_boundaries=survey_boundaries, species='yelloweye', M=n_samp, return=T, ICR_adjust=F, cprop=0.95, keep=F, use_upper_bound=FALSE, upper_bound_quantile=1, plot=T, allyears=F, station_effects=T, seed=0, mesh=mesh, pixels=predict_pixels, covs=NULL, spatiotemporal='AR1', verbose = T, n_trajectories = 10,
                                         preserve_inter_regional_differences = F,
                                         grf_priors = sdmTMBpriors(
                                             matern_s = pc_matern(range_gt = 20, sigma_lt = 2),
                                             matern_st = pc_matern(range_gt = 20, sigma_lt = 2)),
                                         prev_fit = spatio_temporal_CPUE_ind_sdmTMB$mod,
  time_effect = 'unstructured')

pbs-assess/hookCompetition documentation built on April 27, 2023, 12:47 p.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

pbs-assess/hookCompetition
An R package for estimating relative abundance indices from longline data using a censored likelihood approach

In pbs-assess/hookCompetition: An R package for estimating relative abundance indices from longline data using a censored likelihood approach

How to use the hookCompetition package

Step 1 - Setting parameters for generating the Rmarkdown document

Step 2 - Loading the data.

Step 3 - Fit the indices without the spatio-temporal random effects.

Bonus

Step 6 Fit the spatio-temporal indices

R Package Documentation

Browse R Packages

We want your feedback!

pbs-assess/hookCompetition An R package for estimating relative abundance indices from longline data using a censored likelihood approach

In pbs-assess/hookCompetition: An R package for estimating relative abundance indices from longline data using a censored likelihood approach

How to use the hookCompetition package

Step 1 - Setting parameters for generating the Rmarkdown document

Step 2 - Loading the data.

Step 3 - Fit the indices without the spatio-temporal random effects.

Bonus

Step 6 Fit the spatio-temporal indices

R Package Documentation

Browse R Packages

We want your feedback!

pbs-assess/hookCompetition
An R package for estimating relative abundance indices from longline data using a censored likelihood approach