knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width = 10,
  fig.height = 8
)

This vignette shows how to extract and analyse the data for predators of Haida Gwaii Pacific Herring, including lumping species together (e.g. skates_combined). This information is for a DFO Ecosystem Approach to Fisheries Management Case Study on ecoystem influences of Haida Gwaii Pacific Herring.

The area of interest is the same as shown in the Restricted area vignette, and the shapefile is saved in the package as HG_herring_pred_area.

library(gfiphc)
path_data = "HG-herring-predators-data"        # folder to save extracted raw data
path_results = "HG-herring-predators-results"  # folder to save results

Data for species of interest

These are the species (herring predators, excluding skate species) identified by Jennifer Boldt:

sp_vec <- c("yelloweye rockfish",
            "arrowtooth flounder",
            "lingcod",
            "north pacific spiny dogfish",
            "pacific cod",
            "pacific halibut",
            "redbanded rockfish",
            "sablefish",
            "walleye pollock")

We want a general rockfish category and general skates one. I just picked yelloweye and redbanded for rockfish for now.

Skate species

Unidentified (species of) skates were enumerated in the early years, but then declined dramatically and had zero counts in 2001, 2002 and 2013, and presumably for 2003 onwards (but those data are in GFbio which would have to be checked -- seems clear that the identification protocol improved through time). So after investigating this (in earlier versions of this vignette), am now including unidentified skate in the species list.

Looking through 2019 synopsis report and the plots below (and a look at the data up to 2020; not doing again for 2022), here are the decisions on including each skate species:

skate_sp <- c("aleutian skate",
              "big skate",
              "roughtail skate",
              "sandpaper skate",
              "longnose skate",
              "alaska skate",
              "unidentified skate")

skate_ignore <- c("abyssal skate",
                  "broad skate")

Cache the data if needed:

cached_data = TRUE     # whether or not to use already-saved data, set FALSE
                       #  for first run each year then TRUE
if(!cached_data){
  cache_pbs_data_iphc(sp_vec,
                      path = path_data)
  cache_pbs_data_iphc(skate_sp,
                      path = path_data)
  cache_pbs_data_iphc(skate_ignore,
                      path = path_data)
}

Individual skate species

Calculate series for each skate species in turn (saved as species_name_area) and plot figures of Series E and F and longest from those. The fourth panel has Series EF which is the longest time series possible; if it is blank then Series E or F are the longest (and automatically identified). But this only happens for the rare species, and so not when the skate species are combined.

cached_results = TRUE   # whether or not to use already-saved results (FALSE for
                        # first run)
for(sp in skate_sp){
  assign(sp_hyphenate(sp,
                      underscore = TRUE,
                      area = TRUE),
         iphc_get_calc_plot_area(sp,
                                 cached_data = cached_data,
                                 cached_results = cached_results,
                                 path_data = path_data,
                                 path_results = path_results))
}

Unidentified skate

Above figure shows that for 1995 and 1996 there are 'Unident. skate', and then for a few years after that (but at lower numbers). Can see from species figures above that indeed no skate species were identified for those years. And it looks like then big and longnose were identified but not others for a few years.

First looking into the raw data:

# 1995
unique(countData1995$spNameIPHC)[grep("Skate", unique(countData1995$spNameIPHC))]
grep("skate", unique(countData1995$spNameIPHC)) # none
# And a manual check through confirms, that all we need is "unident. Skate"

#1996 to 2002 has "unident. Skate", but only up to 2000:
unique(data1996to2002$spNameIPHC)[grep("Skate",
                                       unique(data1996to2002$spNameIPHC))]
grep("skate", unique(data1996to2002$spNameIPHC)) # none
dplyr::filter(data1996to2002,
              spNameIPHC == "unident. Skate") %>%
  tail()

# And there were less unidentified through time:
dplyr::filter(data1996to2002,
              spNameIPHC == "unident. Skate") %>%
  dplyr::select(year) %>%
  table()


# 2013 (not in GFbio), no unident. Skate (checking 'skate' and 'Skate'):
unique(countData2013$spNameIPHC)[grep("Skate", unique(countData2013$spNameIPHC))]
grep("skate", unique(countData2013$spNameIPHC)) # none

# 2003-2019 (except 2013) - in GFbio
# See Issue #16 - not sure what the species code would be, but there may not be
# one if they were all identified after 2000.

# 2020 (not in GFbio), no unident. Skate (checking 'skate' and 'Skate'):
unique(countData2020$spNameIPHC)[grep("Skate", unique(countData2020$spNameIPHC))]
grep("skate", unique(countData2020$spNameIPHC)) # none

# 2021 (not in GFbio), no unident. Skate (checking 'skate' and 'Skate'):
unique(countData2021$spNameIPHC)[grep("Skate", unique(countData2021$spNameIPHC))]
grep("skate", unique(countData2021$spNameIPHC)) # none

# 2022 (not in GFbio), no unident. Skate (checking 'skate' and 'Skate'):
unique(countData2022$spNameIPHC)[grep("Skate", unique(countData2022$spNameIPHC))]
grep("skate", unique(countData2022$spNameIPHC)) # none

So, given none in 2001 or 2002 and numbers of unidentified declined from 1996 to 2000, it's probably the case that they were all identified from 2001 onwards. See Issue #16.

So, we can use unidentified skate as a species name (done above). Load in and examine that data:

unidentified_skate_area$sp_set_counts_with_area %>%
  dplyr::select(year) %>%
  table()

But there no positive counts after 2000, looking at first 20 hooks (Series E) and then all hooks (Series F).

dplyr::select(unidentified_skate_area$ser_E_and_F$ser_E,
              year,
              num_pos20)
dplyr::select(unidentified_skate_area$ser_E_and_F$ser_F,
              year,
              num_pos)

Merge together skate species

Now want to lump the skates in skate_sp together. Note that species counts get named as N_it.1 etc. While each one has NA's, none of the summed N_it_sum or N_it20_sum do, because get_combined_species() ignores NA's when making the sums. This is always recalculated (in case earlier cached data have changed).

sp <- "skates combined"
skates_combined <- get_combined_species(sp_vec = skate_sp,
                                        path = path_data,
                                        save_RDS_name = sp_hyphenate(sp))
summary(skates_combined$set_counts)

# How many NA's in certain columns (none for N_it_sum or N_it20_sum
skates_combined$set_counts %>%
  dplyr::select(-c("station", "lat", "lon", "usable", "standard")) %>%
  dplyr::summarise_all( ~ sum(is.na(.x))) %>%
  t()

For 2021: these correctly change from the 2020 values (the 20-hook values don't have more NA's than in 2020, but the all-hook values do, since we have no all-hook data in 2021), except for N_it20.4 for Sandpaper Skate, which had 1500 NA's in 2020 version but only 244 (same as some other species) in 2021 version. Looking at the relevant figure above, this is because 1997-2002 and 2013 are now zero when before they were NA, due to 2021 being the first time this species was seen in 20-hook data (and how the data get rolled up). Not checking for 2022.

Note that by simply adding the counts of each skate species, we are implicitly assuming that selectivity is the same across skate species (kind of?). So true changes in relative abundance may not be accurately reflected in changes in the summed index.

Now use to calculate Series E, F and EF calculations for the restricted area:

assign(sp_hyphenate(sp, underscore = TRUE, area = TRUE),
       iphc_get_calc_plot_area(sp,
                               cached_data = cached_data,
                               cached_results = cached_results,
                               path_data = path_data,
                               path_results = path_results))

Individual non-skate species

Similarly for the other species:

for(sp in sp_vec){
  assign(sp_hyphenate(sp, underscore = TRUE, area = TRUE),
         iphc_get_calc_plot_area(sp,
                                 cached_data = cached_data,
                                 cached_results = cached_results,
                                 path_data = path_data,
                                 path_results = path_results))
}

Format for Jennifer's EAFM summary

Save annual indices (with uncertainty) for all species of interest, in the format for the HG herring predators' EAFM summary (some 2020 versions are in for-jen-earlier-versions/, each later year will be saved along with everything else in vignettes-results-20**-data/.

all_sp_longest <- list()
sp_vec_and_combined_skates <- c(sp_vec, "skates combined")
# Do this to easily create arguments for iphc_format_for_EAFM(); can't easily
#  functionalise. Each element is names for the species, and is a tibble of
#  longest time series.
for(sp in sp_vec_and_combined_skates){
  all_sp_longest[[sp]] <- get(paste0(sp_hyphenate(sp,
                                                  underscore = TRUE,
                                                  area = TRUE)))$series_longest$ser_longest
}

eafm_res <- iphc_format_for_EAFM(sp_vec_and_combined_skates,
                                 path = path_results,
                                 all_sp_longest) # also saves .csv, and .rds of
                                                 #  all_sp_longest list

eafm_res

Note that the analyses may not yet fully account for all possibilities (for all areas and species), particularly for rarer species. So do check that the output is sensible by inspecting the above time series figures. And the analyses do not (yet) include hook competition, which is currently being investigated.



pbs-assess/gfiphc documentation built on July 4, 2023, 1:13 p.m.