knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.width = 10, fig.height = 8 )
This vignette shows how to extract and analyse the data for predators of Haida
Gwaii Pacific Herring, including lumping species together
(e.g. skates_combined
). This information is for a DFO Ecosystem Approach to
Fisheries Management Case Study on ecoystem influences of Haida Gwaii Pacific
Herring.
The area of interest is the same as shown in the Restricted
area vignette, and the shapefile is
saved in the package as HG_herring_pred_area
.
library(gfiphc) path_data = "HG-herring-predators-data" # folder to save extracted raw data path_results = "HG-herring-predators-results" # folder to save results
These are the species (herring predators, excluding skate species) identified by Jennifer Boldt:
sp_vec <- c("yelloweye rockfish", "arrowtooth flounder", "lingcod", "north pacific spiny dogfish", "pacific cod", "pacific halibut", "redbanded rockfish", "sablefish", "walleye pollock")
We want a general rockfish category and general skates one. I just picked yelloweye and redbanded for rockfish for now.
Unidentified (species of) skates were enumerated in the early years, but then declined dramatically and had zero counts in 2001, 2002 and 2013, and presumably for 2003 onwards (but those data are in GFbio which would have to be checked -- seems clear that the identification protocol improved through time). So after investigating this (in earlier versions of this vignette), am now including unidentified skate in the species list.
Looking through 2019 synopsis report and the plots below (and a look at the data up to 2020; not doing again for 2022), here are the decisions on including each skate species:
skate_sp <- c("aleutian skate", "big skate", "roughtail skate", "sandpaper skate", "longnose skate", "alaska skate", "unidentified skate") skate_ignore <- c("abyssal skate", "broad skate")
Cache the data if needed:
cached_data = TRUE # whether or not to use already-saved data, set FALSE # for first run each year then TRUE if(!cached_data){ cache_pbs_data_iphc(sp_vec, path = path_data) cache_pbs_data_iphc(skate_sp, path = path_data) cache_pbs_data_iphc(skate_ignore, path = path_data) }
Calculate series for each skate species in turn (saved as species_name_area
)
and plot figures of Series E and F and longest from those. The fourth panel has
Series EF which is the longest time series possible; if it is blank then Series
E or F are the longest (and automatically identified). But this only happens for
the rare species, and so not when the skate species are combined.
cached_results = TRUE # whether or not to use already-saved results (FALSE for # first run) for(sp in skate_sp){ assign(sp_hyphenate(sp, underscore = TRUE, area = TRUE), iphc_get_calc_plot_area(sp, cached_data = cached_data, cached_results = cached_results, path_data = path_data, path_results = path_results)) }
Above figure shows that for 1995 and 1996 there are 'Unident. skate', and then for a few years after that (but at lower numbers). Can see from species figures above that indeed no skate species were identified for those years. And it looks like then big and longnose were identified but not others for a few years.
First looking into the raw data:
# 1995 unique(countData1995$spNameIPHC)[grep("Skate", unique(countData1995$spNameIPHC))] grep("skate", unique(countData1995$spNameIPHC)) # none # And a manual check through confirms, that all we need is "unident. Skate" #1996 to 2002 has "unident. Skate", but only up to 2000: unique(data1996to2002$spNameIPHC)[grep("Skate", unique(data1996to2002$spNameIPHC))] grep("skate", unique(data1996to2002$spNameIPHC)) # none dplyr::filter(data1996to2002, spNameIPHC == "unident. Skate") %>% tail() # And there were less unidentified through time: dplyr::filter(data1996to2002, spNameIPHC == "unident. Skate") %>% dplyr::select(year) %>% table() # 2013 (not in GFbio), no unident. Skate (checking 'skate' and 'Skate'): unique(countData2013$spNameIPHC)[grep("Skate", unique(countData2013$spNameIPHC))] grep("skate", unique(countData2013$spNameIPHC)) # none # 2003-2019 (except 2013) - in GFbio # See Issue #16 - not sure what the species code would be, but there may not be # one if they were all identified after 2000. # 2020 (not in GFbio), no unident. Skate (checking 'skate' and 'Skate'): unique(countData2020$spNameIPHC)[grep("Skate", unique(countData2020$spNameIPHC))] grep("skate", unique(countData2020$spNameIPHC)) # none # 2021 (not in GFbio), no unident. Skate (checking 'skate' and 'Skate'): unique(countData2021$spNameIPHC)[grep("Skate", unique(countData2021$spNameIPHC))] grep("skate", unique(countData2021$spNameIPHC)) # none # 2022 (not in GFbio), no unident. Skate (checking 'skate' and 'Skate'): unique(countData2022$spNameIPHC)[grep("Skate", unique(countData2022$spNameIPHC))] grep("skate", unique(countData2022$spNameIPHC)) # none
So, given none in 2001 or 2002 and numbers of unidentified declined from 1996 to 2000, it's probably the case that they were all identified from 2001 onwards. See Issue #16.
So, we can use unidentified skate
as a species name (done above). Load in and
examine that data:
unidentified_skate_area$sp_set_counts_with_area %>% dplyr::select(year) %>% table()
But there no positive counts after 2000, looking at first 20 hooks (Series E) and then all hooks (Series F).
dplyr::select(unidentified_skate_area$ser_E_and_F$ser_E, year, num_pos20) dplyr::select(unidentified_skate_area$ser_E_and_F$ser_F, year, num_pos)
Now want to lump the skates in skate_sp
together. Note that species counts get
named as N_it.1
etc. While each one has NA's, none of the summed N_it_sum
or
N_it20_sum
do, because get_combined_species()
ignores NA's when making the
sums. This is always recalculated (in case earlier cached data have changed).
sp <- "skates combined" skates_combined <- get_combined_species(sp_vec = skate_sp, path = path_data, save_RDS_name = sp_hyphenate(sp)) summary(skates_combined$set_counts) # How many NA's in certain columns (none for N_it_sum or N_it20_sum skates_combined$set_counts %>% dplyr::select(-c("station", "lat", "lon", "usable", "standard")) %>% dplyr::summarise_all( ~ sum(is.na(.x))) %>% t()
For 2021: these correctly change from the 2020 values (the 20-hook values don't
have more NA's than in 2020, but the all-hook values do, since we have no
all-hook data in 2021), except for N_it20.4
for Sandpaper Skate, which had
1500 NA's in 2020 version but only 244 (same as some other species) in 2021
version. Looking at the relevant figure above, this is because 1997-2002 and
2013 are now zero when before they were NA, due to 2021 being the first
time this species was seen in 20-hook data (and how the data get rolled up). Not
checking for 2022.
Note that by simply adding the counts of each skate species, we are implicitly assuming that selectivity is the same across skate species (kind of?). So true changes in relative abundance may not be accurately reflected in changes in the summed index.
Now use to calculate Series E, F and EF calculations for the restricted area:
assign(sp_hyphenate(sp, underscore = TRUE, area = TRUE), iphc_get_calc_plot_area(sp, cached_data = cached_data, cached_results = cached_results, path_data = path_data, path_results = path_results))
Similarly for the other species:
for(sp in sp_vec){ assign(sp_hyphenate(sp, underscore = TRUE, area = TRUE), iphc_get_calc_plot_area(sp, cached_data = cached_data, cached_results = cached_results, path_data = path_data, path_results = path_results)) }
Save annual indices (with uncertainty) for all species of interest, in the
format for the HG herring predators' EAFM summary (some 2020 versions are in for-jen-earlier-versions/
,
each later year will be saved along with everything else in vignettes-results-20**-data/
.
all_sp_longest <- list() sp_vec_and_combined_skates <- c(sp_vec, "skates combined") # Do this to easily create arguments for iphc_format_for_EAFM(); can't easily # functionalise. Each element is names for the species, and is a tibble of # longest time series. for(sp in sp_vec_and_combined_skates){ all_sp_longest[[sp]] <- get(paste0(sp_hyphenate(sp, underscore = TRUE, area = TRUE)))$series_longest$ser_longest } eafm_res <- iphc_format_for_EAFM(sp_vec_and_combined_skates, path = path_results, all_sp_longest) # also saves .csv, and .rds of # all_sp_longest list eafm_res
Note that the analyses may not yet fully account for all possibilities (for all areas and species), particularly for rarer species. So do check that the output is sensible by inspecting the above time series figures. And the analyses do not (yet) include hook competition, which is currently being investigated.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.