In noaa-garfo/discaRd: Cochran bycatch estimation

knitr::opts_chunk$set(echo = TRUE)

library(tidyverse)
library(odbc)
library(ROracle)
library(dplyr, warn.conflicts = FALSE)

# # unlock keyring
# keyring::keyring_unlock("apsd_ma")
# 
# # local run
# # dw_apsd <- config::get(value = "apsd", file = "K:/R_DEV/config.yml")
# 
# # if on server..
# # dw_apsd <- config::get(value = "maps", file = "~/config.yml")
# 
# 
# # connect to MAPS
# 
# con_maps = apsdFuns::roracle_login(key_name = 'apsd_ma', key_service = 'maps')

Sys.setenv(TZ = "America/New_York")
Sys.setenv(ORA_SDTZ = "America/New_York")

if(file.exists(here::here("vignettes/articles", "pw.csv"))) {
  pw_df <- readr::read_csv(here::here("vignettes/articles", "pw.csv"), col_types = "c")
  pw <- pw_df$pw
} else {
  pw <- rstudioapi::askForPassword("Enter keyring password")
}

keyring::keyring_unlock("apsd_ma", password = pw) # will prompt for password
con_maps <- apsdFuns::roracle_login(key_name = 'apsd_ma', key_service = 'maps')

Summary

This is the main observer and catch matching table used for CAMS discard estimation.

Description

OUTPUT TABLE: CAMS_OBS_CATCH
YEARS: 2017+
RESOLUTION: VTRSERNO (subtrip)~LINK1
DEVELOPMENT LANGUAGE: SQL
CODE: https://github.com/NOAA-Fisheries-Greater-Atlantic-Region/discaRd/tree/model_estimator/CAMS

Data Sources

CAMS_LANDINGS (GARFO)
CAMS prorated observer data
- NEFOP (NEFSC)
- ASM (NEFSC)

DiagrammeR::mermaid("
  graph LR
  NEFOP --> CAMS_OBDBS_YYYY
  ASM --> CAMS_OBDBS_YYYY
  SECGEAR_MAPPED --> CAMS_OBS_CATCH
  CAMS_LANDINGS --> CAMS_OBS_CATCH
  CAMS_OBDBS_YYYY --> CAMS_OBS_CATCH
        ")

Approach

The use of a combined catch and observation table allows for a single source table to be used in discard estimation. Previous methods took a two table approach, where catch information and observer records were stratified independently, and then matched to calculate discard (e.g., $D = K*d/k$). This approach had the possibility of mismatches between observed strata and trip strata. In reality, this cannot occur. We therefore do trip by trip matching, using gear, meshgroup, statistical area, and LINK1, to match observer recorded species discards with commercial trip activity.

The primary driver for this approach was to use the trip recorded metrics as the stratification source. This reduces the possibility of mismatches and removes much of the hard-coding of the past. Upfront matching also allows observed discards to easily be used as the official discard for a particular trip. Furthermore, we recognize that data errors either from the catch, or observer data, will result in a non-match. This likely reduces the total pool of observed trips that are being used, but we feel the benefits of using this approach outweigh a reduced sample size. Quality control of these data are outside the purview of the CAMS project. Last, we only use observed trips (LINK1) where valid LINK3 (hauls) occurred to alleviate issues of multiple LINK1 records for a single subtrip.

The matching occurs in a staged manner. All commercial trips with a non-null LINK1 field get a value for how many unique VTR serial numbers are associated with it. The vast majority of observed trips have a single VTR, and only require matching by LINK1. For multiple VTR trips, a match as described above is used (gear, meshgroup, statistical area, and LINK1). Further, we only match on statistical area when more than one area is reported for a trip. This reduces data loss from observer recorded area mismatches to VTR recorded area.

meshgroup has been defined several ways depending on the data stream. CAMS initiated a subgroup which determined the most pragmatic definition for meshgroup based on clustering of recorded mesh sizes in both observer and cacth records.

All nets:

small : < 3.99 (inches)
large : >= 4.00 (inches)

Gill Nets:

extra large >= 8 (inches)

gear groupings for matching purposes required a mapping of NEGEAR codes from observed and commercial trips to VTR three-character gear codes (e.g. DRS, PTO, etc). The relationship in our database tables (VLGEAR, FVTR_GEAR) are many-to-many and do not map easily. Furthermore, there are several NEGEAR codes in VTR that do not occur in observer records, and vice versa. Therefore, a support table, MAPS.CFG_OBS_VTR_GEARMAP was constructed to facilitate gear matching.

Prorated discard

Following the standard approach taken for groundfish quota monitoring, observed discards on unobserved hauls are prorated within a subtrip. This is done by applying a ratio of kept all on the entire trip to kept all on the unobserved hauls only:

$$d_{total} = d_{observedhauls}*(1+KALL_{unobserved hauls}/KALL_{subtrip})$$

IMPORTANT!: The CAMS_OBS_CATCH table is a hybrid. For trips that were not observed, there will be a single row, with all trip metrics, and a total KALL per subtrip. When a trip was observed, there are multiple rows, where the trip metric information is repeated and each row shows species, discarded amount, and other observer recorded information for each row. Total KALL CANNOT be calculated without filtering rows by LINK1 to indicate an observed trip or not. These steps are outlined in subsequent R modules used to run discaRd.

Data Dictionary

cflen_comments <- ROracle::dbGetQuery(con_maps, "
SELECT a.COLUMN_NAME, a.COMMENTS, b.data_type 
FROM all_col_comments a,  all_tab_cols b
WHERE a.owner = 'MAPS' 
AND a.table_name = 'CAMS_OBS_CATCH'
AND a.COLUMN_NAME = b.COLUMN_NAME
AND b.table_name = 'CAMS_OBS_CATCH'
"
                                                                            )
cflen_comments %>%
    dplyr::select(
        COLUMN = COLUMN_NAME,
        DESCRIPTION = COMMENTS
        , `DATA TYPE` = DATA_TYPE
    ) %>%
    knitr::kable(booktabs = TRUE) %>%
    kableExtra::kable_styling(bootstrap_options = c("striped", 
                                                                                                    "hover", 
                                                                                                    "condensed", 
                                                                                                    "responsive"),
                                                        full_width = FALSE,
                                                        position = "left",
                                                        fixed_thead = TRUE,
                                                        font_size = 10)