In ryankinzer/PITviewr: Summarize and View PIT-tag Data

knitr::opts_chunk$set(eval=FALSE, echo=TRUE, warning=FALSE, message=FALSE)

Load Package

The current beta version of the PITviewr R package is stored in Ryan Kinzer's GitHub repository. The repository was originally forked from Mike Ackerman's working repository; if the new additions are acceptable, they will be merged into Mike's version for production. To obtain the beta version, first install Hadley Wickham's devtools package from CRAN. The function install_github() will then retrieve the latest copy of DABOM from GitHub and install it to a local R library.

install.packages('devtools')
devtools::install_github("ryankinzer/DABOM")

After installing DABOM, load the package into the current R session with library('DABOM') before using any of its built-in functions. The package also requires the R packages dplyr, lubridate and RODBC to be installed.

library(tidyverse)
library(DABOM)

To prepare DABOM input data, it's first necessary to save the Lower Granite trapping data to a 'local' folder on your computer. The data can be retrieved from the "QCI Downloads" folder on Idaho Fish and Game's IFWIS website. The trapping data is stored in a Microsoft Access database that is updated weekly and named "LGTrappingExportJodyW.accdb". Permission must be obtained from Idaho Fish and Game before viewing, accessing or retrieving data from the site. So that the package can be run without access to the database, an example dataset, "chinook15_trapdata", is provided with the package.

If the full Access database is saved to a 'local' folder, the commented-out LGTrapData() call can be used to retrieve all trapping data matching the selected species and spawn year.

#------------------------------------------------------------------------------
# load trap data
#------------------------------------------------------------------------------
View(chinook15_trapdata)
# chinook15_trapdata <- LGTrapData(filename = 'dbase_path',
#                        species = 'Chinook',
#                        spawnyear = '2015')

The LGTrapData() function opens an ODBC connection with the database using the RODBC package. After the connection is made, the database is queried for only those records matching the selected species (Chinook or Steelhead), spawn year and having a lifestage designation of 'RF' for returning fish. The function returns an R data frame object.

The next function, validTags(), subsets the data frame returned from LGTrapData() to individual fish that are considered valid for the branching model input. The supplied data frame is trimmed by three main fields: LGDValid, LGDMarkAD, and LGDNumPIT. Valid records for model processing have LGDValid = 1, LGDMarkAD = 'AI' and a non-blank LGDNumPIT. In addition, if the supplied trap data contains Chinook records, only those records with a run type of '5' in the SRR field are kept.

#------------------------------------------------------------------------------
# select valid tag records only
# if you don't have access to the database, an example data set is installed
# with the package
# you could load RO's original tag list, but you would also need to change
# a couple field names.
#------------------------------------------------------------------------------
valid_tags <- validTags(trapdata = chinook15_trapdata)
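
The filtering rules described for validTags() can be illustrated on a toy data frame. This is only a sketch in base R, not the package's implementation: the rows are invented, and the Chinook run-type check on SRR is left out because the layout of that field is not described here.

```r
# Toy trap records (invented values, for illustration only)
toy_trap <- data.frame(
  TagID     = c("A1", "A2", "A3", "A4"),
  LGDValid  = c(1, 1, 0, 1),
  LGDMarkAD = c("AI", "AD", "AI", "AI"),
  LGDNumPIT = c("3D9.0001", "3D9.0002", "3D9.0003", ""),
  stringsAsFactors = FALSE
)

# Keep records flagged valid, with LGDMarkAD equal to "AI",
# and with a non-blank LGDNumPIT
keep <- toy_trap$LGDValid == 1 &
  toy_trap$LGDMarkAD == "AI" &
  !is.na(toy_trap$LGDNumPIT) & toy_trap$LGDNumPIT != ""

toy_valid <- toy_trap[keep, ]
toy_valid$TagID
```

Only the first toy record passes all three conditions; each of the others fails exactly one of them.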

The valid tag numbers now need to be exported into a '.txt' file for loading into PTAGIS to perform a 'Complete Tag History' query (in the future maybe we can write a function to do this step through R using DART's or PTAGIS's open API). We only need to export the contents of the TagID field which contains the unique PIT-tag codes and save them to your working directory.

tag_codes <- valid_tags$TagID
# write.table(tag_codes, file = '.../tag_codes.txt',quote = FALSE, sep = '\t',
#               row.names = FALSE, col.names = FALSE)
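
Because the export only needs one tag code per line, writeLines() is a possible alternative to write.table(). The sketch below uses made-up tag codes and a temporary file rather than the working directory; the call to unique() is a precaution added for illustration, not something the package is known to do.

```r
# Made-up tag codes standing in for valid_tags$TagID
example_codes <- c("3D9.1C2D000001", "3D9.1C2D000002", "3D9.1C2D000002")

# Drop any repeated codes, then write one code per line
out_file <- tempfile(fileext = ".txt")
writeLines(unique(example_codes), out_file)

# Read the file back to confirm its contents
written_codes <- readLines(out_file)
written_codes
```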

After logging into PTAGIS, select the "Data" tab, "Advanced Reporting" and then the "Launch" button. Now, select the "New Query Builder2 Report" icon and the "Complete Tag History" option. The DABOM functions require the following attributes to be selected: Tag, Event Date Time, Event Site Code, Antenna, and Antenna Group Configuration. Other attributes can also be selected, and the attributes can appear in any order. After selecting attributes, upload tag_codes.txt in the 'Tag Code - List or Text File' section, then run the report and save the output. Next, read the resulting observation file into R or use the example observation file, "chinook15_obs".

#------------------------------------------------------------------------------
# load tag observation file
#------------------------------------------------------------------------------
View(chinook15_obs)
# Examples to read in the observation data
# chinook15_obs <- readxl::read_excel('filepath')
# chinook15_obs <- readr::read_csv('filepath')
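
Before moving on, it can help to confirm that the observation file contains the attributes listed above. The following is a minimal sketch: missingColumns() is a hypothetical helper, not part of the package, and the column headers are assumed to match the attribute names exactly, which may not hold for a real PTAGIS export.

```r
# Attributes the text lists as required; the exact column headers in a
# PTAGIS export are an assumption here
required_cols <- c("Tag", "Event Date Time", "Event Site Code",
                   "Antenna", "Antenna Group Configuration")

# Hypothetical helper: returns the required columns missing from a data frame
missingColumns <- function(obs, required = required_cols) {
  setdiff(required, names(obs))
}

# Toy observation table (invented values) lacking one attribute
toy_obs <- data.frame(
  "Tag"             = "3D9.0001",
  "Event Date Time" = "2015-06-01 12:00:00",
  "Event Site Code" = "SITE1",
  "Antenna"         = "01",
  check.names = FALSE,
  stringsAsFactors = FALSE
)

missingColumns(toy_obs)
```

An empty result would indicate that every required attribute is present.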

Now load the two configuration files to assign model nodes to each record in the observation file and to build a table of all acceptable fish pathways. Example datasets are provided for the site-configuration file and the parent-child table.

#------------------------------------------------------------------------------
# load configuration and parent-child files
#------------------------------------------------------------------------------
View(config)
View(parentchild)

Running the function nodeAssign() next completes two tasks. First, the function assigns model nodes based on the unique combinations of SiteID, AntennaID and ConfigID listed in the configuration file. If a node is not identified in the configuration file, the function will not terminate, but 'ERROR' will appear in the "Node" field of the observation table. The second task is completed only if truncate = TRUE is selected, which removes all tag observation records that occur prior to the individual's capture at the Lower Granite trap or that occur at invalid nodes. In addition, truncate = TRUE keeps only the record with the minimum observation date out of each group of observations occurring at a single node during the same observation event (i.e., multiple detections at a single node that are not interrupted by a detection at a different node). Truncating the data to only the observation records essential for the branch model greatly reduces the processing time for validating the travel paths of each individual fish at a later step. However, this function still requires some computing time because it uses a for loop to identify minimum dates. (Maybe we can change this later.)

valid_obs_dat <- nodeAssign(valid_tags = valid_tags, 
                            observation = chinook15_obs,
                            configuration = config, 
                            truncate = TRUE)
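
The two tasks can be illustrated on toy data. This sketch is not the code inside nodeAssign(): the tables are invented, it handles a single fish whose detections are already sorted by time, and it uses rle() in place of a for loop to find the first record of each uninterrupted run at a node. The ObsDate column name is an assumption.

```r
# Toy configuration: one node per SiteID/AntennaID/ConfigID combination
toy_config <- data.frame(
  SiteID    = c("S1", "S1", "S2"),
  AntennaID = c("01", "02", "01"),
  ConfigID  = c(100, 100, 100),
  Node      = c("S1B0", "S1A0", "S2B0"),
  stringsAsFactors = FALSE
)

# Toy detections for one fish, sorted by time (ObsDate is an assumed name)
toy_det <- data.frame(
  TagID     = "T1",
  SiteID    = c("S1", "S1", "S2", "S1", "S9"),
  AntennaID = "01",
  ConfigID  = 100,
  ObsDate   = as.POSIXct("2015-06-01 00:00:00", tz = "UTC") + 3600 * (0:4),
  stringsAsFactors = FALSE
)

# Task 1: match on the three key fields; unmatched rows are labeled "ERROR"
key <- function(d) paste(d$SiteID, d$AntennaID, d$ConfigID, sep = "|")
toy_det$Node <- toy_config$Node[match(key(toy_det), key(toy_config))]
toy_det$Node[is.na(toy_det$Node)] <- "ERROR"

# Task 2 (in part): collapse uninterrupted runs at the same node, keeping
# the earliest record of each run (the first row, since rows are time-sorted)
runs <- rle(toy_det$Node)
first_in_run <- cumsum(c(1, head(runs$lengths, -1)))
toy_trunc <- toy_det[first_in_run, ]
toy_trunc[, c("Node", "ObsDate")]
```

The two consecutive detections at the first node collapse to one row, while the later return to that node is kept as a separate observation event. The site missing from the toy configuration is labeled 'ERROR' rather than stopping the run.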

The following function, validPaths(), uses the parent-child table and maps all available fish paths from a single parent (i.e., path root) to each node. The function then builds a character string which labels all nodes a fish would pass to reach the observed location. The function was created by Greg Kliewer and originally worked by querying a SQLite database which housed the data.

#------------------------------------------------------------------------------
# create a data frame with all valid paths identified in the parent child table
valid_paths <- validPaths(parent_child = parentchild)
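
The idea of turning a parent-child table into path strings can be sketched as follows. This is not Greg's implementation: buildPath() is a hypothetical helper, the ParentNode and ChildNode column names are assumed, the node names are invented, and the sketch assumes each child has exactly one parent and the table contains no cycles.

```r
# Toy parent-child table (assumed column names, invented nodes)
toy_pc <- data.frame(
  ParentNode = c("ROOT", "ROOT", "B1", "B1"),
  ChildNode  = c("A1", "B1", "B2", "B3"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: walk from a node up to the root, then paste the
# nodes together root-first
buildPath <- function(node, pc) {
  path <- node
  parent <- pc$ParentNode[match(node, pc$ChildNode)]
  while (!is.na(parent)) {
    path <- c(parent, path)
    parent <- pc$ParentNode[match(parent, pc$ChildNode)]
  }
  paste(path, collapse = " ")
}

# One path string per child node
toy_paths <- data.frame(
  Node = toy_pc$ChildNode,
  Path = vapply(toy_pc$ChildNode, buildPath, character(1),
                pc = toy_pc, USE.NAMES = FALSE),
  stringsAsFactors = FALSE
)
toy_paths
```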

We now have a data frame with the minimum number of valid observations which need to be checked against all the valid fish pathways. Two functions exist to check the observations and label them as acceptable observations or fish routes to the spawning grounds or as observations needing to be checked by the biologist.

The first function was originally created by Greg Kliewer and worked within a SQLite database. I followed the steps/logic that Greg developed and mimicked his code and output using only R commands, thus, eliminating the need for the SQLite backend. (Takes a little time because it loops through all the obs and then writes a call to a dataframe one obs at a time.)

#------------------------------------------------------------------------------
# Greg's original process and logic
#------------------------------------------------------------------------------
fish_obs <- fishPaths(valid_obs_dat, valid_paths)

# fish_obs %>%
#   filter(UserProcStatus == '') %>%
#   summarise(n_distinct(TagID))

The second function identifies the direction of travel for each individual fish by checking the current and previous node locations against the valid path strings. In addition, if the previous node is not found in the current path string, the observation is labeled as invalid (i.e., this detects fish being observed in two different paths or tributaries). Then only upstream-moving records with the minimum observation date at each unique node, occurring prior to the observed node with the highest upstream order, are identified as "ModelObs = TRUE".

# figure out if fish movement is upstream or downstream, and in the same path
# then truncate to only model observations
# RK and Rick thought process.
fish_obs2 <- spawnerPaths(valid_obs_dat, valid_paths)

# fish_obs2 %>%
#   filter(is.na(Direction)) %>%
#   summarise(n_distinct(TagID))
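
One possible reading of the direction check is sketched below. It is not the code inside spawnerPaths(): moveDirection() is a hypothetical helper, the path strings and node names are invented, and the labels it returns are chosen for illustration only.

```r
# Toy path strings: nodes from the root to each node, separated by spaces
toy_path_strings <- c(
  ROOT = "ROOT",
  A1   = "ROOT A1",
  B1   = "ROOT B1",
  B2   = "ROOT B1 B2"
)

# Hypothetical helper: classify the move from node `prev` to node `curr`
moveDirection <- function(prev, curr, paths) {
  curr_path <- strsplit(paths[[curr]], " ", fixed = TRUE)[[1]]
  prev_path <- strsplit(paths[[prev]], " ", fixed = TRUE)[[1]]
  if (prev == curr) {
    "same node"
  } else if (prev %in% curr_path) {
    "upstream"     # previous node lies on the way to the current node
  } else if (curr %in% prev_path) {
    "downstream"   # current node lies on the way to the previous node
  } else {
    "invalid"      # the two nodes sit on different paths
  }
}

moveDirection("B1", "B2", toy_path_strings)
moveDirection("B2", "B1", toy_path_strings)
moveDirection("A1", "B2", toy_path_strings)
```

The third call represents a fish detected in two different branches, which is the case the text describes as invalid.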

Finally, the results of both cleaning functions can be written to the working directory or desired file path using base R functions.

write.csv(fish_obs, file = './data/Data_Output/fishPaths_fnc_output.csv')
write.csv(fish_obs2, file = './data/Data_Output/spawnerPaths_fnc_output.csv')

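# readr alternative; the output file is passed by position because the
# argument name differs between readr versions (path vs. file)
# write_csv(fish_obs, '../data/Data_Output/fishPaths_fnc_output.csv')
# write_csv(fish_obs2, '../data/Data_Output/spawnerPaths_fnc_output.csv')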