knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

Background

This vignette covers the data and database workflows of the Yolo Bypass Telemetry Project (YBT).

There are four types of data associated with the YBT data and database workflows: tagging data, detection data, deployment data, and shed tag/migratory fate data. All raw data for the database is contained in the YB_SQL_BaseTables subdirectory of the project folder on Dropbox. All templates for the data types are housed in the DatabaseTemplates subfolder of the SOPs subdirectory.

The database is already set up with past years' data, current through fish tagged in fall of 2019. This means that in general, the ongoing data and database workflow is: 1) collect/digitize new data, and 2) append that data to the database with specialized functions in the ybt R package.

Things to remember for all data and files:

  1. The column names of templates should not be changed without consultation, as much of the database building and cleaning code depends on these variable names.

  2. Data from past detection years are not necessarily stored or named the same way as current data.

  3. All times should be in Pacific Standard Time (PST, UTC-8) year-round, not clock time (that is, with no daylight saving adjustment), and the format is YYYY-MM-DD HH:MM:SS. When entering dates and times in Excel, enter them in this format and convert the cells to TEXT, because the SQLite database expects them in this format.
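As a minimal sketch of the time convention above, the following base R code converts a clock-time timestamp to fixed PST text. It assumes the year-month-day order that SQLite's date functions expect; the example timestamp and the helper name is_db_datetime() are hypothetical and not part of the ybt package.

```r
# Hypothetical field timestamp recorded in local clock time (PDT in July).
clock_time <- as.POSIXct("2020-07-15 14:30:00", tz = "America/Los_Angeles")

# "Etc/GMT+8" is a fixed UTC-8 offset (POSIX sign convention), i.e. PST all year.
pst_text <- format(clock_time, tz = "Etc/GMT+8", format = "%Y-%m-%d %H:%M:%S")
pst_text
#> [1] "2020-07-15 13:30:00"

# Hypothetical helper: TRUE when a string looks like YYYY-MM-DD HH:MM:SS.
is_db_datetime <- function(x) {
  grepl("^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}$", x)
}
is_db_datetime(c(pst_text, "7/15/2020 13:30"))
#> [1]  TRUE FALSE
```

Writing the result as text (rather than leaving it as a date-time object) mirrors the advice to store these values as TEXT in Excel.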

Data Workflows

1. Deployments

The deployments table ensures that every detection in the Yolo Bypass detection database has a "home": no detection should fall outside the deployment window (DeploymentStart to DeploymentEnd) of some receiver within the array.

Because of this role, accurately recording the DeploymentStart date in particular is critical at the data stage.
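The idea of a detection having a "home" can be sketched in base R as below. This is an illustration only, not the package's own check: the Receiver and DateTimePST column names and all values are assumptions, and an open deployment (DeploymentEnd is NA) is treated here as still running.

```r
# Hypothetical deployments; only DeploymentStart and DeploymentEnd are names
# taken from this vignette.
deployments <- data.frame(
  Receiver = c(101, 101),
  DeploymentStart = as.POSIXct(c("2020-01-01 10:00:00", "2020-03-01 12:30:00"),
                               tz = "Etc/GMT+8"),
  DeploymentEnd = as.POSIXct(c("2020-03-01 12:00:00", NA), tz = "Etc/GMT+8")
)

# Hypothetical detections from the same receiver.
detections <- data.frame(
  Receiver = c(101, 101, 101),
  DateTimePST = as.POSIXct(c("2020-02-01 08:00:00", "2020-03-01 12:15:00",
                             "2020-04-01 09:00:00"), tz = "Etc/GMT+8")
)

# A detection has a home if it falls inside at least one deployment window
# of its receiver.
detections$HasHome <- vapply(seq_len(nrow(detections)), function(i) {
  dep <- deployments[deployments$Receiver == detections$Receiver[i], ]
  when <- detections$DateTimePST[i]
  any(when >= dep$DeploymentStart &
        (is.na(dep$DeploymentEnd) | when <= dep$DeploymentEnd))
}, logical(1))

detections$HasHome
#> [1]  TRUE FALSE  TRUE
```

The second detection falls in the gap between the two windows (while the receiver was out of the water), so it has no home; a late or missing DeploymentStart would orphan real detections in the same way.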

Things to remember about deployment data:

  1. The Station names MUST match the official Station names used in the cleaning scripts. They are correct on the field template spreadsheet - just make sure they still match when you're digitizing the field datasheet.

  2. Each row of the .csv template spreadsheet must have a DeploymentStart date/time. In practice, this means there may be NAs in many DeploymentEnd fields, especially after the first round of downloads for a detection year.

  3. A deployment starts when the receiver is actually submerged in the field for the first time, or when it is submerged again after having just been downloaded. A deployment ends when the receiver is pulled out of the water before being downloaded, pulled for the season, or swapped out with another receiver; that is, it ends when that receiver cannot record any more detections for that deployment period.
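Before appending, the rules above can be checked with a few lines of base R. The sketch below is not part of the ybt package: check_deployments() is a hypothetical helper, the station names are placeholders for the official list in the cleaning scripts, and the Station column name is an assumption.

```r
# Placeholder station list; the official names live in the cleaning scripts.
official_stations <- c("StationA", "StationB")

# Stand-in for a digitized deployments .csv read with read.csv().
new_deployments <- data.frame(
  Station = c("StationA", "Station B", "StationB"),
  DeploymentStart = c("2020-01-05 09:00:00", "2020-01-05 10:30:00", NA),
  DeploymentEnd = c("2020-02-10 11:00:00", NA, NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: returns one message per problem found.
check_deployments <- function(d, stations) {
  problems <- character(0)

  # Rule 1: Station names must match the official list exactly.
  bad_station <- !(d$Station %in% stations)
  if (any(bad_station)) {
    problems <- c(problems, sprintf("row %d: unknown Station '%s'",
                                    which(bad_station), d$Station[bad_station]))
  }

  # Rule 2: every row needs a DeploymentStart (DeploymentEnd may be NA).
  missing_start <- is.na(d$DeploymentStart) | d$DeploymentStart == ""
  if (any(missing_start)) {
    problems <- c(problems, sprintf("row %d: missing DeploymentStart",
                                    which(missing_start)))
  }

  # Rule 3: where both are present, a deployment must end after it starts.
  fmt <- "%Y-%m-%d %H:%M:%S"
  start <- as.POSIXct(d$DeploymentStart, tz = "Etc/GMT+8", format = fmt)
  end <- as.POSIXct(d$DeploymentEnd, tz = "Etc/GMT+8", format = fmt)
  backwards <- !is.na(start) & !is.na(end) & end <= start
  if (any(backwards)) {
    problems <- c(problems, sprintf("row %d: DeploymentEnd is not after DeploymentStart",
                                    which(backwards)))
  }

  problems
}

problems <- check_deployments(new_deployments, official_stations)
problems
#> [1] "row 2: unknown Station 'Station B'" "row 3: missing DeploymentStart"
```

An empty result means the file passes these three checks; it does not replace double-checking the digitized values against the field datasheet.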

Deployment Data Workflow:

  1. Before going out to complete downloads, check the .csv template for the deployments to make sure you will collect all the necessary info. Upon return, digitize the field datasheet into the template and save the final digitized .csv with the field vrl naming conventions.

  2. In the field, follow the instructions in the Downloads SOP.

  3. After digitizing and double-checking your deployments .csv with your new deployment info, run the write_deployments() function with your .csv to append the new deployments to the database.

2. Tags

There are actually three tagging metadata tables in the .sqlite database: tags, chn, and wst. Each one has a template .csv in the SOPs/DatabaseTemplates/ folder, and these templates can be used to create a .csv that can be added to the database with the ybt_db_append() function.

Things to remember about tag data:

  1. When a tag is recovered and redeployed on a second fish in the same season, make sure to record its second deployment in the database tables (tags and chn) under a made-up numeric TagID (e.g., 7777), with the original (actual) transmitter code in the Comments field. These tags require special handling in the analysis workflow.
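As an illustration of that convention, a record for a redeployed tag might look like the sketch below. All values are hypothetical: 12345 stands in for the real transmitter code, the wording of the comment is only a suggestion, and the real templates contain more columns than the two shown.

```r
# Hypothetical record for the second deployment of a recovered tag.
reused_tag <- data.frame(
  TagID = 7777,  # made-up numeric TagID
  Comments = "Second deployment of recovered tag; original transmitter code 12345",
  stringsAsFactors = FALSE
)

# A consistent phrase in Comments makes these rows easy to find later.
is_reused <- grepl("original transmitter code", reused_tag$Comments, fixed = TRUE)
is_reused
#> [1] TRUE
```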

Tagging Data Workflow

  1. Update/populate the appropriate tagging .csv templates with new tagging info, and save with a new date-specific filename, e.g., "chn_2020-12-31.csv" or "tags_2020-12-31.csv".

  2. Append new tagging data to the database with the ybt::ybt_db_append() function.
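The date-specific filename in step 1 can be generated rather than typed. The sketch below uses base R only; the data frame is a hypothetical stand-in for a populated chn template, and the file is written to a temporary directory rather than the project folder.

```r
# Hypothetical stand-in for a populated chn template; use the real template
# columns in practice.
new_chn <- data.frame(
  TagID = c(1234, 5678),
  Comments = c("", ""),
  stringsAsFactors = FALSE
)

# Build a filename such as "chn_2020-12-31.csv" from today's date.
out_file <- file.path(tempdir(),
                      sprintf("chn_%s.csv", format(Sys.Date(), "%Y-%m-%d")))
write.csv(new_chn, out_file, row.names = FALSE)
basename(out_file)

# Next step: append the file's contents with ybt::ybt_db_append();
# see ?ybt::ybt_db_append for its arguments.
```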

3. Detections

The detections are (of course) the hardest part of this workflow, and so they get their own vignette, which you can bring up with vignette(topic = "Database-detections", package = "ybt").

4. Shed and Recovered Tags

The identification of shed tags is an ongoing process. The Sheds.xlsx and Recoveries.xlsx spreadsheets in the YB_SQL_BaseTables directory keep a record of these tags - we update them as we go. We have created two functions in the ybt package to help identify and handle detections from shed tags: ybt::identify_sheds() and ybt::truncate_sheds().

Shed and Recovered Tags Data Workflow:

  1. Update Recoveries.xlsx throughout the tagging season as recovery info comes in.


fishsciences/ybt documentation built on March 11, 2021, 8:45 a.m.