# knitr options knitr::opts_chunk$set( collapse = TRUE, warning = FALSE, message = FALSE, echo = TRUE, eval = FALSE, comment = "#>" ) # turn off one check options(rmarkdown.html_vignette.check_title = FALSE)
The core of PITcleanr
's functionality relies on detections from PTAGIS, a configuration file that maps those detections onto user-defined "nodes", and a parent-child table showing how those nodes are related to each other on a stream network (i.e. which nodes are connected). PITcleanr
contains several functions to help create the configuration file and the parent-child table, but the user can also create those files by hand, or use files from a previous analysis (e.g. last year). This vignette shows only the essential steps needed to make use of PITcleanr
in preparing data for the Dam Adult Branch Occupancy Model (DABOM).
library(PITcleanr) library(readr) library(dplyr)
If the user has file containing the complete tag histories for their tags of interest, plus the configuration file (config_file
) and the parent-child file (parent_child_file
) ready, the only script necessary to run in R (which will include saving the output as an Excel workbook) is:
# load the necessary packages library(PITcleanr) library(readr) library(dplyr) # read in configuration file, config_file = path to that file my_config = read_csv(config_file) # read in parent_child file, parent_child_file = path to that file parent_child = read_csv(parent_child_file) %>% addParentChildNodes(configuration = my_config) # read in PTAGIS observations (ptagis_file), site configuration, and parent-child, prepped_df = prepWrapper(ptagis_file = ptagis_file, configuration = my_config, parent_child = parent_child, # filter out detections after a particular date max_obs_date = "20180930", # save results to file save_file = T, file_name = "PITcleanr_output.xlsx")
For additional details on site configuration, parent-child tables, and using the prepWrapper()
function, continue reading.
The user may want to start with a configuration file queried from PTAGIS, or generate their own. If they choose to generate their own, it must contain the following columns (at a minimum):
site_code
: this is the site code associated with a detection site in PTAGISconfig_id
: this is the configuration sequence from PTAGIS associated with a specific antenna arrangement for that siteantenna_id
: this is the antenna code or ID from PTAGISnode
: this is the user-defined node that detections are mapped to i.e., how would the user like to group detections on a specific antenna, in a specific configuration, and at a specific site?The user may want to include other columns such as site name, latitude/longitude, river kilometer (RKM), etc. but none of those are strictly necessary.
The user can start with a template by downloading all the metadata associated with each detection site in PTAGIS using the following:
config_template = queryPtagisMeta() %>% select(site_code, # site_name, config_id = configuration_sequence, # antenna_group = antenna_group_name, antenna_id) %>% # set each node to NA to start mutate(node = NA_character_)
For a typical DABOM model, where nodes are generally defined as arrays, the user may wish to start with a configuration file that has nodes assigned to each antenna, based on the array the antenna is part of, and merely edit some rows.
config_template = buildConfig() %>% select(site_code, site_name, config_id, antenna_group, antenna_id, node)
The user may save this template to a .csv file, and then edit it by hand, deleting sites they are not interested in, and filling in the node
information for all other rows they are interested in.
# tell R where to save the csv file config_file = "configuration.csv" # save the template as a csv file write_csv(config_template, file = config_file)
Once the user has finished editing it, they may read it back into R:
# find configuration file config_file = system.file("extdata", "TUM_configuration.csv", package = "PITcleanr", mustWork = TRUE)
# read in configuration file my_configuration = read_csv(config_file, show_col_types = F)
The parent-child table contains information about how sites or nodes are connected on the stream network. Each site/node child has a single parent, defined as the site/node most directly downstream of the child. Each parent may have multiple children, as the stream network may branch upstream of the parent. For additional information, see that section in this vignette.
If the user wishes to create a parent-child table by hand, the file must contain at a minimum the columns parent
and child
. This file can then be read into R.
# find parent-child file parent_child_file = system.file("extdata", "TUM_parent_child.csv", package = "PITcleanr", mustWork = TRUE)
# read in parent_child file parent_child = read_csv(parent_child_file, show_col_types = F)
The path to an example parent-child file included with PITcleanr
can be found by running the following. The user could copy/paste that file to use as a template. Again, only the parent
and child
columns are necessary, although other columns may be useful.
# path to example parent-child table system.file("extdata", "TUM_parent_child.csv", package = "PITcleanr", mustWork = TRUE)
The user may choose to only include site codes in the parent and child columns, and then use their configuration file to build out the parent child table to include nodes. To add nodes, use the addParentChildNodes
function:
parent_child_nodes = addParentChildNodes(parent_child = parent_child, configuration = my_configuration)
Finally, the user must download complete tag histories from PTAGIS for the tags of interest, ensuring that the following attributes are selected to be included in the output:
This next group of attributes are not required, but are highly recommended:
For further information about querying data from PTAGIS (or other sources), please read the vignette about reading in data.
# find file with PTAGIS observations ptagis_file = system.file("extdata", "TUM_chnk_cth_2018.csv", package = "PITcleanr", mustWork = TRUE)
Now the user can assign those detections to a "node", compress that data, assign directionality and determine which observations should be retained for DABOM, all using the prepWrapper()
function. The min_obs_date
argument will filter out observations prior to that date, while the max_obs_date
argument will ensure that all detections after that date are not retained.
# read in PTAGIS observations and compress them, # then prep data for choice of which detections to keep prepped_df = prepWrapper(cth_file = ptagis_file, configuration = my_configuration, parent_child = parent_child_nodes, min_obs_date = "20180301", max_obs_date = "20180930")
The user can even save that output directly to an Excel workbook or .csv file by setting save_file = TRUE
and defining a file path and name in the file_name
argument. Within this output (either in R, or in the saved Excel file), all the rows where the field user_keep_obs
is NA or blank must be filled in with either TRUE
or FALSE
. The field auto_keep_obs
provides a suggestion, but the ultimate choice is up to the user.
prepped_df = prepWrapper(ptagis_file = ptagis_file, configuration = my_config, parent_child = parent_child_nodes, min_obs_date = "20180301", max_obs_date = "20180930", save_file = T, file_name = "PITcleanr_output.xlsx")
The prepWrapper()
function also contains an option to add a column showing a character string of all the nodes a tag was detected at, in chronological order. Some users may find this display useful in determining which detections to retain. This option can be invoked by setting the argument add_tag_detects = TRUE
within the prepWrapper()
function.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.