knitr::opts_chunk$set(echo = TRUE) library(telprep)
This vignette will show you how the telprep package can be used to automatically process data from telemetry flights. The program is designed to read in raw txt files, filter out erroneous signals, determine the living status of fish, and create some pretty pictures. This tutorial highlights the general workflow of the program. For a finer scale description of the program functionalities, see the help files for the following functions: read_flight_data, channels_merge, replace_date, combine_data, get_date_bins, rm_land_detects, get_best_detections, get_locations, flag_dead_fish, hmm_survival, and make_plot.
Begin by sticking the raw txt files into a folder. Txt files from multiple flights can and should be included in the folder. When naming the txt files, the flight grouping, flight date, and the receiver location (belly/wing) should be included (eg: F1_1-26-19_Belly.TXT) as this information will be used later on. Find the directory of the folder (my example files are in "D:/Jordy/flight-data/") and store the coordinate reference systems of the txt files (the proj4string) in the variable crs_in. The function read_flight_data can now be used to read the raw data into R and store it in the variable raw_data:
directory <- "D:/Jordy/flight-data/" crs_in <- "+proj=longlat +datum=NAD83 +no_defs +ellps=GRS80 +towgs84=0,0,0" raw_data <- read_flight_data(directory, crs_in)
Structurally, raw_data is a list of data.frames (there is one for each txt file in the folder). To see how these data.frames are ordered, run
names(raw_data)
The contents of "tburb_f1_1-26-19-WING.TXT" are stored in the 2nd data.frame in the list.
If a channel or date was misprogrammed, it needs to be corrected before the contents of raw_data can be combined into a single data.frame.
Suppose that channel 3 was misprogrammed as channel 10 in "tburb_f1_1-26-19-WING.TXT". To peek at the data run
head(raw_data[[2]])
To replace channel 10 with channel 3, the following line of code is run:
raw_data[[2]] <- channels_merge(raw_data[[2]], 10, 3) head(raw_data[[2]])
Suppose that a date was misprogrammed in the file "tburb_f5_10-17-19-WING.TXT". To peek at the data, run
head(raw_data[[16]])
If the correct flight date was "10/17/19", the following line of code will make the correction:
raw_data[[16]] <- replace_date(raw_data[[16]], new_date ="10/17/19") head(raw_data[[16]])
raw_data[[18]] <- replace_date(raw_data[[18]], new_date ="10/7/19")
The function combine_data combines all of the data stored in raw_data into a single data.frame.
An argument (source_vec) is provided so that the source of each txt file can be specified. The argument can, for instance, be used to specify whether a receiver was located on the belly or the wing of the aircraft. Because 26 txt files were contained in the example folder, a vector of length 26 is used to encode this information:
names(raw_data) source_vec <- c(rep(c("belly","wing"), 10), rep(c("wing","belly"),2), c("belly","wing")) source_vec
Now that the source of the data has been specified, the function combine_data can be used to combine the contents of raw_data:
all_data <- combine_data(raw_data, source_vec) head(all_data)
In order to use the telprep package, a SpatialLinesDataFrame representation of the river system must be imported into R. The following lines of code can be used to import a shapefile (named example.shp) as a SpatialLinesDataFrame object using the readOGR function from the rgdal package:
setwd("D:/Jordy/telprep/telprep/data/sf") sldf <- rgdal::readOGR("example.shp")
The coordinate reference system of the geographic data must match that of the detection data. If the coordinate reference systems do not match, the following line of code will convert the coordinate reference system of the geographic data to that of the detection data.
sldf <- sp::spTransform(sldf, attr(all_data, "crs"))
The riverdist package is used internally for calculations related to river proximity. To use the functionality of this package, sldf (from Step 4) must be converted into a river_network object. The riverdist function line2network can be used to make the conversion (the user is referred to the riverdist package for help).
river_net <- riverdist::line2network(sp=sldf, tolerance = 500)
The function rm_land_detects can be used to discard the detections that occurred away from the river system. To remove the detections that occurred more than 500 m away from a river channel and store the data in a variable called river_detects, the following line of code is run:
river_detects <- rm_land_detects(all_data, river_net, dist_thresh = 500)
The function get_best_locations can be used to determine the best location for each fish in each detection period. In short, the best location is considered to be the location where the highest power detection occurred during each flight period. The date_bins argument of the get_best_locations function specifies the start and end dates of the detection periods. These dates can be found and formatted using the get_date_bins function:
names(raw_data) flight_group <- c(1,1,1,1,2,2,2,2,3,3,3,3,4,4,5,5,5,5,6,6,7,7,7,7,8,8) date_bins <- get_date_bins(raw_data, flight_group) date_bins
After the detection periods have been specified, the function get_best_locations can be used to determine the locations the fish:
best_locations <- get_best_locations(river_detects, date_bins = date_bins, bin_by=NA, n_thresh = 5, dist_max = 5000, remove_flagged = F)
head(best_locations$all_detects) head(best_locations$best_detects)
best_locations\$all_detects adds some useful to all_data: BestSignal is the signal with the highest power in a detection period, Dist is the Euclidean distance (in km) between the detection location and the associated highest power detection, FlightNum is the detection period, and Records is number of times that a fish was detected in a detection period. all_detects\$best_detects contains the highest power detections only. These detections are flagged if there are fewer than n_thresh detections within a distance of dist_max km from the best detection during the detection period. The detection will also be flagged if a positive linear relationship exists between Power and Dist for all detections within dist_max km from the best signal in the detection period (i.e. the signal strength increases as the best detection is approached).
Two functions (flag_dead_fish and hmm_survival) are provided to help determine the living status of fish.
flag_dead_fish uses locational information to determine which fish have expired. If a fish moves less than dist_thresh km for all consecutive detection periods following a detection, the fish will be flagged as dead. The following lines of code will flag for dead fish using this approach:
best_detects <- best_locations$best_detects head(best_detects) best_detects <- best_detects[best_detects$flag==F,] flagged_fish <- flag_dead_fish(best_detects, dist_thresh = 0.5) head(flagged_fish)
hmm_survival operates similarly to flag_dead_fish; however, this function uses a more sophisticated method to determine the living status of the fish. Briefly, this function utilizes locational and mortality sensor related information to determine the most likely path of survival states (called the viterbi path) for each fish using a Hidden Markov Model (HMM). A benefit to using a HMM based approach is that detection probabilities and survival rates are estimated using a statistical approach. These estimates are based on two key assumptions: 1) dead fish remain dead, and 2) dead fish do not move. A detailed description of the HMM can be found by running vignette("hmm") in the console.
The following lines of code will fit the HMM to determine the living status of the fish:
library(msm) hmm_out <- hmm_survival(best_detects) hmm_out$results viterbi <- hmm_out$viterbi head(viterbi)
In the column viterbi\$Viterbi, a value of 1 cooresponds to the event that the fish is alive whereas 2 cooresponds to the event that the fish has expired.
A plotting function (make_plot) is included in the telprep package. This function is designed to be used throughout the analysis. Example of how the function can be used are provided here:
# plotting all detections par(mfrow=c(1,1)) make_plot(sldf, all_data) # real detections only make_plot(sldf, best_detects) # darken background make_plot(sldf, best_detects, darken=2.5) # change style of background make_plot(sldf, best_detects, type="esri-topo") # give each fish a unique color preserved through flights par(mfrow=c(3,1)) make_plot(sldf, best_detects, col_by_fish=T, flight=1, darken=2.5) make_plot(sldf, best_detects, col_by_fish=T, flight=2, darken=2.5) make_plot(sldf, best_detects, col_by_fish=T, flight=3, darken=2.5) # to plot the locations for a single fish par(mfrow=c(1,1)) make_plot(sldf, best_detects, channel=10, tag_id=11, darken=2.5) # to zoom in to a specified extent extent <- c(x_min=466060, x_max=1174579, y_min=6835662, y_max=7499016) make_plot(sldf, best_detects, extent, darken=2.5) # plotting live and dead fish by flight period -- green fish have expired par(mfrow=c(3,1)) make_plot(sldf, viterbi, type="bing", darken=2.5, viterbi=T, flight=1) make_plot(sldf, viterbi, type="bing", darken=2.5, viterbi=T, flight=3) make_plot(sldf, viterbi, type="bing", darken=2.5, viterbi=T, flight=5)
The function make_gif can be used to make dynamic plots (i.e. gifs). The user is referred to the function specific help files for more information; however, a few examples of the types of plots that can be created with these functions is provided below:
Plotting the most likely path of survival states for each fish -- live fish are red while green fish are dead:
knitr::include_graphics("S:/Jordy/telprep/telprep/gifs/viterbi/gif.gif")
Giving each fish a unique color that is preserved through flights:
knitr::include_graphics("S:/Jordy/telprep/telprep/gifs/byfish/gif.gif")
Finally the function make_survival_plot can be used to plot the number of expired fish (as determined by the viterbi path) by flight period:
make_survival_plot(viterbi)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.