knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.path = "man/figures/README-" #out.width = "100%" ) library(badger) library(dplyr) library(glue) library(kableExtra) library(usethis) repo <- "centre-for-marine-applied-research/strings"
knitr::include_graphics("https://github.com/Centre-for-Marine-Applied-Research/strings/blob/master/man/figures/README-strings_hex.png")
r badge_devel(repo, "blue")
r badge_codefactor(repo)
r badge_github_actions(repo)
This package will soon be deprecated in favor of the sensorstrings
package located here.
Compile, format, and visualize Water Quality (temperature, dissolved oxygen, salinity) data measured by different sensors.
You can install the development version of strings
from GitHub with:
# install.packages("devtools") devtools::install_github("Centre-for-Marine-Applied-Research/strings")
The Centre for Marine Applied Research (CMAR) coordinates an extensive Coastal Monitoring Program to measure Essential Ocean Variables from around the coast of Nova Scotia, Canada. There are three main branches of the program: Water Quality, Currents, and Waves. Processed data for each branch can be viewed and downloaded from several sources, as outlined in the CMAR Report & Data Access Cheat Sheet (download for clickable links).
The strings
package is used to compile, format, and visualize data from the Water Quality branch of the Coastal Monitoring Program.
Water Quality data (temperature, dissolved oxygen, and salinity) is collected using “sensor strings”. Each sensor string is attached to the seafloor by an anchor and suspended by a sub-surface buoy, with autonomous sensors attached at various depths (Figure 1). A string typically includes three sensor models: Hobo, aquaMeasure, and VR2AR (Table 1). Strings are deployed at a station for several months and data are measured every 1 minute to 1 hour, depending on the sensor.
knitr::include_graphics("https://github.com/Centre-for-Marine-Applied-Research/strings/blob/master/man/figures/README-fig1.png")
Figure 1: Typical sensor string configuration (not to scale).
After retrieval, data from each sensor is exported to a separate csv file using manufacturer-specific software. Each type of sensor generates a data file with unique columns and header fields, which poses a significant challenge for compiling all data from a deployment into a single format for analysis.
The strings
package was originally built to address this challenge, and now offers functions to compile, format, convert units, and visualize sensor string data.
strings
was developed specifically to streamline CMAR’s workflow, but is flexible enough that other users can apply it to process data from the accepted sensors (Table 1). Refer to the vignettes for more detail.
tibble( SENSORS = c( "HOBO Pro V2", "HOBO DO", "aquaMeasure DOT", "aquaMeasure SAL", "VR2AR" ), urls = c( "https://www.onsetcomp.com/datasheet/U22-001", "https://www.onsetcomp.com/datasheet/U26-001", "https://www.innovasea.com/wp-content/uploads/2021/07/Innovasea-Aquaculture-Intelligence-Spec-Sheet-062221.pdf", "https://www.innovasea.com/wp-content/uploads/2021/07/Innovasea-Aquaculture-Intelligence-Spec-Sheet-062221.pdf", "https://www.innovasea.com/wp-content/uploads/2021/06/Innovasea-Fish-Tracking-vr2ar-data-sheet-0621.pdf" ), `Variable(s) Measured` = c( "Temperature", "Temperature, Dissolved Oxygen", "Temperature, Dissolved Oxygen", "Temperature, Salinity", "Temperature" ) ) %>% mutate(`Sensor (link to spec sheet)` = glue::glue("[{SENSORS}]({urls})")) %>% select(`Sensor (link to spec sheet)`, `Variable(s) Measured`) %>% kable(align = "ll", format = "pipe")
For more information on Water Quality data collection and processing, visit the CMAR Water Quality Data Collection & Processing Cheat Sheet (download for clickable links).
library(strings) library(readr)
Consider a string deployed from May 31, 2019 to October 19, 2019 with three sensors:
tibble( "Sensor" = c("HOBO Pro V2", "aquaMeasure DOT", "VR2AR"), "Serial #" = c("10755220", "670364", "547109"), "Depth" = c("2", "5", "15") ) %>% kable(align = "lcc")
The data from each sensor is exported to a separate csv file, each with manufacturer-specific columns.
Import raw data files:
path <- system.file("extdata", package = "strings") hobo_raw <- read_csv(paste0(path, "/HOBO/10755220.csv")) aquaMeasure_raw <- read_csv(paste0(path, "/aquaMeasure/aquaMeasure-670364_2019-10-19_UTC.csv")) vemco_raw <- read_csv(paste0(path, "/Vemco/Vemco_Borgles_Island_2019_05_30.csv"))
Examine the first rows of each raw data file:
Raw Hobo data
head(hobo_raw)
Raw aquaMeasure data
head(aquaMeasure_raw)
Raw Vemco data
head(vemco_raw)
Data from each sensor is exported in a slightly different layout, making it difficult to work with and analyze all of the data from a single deployment.
strings
Compile data from the 3 sensors using strings::compile_all_data()
:
deployment <- data.frame(START = "2019-05-30", END = "2019-10-19") serial.table.HOBO <- data.frame(SENSOR = "HOBO-10755220", DEPTH = "2m") serial.table.aM <- data.frame(SENSOR = "aquaMeasure-670364", DEPTH = "5m") depth.vemco <- "15m" #Compile data from a single deployment ALL_data <- compile_all_data( path = path, deployment.range = deployment, area.name = area, # hobo serial.table.HOBO = serial.table.HOBO, # aquaMeasure serial.table.aM = serial.table.aM, # vemco depth.vemco = depth.vemco ) head(tibble(ALL_data), n = 10)
The data is compiled in a "wide" format, with metadata in the first four rows indicating the deployment period, the sensor serial number, the variable and depth of the sensor, and the timezone of the timestamps.
The remaining columns alternate between timestamp (in the format "Y-m-d H:M:S") and variable value (rounded to three decimal places). Sensors can be initialized at different times and record on different intervals, so values in a single row do not necessarily correspond to the same timestamp.
This format is an artifact of the former compiling process and will be removed for future versions of the package. The format is convenient for human readers, who can quickly scan the metadata to determine the number of sensors deployed, the depths of deployment, etc. However, this format is less convenient for analysis. The data frame should be converted to a "tidy" format using strings::convert_to_tidydata()
prior to analysis.
ALL_tidy <- convert_to_tidydata(ALL_data) head(tibble(ALL_tidy))
ALL_tidy
as 6 columns:
DEPLOYMENT_RANGE
: The deployment and retrieval dates (character)SENSOR
: The sensor that recorded the measurement (character)TIMESTAMP
: The timestamp of the measurement (POSIXct)VARIABLE
: The parameter measured (Temperature, Dissolved Oxygen, or Salinity) (character)DEPTH
: The depth of the sensor (ordered factor)VALUE:
The value of the measurement (numeric)ALL_tidy
can be plotted with plot_variables_at_depth()
:
plot_variables_at_depth(ALL_tidy)
If the figure needs to be modified, it may be more convenient to plot using ggplot_variables_at_depth
:
library(ggplot2) library(lubridate) ggplot_variables_at_depth(ALL_tidy) + geom_vline(xintercept = as_datetime("2019-07-01"), colour = "red")
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.