```r
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
library(reSET)
```
The structure and naming conventions of the functions within `reSET` were inspired by the many bright minds contributing to the tidyverse suite of packages. To leverage auto-complete, all the primary functions begin with `set_` and are split into families of functions based on the general task of the function. Currently there are two primary families:
- `set_get_*`, which retrieves data
- `set_check_*`, which interrogates the data to help in QA or other steps. Currently the statistical analysis steps are being developed within this family but may spin off into a separate set of functions.

`set_get_` functions take either a connection (`dbConn` or path to the DB) or an existing SET dataframe as returned from a previous `set_get_` call, and return something to the user. Most often the return is a dataset in the form of a tidy dataframe. In the case of `set_get_stations`, an `sf` object (spatial data in the format of a simple feature object) is returned, which is convenient for quick mapping using packages like `mapview` or `tmap`.
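For instance, once you have the `sf` object from `set_get_stations()` (retrieving it is covered in the sections below), an interactive map is a one-liner with `mapview`. This is a minimal sketch, assuming the `mapview` package is installed and that the returned object is named `stations`.

```r
# Assumes the mapview package is installed and `stations` is the sf object
# returned by set_get_stations() (retrieving it is shown below)
library(mapview)
mapview(stations)
```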
The workflow begins with establishing a connection object to a database, in this case the NPS Access DB that is referenced in the 2015 protocol developed by \@jamesc.lynch2015b. By passing the path to the back-end of the Access DB into the function `set_get_DB()`, you can create an object by assigning the output of the function to a variable, in the example below called `DBconn`. This object holds the information necessary for R to make the connection to the database. The benefit of this approach is that you can do data entry and management (QA/QC) in Access, which provides a better UI experience, and then pull the data directly into R for further work, needing to run only a few lines of code to update the data.
```r
SET.DB.path <- "D:/Data/SET_data/SET_DB_BE_ver_2.94.mdb" # backend Access database from NPS

DBconn <- set_get_DB(SET.DB.path)
DBconn
#> <OdbcConnection> admin@ACCESS
#> Database: D:/Data/SET_data/SET_DB_BE_ver_2.94.mdb
#> ACCESS Version: 04.00.0000
```
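Since the object printed above is a standard odbc/DBI connection, it should also be possible to close it with DBI when you are finished pulling data; this is an assumption about the connection object rather than a documented `reSET` feature.

```r
# Assumes DBconn is a DBI/odbc connection (as the print method above suggests)
library(DBI)
dbDisconnect(DBconn)
```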
Once you've established the connection using `set_get_DB()`, you can use that object in other `set_get_` calls to:

- Create an `sf` object for mapping using `set_get_stations()`
- Extract SET data in a tidy format using `set_get_sets()`
- Extract Surface Accretion data into a tidy format using `set_get_accretions()`
- Return surveyed receiver elevations as an `sf` object using `set_get_receiver_elevations()`
- Return height-adjusted elevations of the marsh surface using `set_get_absolute_heights()`, based on the NAVD88 surveyed receiver elevation and the SET equipment used, including pin heights (see `?set_get_pinlengths` and the `?set_pinhts` utility function)

For example, `set_get_receiver_elevations()` returns an `sf` object of the surveyed receiver elevations:

```r
> marshHts <- set_get_receiver_elevations(dbconn = DBconn)
Joining, by = "Location_ID"
> marshHts
Simple feature collection with 39 features and 6 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: -73.71823 ymin: 40.59572 xmax: -72.14885 ymax: 41.04341
Projected CRS: NAD83(2011) / New York Long Island
# A tibble: 39 x 7
  Survey_Date         Plot_Name Pipe_Z Vertical_Datum X_Coord Y_Coord             geometry
* <dttm>              <chr>      <dbl> <chr>            <dbl>   <dbl>          <POINT [m]>
1 2018-11-01 00:00:00 BC-1       0.394 NAVD88           -72.3    41.0 (-72.29104 41.04329)
```
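The tabular extractors follow the same pattern. Below is a minimal sketch, assuming the same `DBconn` object and that `set_get_sets()` and `set_get_accretions()` accept the connection through a `dbconn` argument like `set_get_receiver_elevations()`; the exact columns of the returned tibbles are not shown here.

```r
# Minimal sketch; the dbconn argument is assumed by analogy with set_get_receiver_elevations()
set_data       <- set_get_sets(dbconn = DBconn)       # tidy SET pin readings
accretion_data <- set_get_accretions(dbconn = DBconn) # tidy surface accretion data

# Both are tidy dataframes, so the usual tidyverse verbs apply, e.g.:
dplyr::glimpse(set_data)
```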
`set_check_` functions help with some basic QA tasks. The current focus is on flagging and finding pin readings that may have issues and should be investigated. The intent of these functions is to leave the original raw data alone in the database and instead flag it, either for filtering within a workflow in R or for use as a sort of checklist when performing QA checks on the data within the database. It is important to know that none of the functions within `reSET` alter the original data at its source; that is the role of project managers during regular QA/QC workflows.
`set_check_measures()` looks for errant data points based on incremental changes and notes made within the database.
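The flag-then-filter workflow described above might look something like the sketch below. It is only a sketch: the call signature is assumed to mirror the other functions (`dbconn = DBconn`), and the flag column name (`flagged`) is hypothetical, since the actual structure returned by `set_check_measures()` is not shown in this vignette.

```r
library(dplyr)

# Hypothetical sketch: flag potentially errant readings without touching the database
measure_checks <- set_check_measures(dbconn = DBconn)

# `flagged` is a made-up column name standing in for whatever flag the function returns;
# the raw data stay untouched in Access, only the R copy is filtered
suspect_readings <- measure_checks %>%
  filter(flagged)
```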
`set_check_pins()` returns a very simple tibble with Notes made within the database for a given pin. A unique pin ID is returned for use in additional filtering or queries, both within Access and R. [Under development]
`set_check_doublereads()` provides a mechanism to flag SET data that has had multiple SET readers through its history. Teamed with `set_get_doublereads()`, a user can find and handle those SET reader transitions as deemed appropriate. One method in development is to group the data by SET reader, calculate the incremental change over time, and then, within each unique SET reader, perform a rolling sum or aggregation of the change observed for each reading, which in effect removes the 'bias' that may be brought in with a change in SET reader [more on this approach to come]. The downside to this approach is that any transition between SET readers that was not 'double read' (a reading by each SET reader on the same day) creates a gap between the last reading made by the previous SET reader and the second reading made by the following SET reader, and those gaps can be large. Further issues may arise if these changes are made more frequently (e.g., annually).
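A rough sketch of that grouping idea, using dplyr, is shown below. Everything about it is hypothetical except the general approach described above: `set_data` refers to the tidy SET object from the earlier sketch, the column names (`station`, `pin`, `date`, `reader`, `pin_height`) are stand-ins for whatever `set_get_sets()` actually returns, and the real implementation in `reSET` may differ.

```r
library(dplyr)

# Hypothetical column names; the idea is: within each reader's run of readings,
# difference consecutive pin heights and accumulate those increments, so the
# absolute offset ('bias') each reader brings is dropped.
reader_adjusted <- set_data %>%
  arrange(station, pin, date) %>%
  group_by(station, pin, reader) %>%
  mutate(
    increment  = pin_height - lag(pin_height),   # change since this reader's previous reading
    cum_change = cumsum(coalesce(increment, 0))  # running total of change within this reader
  ) %>%
  ungroup()
```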