# packages used in this vignette; SFNRC comes from GitHub, the rest from CRAN
cran.pkgs <- c("knitr", "rmarkdown", "ggplot2", "scales")
pkgs.used <- c(cran.pkgs, "SFNRC")
installed <- rownames(installed.packages())
pkgs.to.install <- cran.pkgs[!cran.pkgs %in% installed]
if (length(pkgs.to.install) > 0) {
  install.packages(pkgs.to.install)
}
if (!"SFNRC" %in% installed) {
  remotes::install_github("troyhill/SFNRC")
}
# load all packages and set plotting and chunk defaults
lapply(X = pkgs.used, FUN = require, character.only = TRUE)
theme_set(theme_bw())
knitr::opts_chunk$set(echo = TRUE, comment = NA)
\vspace{12pt} \vspace{12pt}
The SFNRC R package includes a simple, streamlined means of interacting with the DBHYDRO hydrology and water quality databases. This vignette demonstrates how to use the DBHYDRO-related functions.
The SFNRC R package is hosted on GitHub and can be installed from the R console using the remotes package, as shown below. The package only needs to be installed once.
\vspace{12pt}
# install ggplot2 if needed
install.packages("ggplot2")

# use the remotes package to install from GitHub
remotes::install_github("troyhill/SFNRC@master")
\vspace{12pt}
Each script that uses functions from the SFNRC R package should load the package with the command:
library(SFNRC)
\vspace{12pt}
\vspace{3mm}\hrule \vspace{12pt}
\vspace{12pt} \vspace{12pt}
# set some constants
todaysDate <- substr(as.character(Sys.time()), 1, 10)
pointSize <- 2 # for ggplot graphics
pd <- pd2 <- position_dodge(1.2)
pd3 <- position_dodge(0.8)
grayColor <- 0.55
fig2Col <- "gray55"
\vspace{12pt} \vspace{12pt}
\vspace{12pt}
The starting point for DBHYDRO hydrologic data is identifying the relevant DBkey. getDBkey() displays the DBkeys associated with a structure. The input is not case sensitive, and partial searches return broader results. By default, getDBkey() returns only active DBkeys, meaning those with new data added during the past 90 days. Setting the argument activeOnly = FALSE returns all DBkeys. Site names sometimes differ between DBHYDRO and other databases (e.g., the 'P33' station in DataForEver is 'NP-P33' in DBHYDRO). In other cases, a hyphen or space may be missing or added. If you think a site exists but getDBkey() returns nothing, checking the site name (or finding the relevant DBkey) on the DBHYDRO website may save you some time.
\vspace{12pt}
getDBkey("s333") # returns actively used dbkeys by default
getDBkey("s333", activeOnly = FALSE) # returns *all* dbkeys tied to S-333
getDBkey("s33") # returns all stations starting with "s33"
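The effect of a shorter search string can be illustrated offline. The sketch below uses a made-up vector of station names and base R's grepl(). It is only an analogy for how a partial, case-insensitive search widens the result set and why a renamed site can be missed; it does not describe how getDBkey() is implemented.

```r
# Hypothetical station names, for illustration only (not a DBHYDRO query)
stationNames <- c("S333", "S333N", "S334", "S12D", "NP-P33")

# A more specific search string matches fewer names
narrow <- stationNames[grepl("^s333", stationNames, ignore.case = TRUE)]

# A shorter search string returns broader results
broad <- stationNames[grepl("^s33", stationNames, ignore.case = TRUE)]

# A name stored with a prefix is missed by a "starts with" search
foundP33 <- any(grepl("^p33", stationNames, ignore.case = TRUE))
```

Here the "P33" search finds nothing because the toy list stores the site as "NP-P33", which mirrors the naming mismatch described above.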
\vspace{12pt}
Identifying DBkeys can be labor intensive, but in a scripted workflow it only needs to be done once. Once a DBkey is identified, data can be downloaded directly from R with the getHydro() function:
### we'll pull data for some time period
beginDate <- "20200401"
finalDate <- "20201001"
flowDat <- getHydro(dbkey = "15042", startDate = beginDate, endDate = finalDate)
head(flowDat)
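The date strings do not have to be typed by hand. As a minimal sketch, assuming getHydro() expects dates as "YYYYMMDD" character strings (as in the example above), the same six-month window can be built from Date objects. The names endDay and startDay are illustrative.

```r
# Assumption: dates are passed as "YYYYMMDD" strings, as in the example above
endDay   <- as.Date("2020-10-01")
startDay <- seq(endDay, by = "-6 months", length.out = 2)[2]

beginDate <- format(startDay, "%Y%m%d")
finalDate <- format(endDay, "%Y%m%d")
```

Replacing the fixed end date with Sys.Date() would give a rolling window that updates each time the script runs.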
\vspace{12pt}
This operation can be easily performed on larger numbers of stations with use of lapply, do.call, and rbind. For example, if one wanted to pull data for several stations at the northern boundary of Everglades National Park:
\vspace{12pt}
target.keys <- c("03620", "03626", "03632", "03638", "91487", "64136")
flow.data <- do.call(rbind, lapply(target.keys, getHydro,
                                   startDate = beginDate, endDate = finalDate))
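The lapply/do.call/rbind pattern can be tried without a network connection. In the sketch below, fakeHydro() is a hypothetical stand-in for getHydro() that returns a small made-up data frame. Its column names are assumptions chosen for illustration, not the actual getHydro() output.

```r
# Hypothetical stand-in for getHydro(); values and columns are made up
fakeHydro <- function(dbkey, startDate, endDate) {
  data.frame(
    dbkey = dbkey,
    date  = as.Date(c(startDate, endDate), format = "%Y%m%d"),
    value = c(100, 250),
    stringsAsFactors = FALSE
  )
}

keys <- c("03620", "03626", "03632")

# lapply() calls the function once per key, passing the shared date arguments
perKey <- lapply(keys, fakeHydro, startDate = "20200401", endDate = "20201001")

# do.call(rbind, ...) stacks the list of data frames into a single data frame
stacked <- do.call(rbind, perKey)
```

rbind() requires every data frame in the list to have the same columns, which holds when the same function produces each one.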
\vspace{12pt}
The above data are in a ggplot-friendly format and can be easily plotted.
\vspace{12pt}
### these plotting commands use the ggplot2 package. To install it:
### install.packages("ggplot2")
library(ggplot2)

ggplot(flow.data, aes(y = value, x = date)) +
  geom_area(aes(fill = stn), color = "gray50", alpha = 0.3, position = "stack") +
  theme_classic() +
  ylab("Daily flow into ENP northern boundary (cfs)") +
  xlab("") +
  theme(legend.title = element_blank()) +
  scale_fill_brewer(palette = "RdYlGn") +
  ylim(0, 3700)
### no-flow data can appear as zeroes or NAs.
### changing NAs to zero can improve the behavior of stacked charts.
flow.data2 <- flow.data
flow.data2$value[is.na(flow.data2$value)] <- 0

ggplot(flow.data2, aes(y = value, x = date)) +
  geom_area(aes(fill = stn), color = "gray50", alpha = 0.3, position = "stack") +
  theme_classic() +
  ylab("Daily flow into ENP northern boundary (cfs)") +
  xlab("") +
  theme(legend.title = element_blank()) +
  scale_fill_brewer(palette = "RdYlGn") +
  ylim(0, 3700)
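One way to see why the NA replacement matters is to compute daily totals across stations, which is roughly what a stacked chart displays. The toy data below are made up, and the station labels "A" and "B" are hypothetical.

```r
# Toy flow data with a gap (NA) at one station; values are made up
toyFlow <- data.frame(
  stn   = rep(c("A", "B"), each = 3),
  date  = rep(as.Date("2020-04-01") + 0:2, times = 2),
  value = c(10, NA, 30, 5, 5, 5)
)

# Daily totals across stations: a single NA makes that day's total NA
totalsWithNA <- tapply(toyFlow$value, toyFlow$date, sum)

# Replace NA with zero, as in the plotting example above
toyFlow$value[is.na(toyFlow$value)] <- 0
totalsFilled <- tapply(toyFlow$value, toyFlow$date, sum)
```

This replacement treats a missing value as no flow, so it is only appropriate where an NA really does represent no flow rather than a missing measurement.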
\vspace{12pt}
\vspace{12pt}
Using SFNRC to download DBHYDRO's water quality data is straightforward. The water quality database is accessed using the function getWQ(). The only critical input is the name of the station according to DBHYDRO (including DBHYDRO's hyphens, capitalization, etc.). Note that this function can be slow because it downloads the full period of record.
The parameters argument accepts a regex-style input specifying the parameters desired.
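As an illustration of regex-style input, the sketch below applies a pattern to a made-up vector of parameter names using base R's grepl(). The names are hypothetical, and how getWQ() applies the pattern internally is not shown here.

```r
# Hypothetical parameter names, for illustration only
paramNames <- c("PHOSPHATE, TOTAL AS P", "PHOSPHATE, ORTHO AS P",
                "TOTAL NITROGEN", "TURBIDITY", "CHLORIDE")

# "|" means "or", so the pattern matches names containing any of the fragments
wqPattern <- "PHOSPH|NITROGEN|TURBIDITY"
matched <- paramNames[grepl(wqPattern, paramNames)]
```

Because each fragment is a partial match, "PHOSPH" selects both phosphate entries in this toy list.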
The outputType argument controls the shape of the returned data, so it can be matched to your intended analysis. outputType = "full" returns a long dataset with all samples (so a single day may have multiple samples). Setting outputType to "long" or "wide" averages duplicate samples and returns either a long dataset (one column with parameter names and one column with parameter values) or a wide dataset (one column of values for each parameter). The "long" form reports more information (units, MDL, PQL, RDL) than the "wide" form, which reports only the values for each parameter.
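The three shapes can be mimicked on a toy dataset with base R. This is a sketch of the concepts only: the column names (param, value) and the values are assumptions, and getWQ() may label or order its output differently.

```r
# Toy "full" dataset: one row per sample, so a day can have duplicates
full <- data.frame(
  stn   = "S333",
  date  = c("2020-04-01", "2020-04-01", "2020-04-01", "2020-04-02"),
  param = c("TURBIDITY", "TURBIDITY", "PHOSPHATE", "TURBIDITY"),
  value = c(2, 4, 0.01, 5),
  stringsAsFactors = FALSE
)

# "long": average duplicate samples within each station/date/parameter
long <- aggregate(value ~ stn + date + param, data = full, FUN = mean)

# "wide": one column of values per parameter
wide <- reshape(long, idvar = c("stn", "date"), timevar = "param",
                direction = "wide")
```

In the wide table, a parameter that was not sampled on a given day appears as NA.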
\vspace{12pt}
longDat <- getWQ(stn = "s333", outputType = "long")
wideDat <- getWQ(stn = "s333", outputType = "wide")
tail(longDat)
tail(wideDat)

# identify desired stations and water quality parameters
stations <- c("S333")
wqParams <- c("PHOSPH|NITROGEN|TURBIDITY") # partial matches are acceptable
wqDat <- getWQ(stn = stations, parameters = wqParams)
\vspace{12pt}
The water quality parameters available at a station, and the number of samples for each, can be shown with the getDBHYDROparams() function. This function is a bit slow because it has to download all available data, but it can be useful if you're not sure what's available at a station.
stnDat <- getDBHYDROparams(stn = "s333")
head(stnDat)
\vspace{12pt}
These functions provide a foundation for a variety of workflows. For example, water quality and hydrology data can be merged and analyzed together.
\vspace{12pt}
stations <- c("S333", "S12D")

### download water quality data for multiple stations. rbind.fill joins the
### station-level data while accommodating missing parameters
wqDat <- do.call(plyr::rbind.fill, lapply(stations, getWQ, parameters = wqParams))

### merge water quality and flow data on the shared station and date columns
allDat <- plyr::join_all(list(wqDat, flow.data),
                         by = c("stn", "date", "year", "mo", "day"))

ggplot(allDat, aes(y = PHOSPHATETOTALASP, x = value, col = stn)) +
  geom_point() +
  facet_wrap(~stn) +
  ylab("Phosphate (total; mg P/L)") +
  xlab("Daily flow (cfs)")
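The merge step can be sketched with base R on toy data. The example below uses merge() with all.x = TRUE, which is a left join (also the default join type in plyr::join_all()). The data frames and the TP column are made up for illustration.

```r
# Toy water quality and flow tables sharing the keys "stn" and "date"
wqToy <- data.frame(
  stn  = c("S333", "S333", "S12D"),
  date = c("2020-04-01", "2020-04-02", "2020-04-01"),
  TP   = c(0.010, 0.012, 0.008),
  stringsAsFactors = FALSE
)
flowToy <- data.frame(
  stn   = c("S333", "S12D"),
  date  = c("2020-04-01", "2020-04-01"),
  value = c(500, 120),
  stringsAsFactors = FALSE
)

# Left join: keep every water quality row and attach flow where the keys match
merged <- merge(wqToy, flowToy, by = c("stn", "date"), all.x = TRUE)
```

Water quality samples without a matching flow record are kept with an NA flow value, so such rows may need to be filtered before plotting or modeling.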