prepared for GLEON 20, GSA workshop 2018-12-03
Chris McBride & Kohji Muraoka, UoW, November 2018. Correspondence to cmcbride@waikato.ac.nz
Install remotes package to remotely install G20 version of rB3
Web installation via git ('remotes' OR 'devtools')
### via package "remotes"
# install.packages("remotes")
remotes::install_github("kohjim/rB3", ref = "G20")
### via package "devtools"
# install.packages("devtools")
devtools::install_github("kohjim/rB3", ref = "G20")
library(rB3)
Local installation
### download directly from github
## https://github.com/kohjim/rB3/archive/G20.zip
Open the provided folder 'Demo' and the file Demo.Rproj
## check your working directory, and adjust if needs be
getwd()
# setwd("C:/")
## install rB3
#install.packages("devtools")
library(devtools)
install("../rB3-G20")
# load the rB3 library
library(rB3)
#install.packages("remotes")
##### If installing packages failed, manually install dependencies below:
# install.packages("tidyr")
# install.packages("ggplot2", dependencies = TRUE)
# install.packages("lubridate")
# install.packages("shiny")
# install.packages("circular")
Why set the time zone to UTC?
The POSIXct date format used by rB3 can play havoc with your data editing if time zones are not handled well, so it can help to set your system environment to UTC, which avoids issues with tz offsets.
# set timezone - only change for this R session
Sys.setenv(TZ = "UTC")
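To see why the time zone matters, the short base R sketch below parses the same timestamp text under two time zones. It does not use rB3; the timestamp and the choice of "Pacific/Auckland" are illustrative assumptions.

```r
# Same clock text, interpreted in two different time zones
stamp <- "2017-06-15 12:00:00"
t_utc <- as.POSIXct(stamp, tz = "UTC")
t_nz  <- as.POSIXct(stamp, tz = "Pacific/Auckland")

# The two values are different instants, separated by the zone offset
offset_hours <- as.numeric(difftime(t_utc, t_nz, units = "hours"))
print(offset_hours)
```

If data are read in one zone and edited in another, a date window chosen on a plot can therefore be shifted by this offset, which is the problem that a single UTC session is meant to avoid.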
Import a starting .csv file, which will be converted into a list of data frames:
1. the raw data block from your csv file; 'srcDF'
2. a copy of the raw data block, to be quality controlled; 'qcDF'
3. a matrix with similar dimensions to 1 & 2, to store qc action logID values; 'logDF'
4. a list of logID values and their meanings; 'logKey'
5. sensor/time-series metadata and control values used for filtering, plotting etc.; 'ctrls'
6. site/station metadata; 'metaD'
For the documentation, ?csv2rB3
Initial csv header rows can contain time-series/sensor metadata to be used in later functions (loaded as the 'ctrls' DF within the list). The row immediately before the start of the data holds the data frame headers.
Date format must be yyyy-mm-dd hh:mm:ss, with header "DateTime"
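Before importing, it can be worth checking that the timestamps really follow this format. The sketch below is plain base R rather than part of rB3, and the small data frame is a hypothetical stand-in for the data block of a csv file.

```r
# Hypothetical rows standing in for the data block of a csv file
dat <- data.frame(
  DateTime = c("2017-06-15 23:00:00", "2017-06-15 23:15:00", "15/06/2017 23:30"),
  TmpWtr   = c(12.1, 12.0, 11.9),
  stringsAsFactors = FALSE
)

# Strings that do not match yyyy-mm-dd hh:mm:ss parse to NA
parsed   <- as.POSIXct(dat$DateTime, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
bad_rows <- which(is.na(parsed))

# The timestamp column must also carry the header "DateTime"
has_header <- "DateTime" %in% names(dat)
print(bad_rows)
```

Any row numbers reported in bad_rows point at timestamps that would need reformatting before the file is imported.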
Import a raw dataset
## make sure the demo csv is in the same working dir
# setwd("C:/ ...")
rB3demo <- csv2rB3("rB3demo_201507-201806_RAW_R.csv","Lake_Rotoehu",-38.5, 176.5,"NZ")
call the components of the rB3 object on the fly (not needed for rB3 operations).
names(rB3demo)
You can access a data frame object by e.g.,
testDF <- rB3demo[["qcDF"]]
This module lets you investigate your data interactively using shiny package
For the documentation, ?shinyrB3
shinyrB3(rB3demo)
Note that Shiny occupies RStudio while it is running, so you need to close the Shiny window before you can run any more commands.
Further updates to this GUI are planned.
Variables can be called using key phrases/characters (i.e., all vars containing key word will be selected)
create vector of key phrases, for later functions, e.g.;
wqVars <- c('Fl','Tur','pH','DO')
?rB3getVars
rB3getVars(rB3demo, wqVars)
retrieve varNames; 'All' (default) or select by keyphrase
?rB3getVars
rB3getVars(rB3demo, 'All')
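The idea of key-phrase selection can be illustrated in base R. This is only a sketch of substring matching, not the internals of rB3getVars; the column names are hypothetical, and case-sensitive literal matching is an assumption.

```r
# Hypothetical column names, in the style used in this demo
cols <- c("DateTime", "TmpWtr.d00050", "DOpsat.d00050",
          "pH.d00050", "FlChlr.d00050", "WndSpd.h00150")
wqVars <- c('Fl', 'Tur', 'pH', 'DO')

# TRUE where a column name contains at least one key phrase
hits <- Reduce(`|`, lapply(wqVars, function(k) grepl(k, cols, fixed = TRUE)))
selected <- cols[hits]
print(selected)
```

A key phrase that matches nothing (here 'Tur') simply contributes no columns, and a short phrase can match more variables than intended, so it is worth printing the selection before filtering.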
Trim and standardize time intervals of a data frame
Our demo rawDF has 3 yrs data, some with 5 min data, some 15 min.
So let's trim dataset to most recent 2 years, and aggregate to common (15 min) timestep, using aggregation methods specific to each column as defined in the header metadata (ctrls$methodAgg).
rB3demo[["ctrls"]]$methodAgg
For example, here we'll aggregate by mean, but sum for rainfall, and circular averaging for wind direction
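Circular averaging matters for wind direction because angles wrap around at 360 degrees. The base R sketch below shows the general vector-averaging technique with assumed example values; it is not taken from the rB3 source, which may differ in detail.

```r
# Wind directions in degrees that straddle north (assumed example values)
dirs_deg <- c(350, 10, 30)

# Average the unit vectors, then convert the resulting angle back to degrees
rad <- dirs_deg * pi / 180
circ_mean <- (atan2(mean(sin(rad)), mean(cos(rad))) * 180 / pi) %% 360

# The plain arithmetic mean ignores the wrap-around at north
arith_mean <- mean(dirs_deg)
print(c(circular = circ_mean, arithmetic = arith_mean))
```

The circular mean stays close to north, while the arithmetic mean lands at 130 degrees, a direction none of the readings point towards.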
?rB3stdze
# aggregate the data to 15 min timestep - can take a while on big DFs!
rB3agg <- rB3stdze(rB3in = rB3demo,
varNames = 'All',
startDate = '2016-07-01',
endDate = '2018-06-30 23:45:00',
timestep = 15,
aggAll = FALSE)
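As a rough picture of what aggregating to a common timestep with per-column methods involves, here is a base R sketch. It is not the rB3stdze implementation: the data are made up, and labelling each bin by its start time is an assumed convention.

```r
# Six hypothetical 5-minute readings
times <- as.POSIXct("2017-01-01 00:00:00", tz = "UTC") + seq(0, by = 300, length.out = 6)
raw <- data.frame(DateTime = times,
                  TmpWtr   = c(10, 11, 12, 13, 14, 15),
                  Rain     = c(0, 0.2, 0.2, 0, 0, 0.4))

# Label each reading with the start of its 15-minute bin (an assumed convention)
step <- 15 * 60
bin  <- floor(as.numeric(raw$DateTime) / step) * step

# Aggregate each column with its own method: mean for temperature, sum for rain
agg <- data.frame(
  DateTime = as.POSIXct(sort(unique(bin)), origin = "1970-01-01", tz = "UTC"),
  TmpWtr   = as.numeric(tapply(raw$TmpWtr, bin, mean)),
  Rain     = as.numeric(tapply(raw$Rain, bin, sum))
)
print(agg)
```

Averaging rainfall instead of summing it would understate the total by a factor of three here, which is why the aggregation method is stored per column.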
?varWrangle
# add variable(s) after 3rd variable (3rd excluding DateTime)
rB3agg <- varWrangle(rB3agg,
varNames = "TESTVAR",
task = "add",
loc = 4)
rB3getVars(rB3agg)
# remove variable(s) by "keyword" TESTVAR
rB3agg <- varWrangle(rB3agg,
varNames = "TESTVAR",
task = "rm")
rB3getVars(rB3agg)
?rB3gg
# plot the variables called by the keywords, saved to figures dir
rB3gg(rB3in = rB3agg,
varNames = c("TmpWtr.d00050","TmpWtr.d00150"),
srcColour = 'grey34',
facet = TRUE,
showPlot = TRUE)
rB3gg(rB3in = rB3agg,
varNames = 'DOpsat',
srcColour = 'grey34',
facet = FALSE,
showPlot = TRUE,
savePlot = 'figures/RAW_WQ_',
dpi = 400)
Backup the aggregated data frame, in case we want to revert later
rB3agg2 <- rB3agg
shinyrB3(rB3agg2)
This function replaces values in specified regions of data with a numerical value or with NA
?assignVal
Select a region from your shiny plot containing erroneous data, then paste the example function, e.g.:
rB3agg2 <- assignVal(rB3agg2,
varNames = c('TmpWtr.d00050','TmpWtr.d00150'),
startDate = "2017-06-15 23:05:14",
endDate = "2017-07-06 11:00:38",
minVal = 12,
maxVal = 22.9,
newVal = NA,
logID = "Shiny",
Reason = "Manual removal",
showPlot = T)
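The selection logic behind this kind of edit (a date window combined with a value window) can be sketched in base R as follows. The series is hypothetical, and treating both windows as inclusive is an assumption rather than documented assignVal behaviour.

```r
# Hypothetical hourly series
tt <- as.POSIXct("2017-06-15 00:00:00", tz = "UTC") + 3600 * 0:5
x  <- c(11.5, 12.4, 18.0, 22.5, 12.2, 25.0)

# Target values inside both the date window and the value window
in_time  <- tt >= as.POSIXct("2017-06-15 01:00:00", tz = "UTC") &
            tt <= as.POSIXct("2017-06-15 04:00:00", tz = "UTC")
in_range <- x >= 12 & x <= 22.9

# Replace only the targeted values, keeping the original vector intact
x_qc <- x
x_qc[in_time & in_range] <- NA
print(x_qc)
```

Values outside either window are left untouched, so the first reading (below the value window and before the date window) and the last reading (above and after) survive.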
Replace values exceeding a specified rate of change with NA
?filterRoc
rB3agg2 <- filterRoc(rB3agg2,
varNames = c('TmpWtr.d00050','TmpWtr.d00150'),
maxRoc = 0.5,
showPlot = T)
If showPlot is TRUE, you must enter your choice (1 = accept, 2 = decline) to continue
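A rate-of-change filter can be sketched in a few lines of base R. This is a naive illustration with made-up values, assuming the rate is the absolute change per timestep; filterRoc itself may define and handle it differently.

```r
# Hypothetical temperature series with one abrupt jump
x <- c(12.0, 12.1, 12.3, 15.0, 12.4, 12.5)
maxRoc <- 0.5

# Absolute change from the previous reading (the first value has no predecessor)
roc <- c(0, abs(diff(x)))

# Replace readings whose change exceeds the threshold
x_qc <- x
x_qc[roc > maxRoc] <- NA
print(x_qc)
```

Note that this simple version flags both the spike and the reading that follows it, since the return to normal is also a large change.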
Replace data where identical value has been repeated more than n = maxReps
?filterReps
rB3agg2 <- filterReps(rB3agg2,
varNames = c('TmpWtr.d00050','TmpWtr.d00150'),
maxReps = 20,
showPlot = T)
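Repeated values can be detected with run-length encoding. The sketch below uses base R's rle() on a hypothetical series; replacing the whole run (rather than only the repeats after the first value) is an assumption, not confirmed filterReps behaviour.

```r
# Hypothetical series where a sensor "sticks" on 7.2
x <- c(7.0, 7.2, 7.2, 7.2, 7.2, 7.3, 7.3)
maxReps <- 3

# Run-length encode, then expand each run's length back onto its members
runs    <- rle(x)
run_len <- rep(runs$lengths, runs$lengths)

# Replace every member of a run longer than maxReps
x_qc <- x
x_qc[run_len > maxReps] <- NA
print(x_qc)
```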
Filter data below minVal or above maxVal (either specified, or from 'ctrls'/headers)
?filterMinMax
rB3agg2 <- filterMinMax(rB3agg2,
varNames = c('TmpWtr.d00050','TmpWtr.d00150'),
filterMin = 9,
filterMax = 25,
showPlot = T)
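In base R terms, a min/max filter amounts to a logical mask. The values below are hypothetical, and keeping readings exactly equal to the limits is an assumption.

```r
# Hypothetical readings, two of them outside the plausible range
x <- c(8.5, 9.0, 14.2, 25.0, 26.1)
filterMin <- 9
filterMax <- 25

# Replace anything below the minimum or above the maximum
x_qc <- x
x_qc[x < filterMin | x > filterMax] <- NA
print(x_qc)
```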
?applyInterp
rB3agg2 <- applyInterp(rB3agg2,
varNames = c('TmpWtr.d00050','TmpWtr.d00150'),
showPlot = T)
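Gap filling of this kind is commonly done by linear interpolation. The sketch below uses base R's approx() on a hypothetical series, assuming a constant timestep and linear interpolation; the method actually used by applyInterp is not stated here.

```r
# Hypothetical series with gaps left by earlier filtering
x   <- c(12.0, NA, NA, 13.5, 13.6, NA, 14.0)
idx <- seq_along(x)
ok  <- !is.na(x)

# Interpolate linearly between the nearest valid neighbours
x_filled <- approx(x = idx[ok], y = x[ok], xout = idx)$y
print(x_filled)
```

Long gaps are filled just as readily as short ones with this approach, so it is worth checking the plot before accepting the result.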
?logsPlot
visualise changes to data
logsPlot(rB3in = rB3agg2,
varNames = c('TmpWtr.d00050','TmpWtr.d00150'),
srcColour = 'grey')
?rB3gg
View the final before and after, without logs
rB3gg(rB3in = rB3agg2,
varNames = c('TmpWtr.d00050','TmpWtr.d00150'),
srcColour = 'orange',
qcColour = 'blue') #, savePlot = 'figures/RAW_WQ_', dpi = 400)
Export data from the rB3 object into csv files
?rB3export
rB3export(rB3agg2,
varNames = 'All',
qc = T,
src = T,
metadata = T)
rB3agg3 <- rB3agg2
Apply a mathematical transformation, e.g. new = a + b(old) + c(old)^2 + d(old)^3 + ...etc
?applyNth
rB3agg3 <- applyNth(rB3in = rB3agg3,
startDate = '2016-07-01 00:00:00',
endDate = '2017-06-28 23:45:00',
varNames = 'DOpsat.d00050',
coeffs = c(12,1,0.02),
showPlot = T)
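As a worked check of the arithmetic, the sketch below evaluates the same polynomial in base R. It assumes the coefficients are ordered from the constant term upwards, matching the formula above; the input values are made up.

```r
# Coefficients a, b, c for new = a + b*old + c*old^2 (assumed ordering)
coeffs <- c(12, 1, 0.02)
old    <- c(0, 50, 100)

# Build a matrix of old^0, old^1, old^2 and multiply by the coefficients
powers <- seq_along(coeffs) - 1
new    <- as.numeric(outer(old, powers, `^`) %*% coeffs)
print(new)
```

For example, an old value of 50 becomes 12 + 50 + 0.02 * 2500 = 112.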
Correct linear sensor drift (assumes consistent timestep)
?driftCorr
rB3agg3 <- driftCorr(rB3agg3,
'2016-07-01 00:00:00',
'2017-06-28 23:45:00',
'DOpsat.d00050',
lowRef = 0,
lowStart = 0,
lowEnd = 0,
highRef = 100,
highStart = 85,
highEnd = 130,
showPlot = T)
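One common way to correct linear drift is a two-point rescale whose reference readings change linearly over the period. The base R sketch below illustrates that idea with hypothetical readings; reading lowStart/lowEnd and highStart/highEnd as the sensor's values at the two reference points is an assumption about driftCorr, not a description of its code.

```r
# Hypothetical DO (%sat) readings taken at equal time steps
x <- c(85, 96, 107.5, 119, 130)
n <- length(x)

lowRef  <- 0;   lowStart  <- 0;  lowEnd  <- 0
highRef <- 100; highStart <- 85; highEnd <- 130

# What the sensor would read at each reference point, drifting linearly in time
frac   <- (seq_len(n) - 1) / (n - 1)
low_t  <- lowStart  + frac * (lowEnd  - lowStart)
high_t <- highStart + frac * (highEnd - highStart)

# Two-point rescale of each reading onto the reference scale
x_corr <- lowRef + (x - low_t) * (highRef - lowRef) / (high_t - low_t)
print(x_corr)
```

A reading equal to the drifting high reference at any time maps back to 100, which is why the first, middle and last values in this example all correct to the same number.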
?tmprAlign
Post-calibrate temperature sensors based on periods of mixing, as found by temp differences and wind speed (optional)
Pre tmprAlign()
rB3gg(rB3in = rB3agg3,
varNames = 'TmpWtr',
startDate = '2017-07-01',
endDate = '2017-08-01',
facet = FALSE,
showPlot = T)
rB3agg3 <- tmprAlign(rB3agg3,
varNames = 'TmpWtr',
dTPerctile = 0.2,
logID = "tpmAlign",
Reason = "Interp",
showPlot = T,
plotType = 'All')
Post tmprAlign()
rB3gg(rB3in = rB3agg3,
varNames = 'TmpWtr',
startDate = '2017-07-01',
endDate = '2017-08-01',
facet = FALSE,
showPlot = T)
Apply a custom function using rB3
?FUNCrB3
Simple example:
multiply a variable by 2
# define a simple function ( result = input variable * 2)
test <- function(eqnVars) {eqnVars[1] * 2}
# apply this custom function
rB3agg3 <- FUNCrB3(rB3agg3,
varNames = 'DOpsat.d00050',
eqnVars = 'DOpsat.d00050',
FUN = test,
showPlot = T)
Complex example:
Calculate DO (mg/L) from DO (%sat) and water temperature, using the USGS method:
Meyers, D.N. (2011) https://water.usgs.gov/admin/memo/QW/qw11.03.pdf
eqn: DOmg = (exp(-139.34411 + ((157570.1*(1/(tmpwtr + 273.15))) + (-66423080*((1/(tmpwtr + 273.15))^2)) + (12438000000*((1/(tmpwtr + 273.15))^3)) + (-862194900000*((1/(tmpwtr + 273.15))^4))))) * DOsat * 0.01
..where tmpwtr = water temperature and DOsat = dissolved oxygen saturation
rB3gg(rB3agg3,
varNames = c('DOpsat.d00050','DOconc.d00050'),
showPlot = T,
srcColour = 'orange',
qcColour = 'blue')
# define the list of input variables required for the calculation
eqnVars = c('TmpWtr.d00050','DOpsat.d00050')
# define the function, using eqnVars[1] = tmpwtr and eqnVars[2] = DOsat
DOsat2mg <- function(eqnVars) {
(exp(-139.34411 + ((157570.1*(1/( eqnVars[1] +273.15))) +
(-66423080*((1/( eqnVars[1] +273.15))^2)) +
(12438000000*((1/( eqnVars[1] +273.15))^3)) +
(-862194900000*((1/( eqnVars[1] +273.15))^4))))) * eqnVars[2] * 0.01
}
rB3agg3 <- FUNCrB3(rB3agg3,
varNames = 'DOconc.d00050',
eqnVars = eqnVars,
FUN = DOsat2mg,
showPlot = T)
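The equation can be sanity-checked outside rB3 with plain numeric inputs. The helper below is a hypothetical standalone version of the same formula (the name do_sat_to_mg and its arguments are not part of rB3), evaluated at an assumed 20 degrees C.

```r
# Standalone version of the equation, taking plain numeric vectors
do_sat_to_mg <- function(tmpwtr, DOsat) {
  tk <- tmpwtr + 273.15   # water temperature in kelvin
  ln_do <- -139.34411 +
    157570.1 / tk -
    66423080 / tk^2 +
    12438000000 / tk^3 -
    862194900000 / tk^4
  exp(ln_do) * DOsat * 0.01
}

# Fully saturated and half saturated water at 20 degrees C
do_full <- do_sat_to_mg(tmpwtr = 20, DOsat = 100)
do_half <- do_sat_to_mg(tmpwtr = 20, DOsat = 50)
print(c(do_full, do_half))
```

Fully saturated fresh water at 20 degrees C should come out at roughly 9.1 mg/L, so a result far from that suggests a typo in one of the coefficients.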
Write cleaned-up temperature and wind data to .wtr and .wnd files for direct input to rLakeAnalyzer
?writeLAinputs
writeLAinputs(rB3in = rB3agg3,
wtrNames = 'TmpWtr',
wndName = 'WndSpd',
wndHeight = 1.5)