ContDataQC: Master Script

View source: R/zfun.ContDataQC.R

ContDataQC R Documentation

Master Script

Description

Calls all other functions for the ContDataQC library.

Usage

ContDataQC(
  fun.myData.Operation,
  fun.myData.SiteID,
  fun.myData.Type,
  fun.myData.DateRange.Start,
  fun.myData.DateRange.End,
  fun.myDir.import = getwd(),
  fun.myDir.export = getwd(),
  fun.myConfig = "",
  fun.myFile = "",
  fun.myReport.format = "",
  fun.myReport.Dir = "",
  fun.CreateReport = TRUE,
  fun.AddDeployCol = TRUE
)

Arguments

fun.myData.Operation

Operation to be performed; c("GetGageData","QCRaw", "Aggregate", "SummaryStats")

fun.myData.SiteID

Station/SiteID.

fun.myData.Type

Data type; c("Air","Water","AW","Gage","AWG","AG","WG")

fun.myData.DateRange.Start

Start date for requested data. Format = YYYY-MM-DD.

fun.myData.DateRange.End

End date for requested data. Format = YYYY-MM-DD.

fun.myDir.import

Directory for import data. Default is current working directory.

fun.myDir.export

Directory for export data. Default is current working directory.

fun.myConfig

Configuration file to use for this data analysis. The default is always loaded first so only "new" values need to be included. This is the easiest way to control time zones.

fun.myFile

Single file (or vector of files) on which to perform the operation. SiteID, Type, and Date Range are not used when file name(s) are provided.

fun.myReport.format

Report format (docx or html). The default (docx) is specified in config.R and can be customized via ContData.env$myReport.Format.

fun.myReport.Dir

Report (rmd) template folder. Default is the package rmd folder. Can be customized in config.R; ContData.env$myReport.Dir.

fun.CreateReport

Boolean; whether or not to create reports. Default = TRUE.

fun.AddDeployCol

Boolean for adding the logger deployment column. Default = TRUE. Can be customized in config.R; ContData.env$myName.LoggerDeployment.

Details

Below are the default data directories assumed to exist in the working directory. These can be created with code in the example.

./Data0_Original/ = Unmodified data logger files.

./Data1_RAW/ = Data logger files modified for use with the library (extra rows removed; file and column names standardized).

./Data2_QC/ = Repository for library output for QCed files.

./Data3_Aggregated/ = Repository for library output for aggregated (or split) files.

./Data4_Stats/ = Repository for library output for statistical summary files.

It is possible to call the Aggregate portion of the script to meld together files from multiple sites (e.g., all sites in a watershed, or different depths on a lake). The output file is named after the first input file, with "Append_x" added, where "x" is the number of files that were aggregated. The purpose is to allow users to analyze the data from these files as a single file.
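As a hedged sketch of that naming convention (the exact output name is determined by the package; the input file names below are taken from the Examples section), aggregating three QC files might produce a name like this:

```r
# Illustration only: how an aggregated file name might be derived.
myFiles <- c("QC_test2_AW_20130426_20130725.csv"
             , "QC_test2_AW_20130725_20131015.csv"
             , "QC_test2_AW_20140901_20140930.csv")
# The output is named after the first file, with "Append_x"
# where x is the number of files aggregated:
paste0(sub("\\.csv$", "", myFiles[1])
       , "_Append_", length(myFiles), ".csv")
# e.g., "QC_test2_AW_20130426_20130725_Append_3.csv"
```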

Pandoc is needed for docx reports (default). Pandoc comes packaged with RStudio. To install Pandoc on Windows use the 'installr' package.

https://CRAN.R-project.org/package=installr

install.packages("installr")
installr::install.pandoc()

The above won't work if you don't have admin rights on your computer. As an alternative, download the msi file for the latest release from https://github.com/jgm/pandoc/releases (you may need your IT department to install it for you). For help installing via the command window see: http://www.intowindows.com/how-to-run-msi-file-as-administrator-from-command-prompt-in-windows/
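To check whether Pandoc is already available before requesting docx reports, the rmarkdown package (bundled with RStudio) provides helper functions; a minimal sketch:

```r
# Check for Pandoc before requesting a docx report.
if (requireNamespace("rmarkdown", quietly = TRUE)) {
  if (rmarkdown::pandoc_available()) {
    message("Pandoc found, version ", rmarkdown::pandoc_version())
  } else {
    message("Pandoc not found; install it or use html reports")
  }
}
```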

Value

Writes a CSV to the specified export directory with additional columns for calculated statistics.

Examples

# Examples of each operation

# 00. Set up
# Parameters
Selection.Operation <- c("GetGageData"
                         , "QCRaw"
                         , "Aggregate"
                         , "SummaryStats")
Selection.Type      <- c("Air","Water","AW","Gage","AWG","AG","WG")
Selection.SUB <- c("Data0_Original"
                   , "Data1_RAW"
                   , "Data2_QC"
                   , "Data3_Aggregated"
                   , "Data4_Stats")
(myDir.BASE <- tempdir()) # create and print temp directory for example data

# Create data directories
for (dir.sub in Selection.SUB) {
  myDir.create <- file.path(myDir.BASE, dir.sub)
  if (dir.exists(myDir.create) == FALSE) {
    dir.create(myDir.create)
  } else {
    message("Directory already exists")
  }
}

# Save example data (assumes myDir.BASE directory exists)
myData <- data_raw_test2_AW_20130426_20130725
  write.csv(myData, file.path(myDir.BASE
                              , Selection.SUB[2]
                              , "test2_AW_20130426_20130725.csv"))
myData <- data_raw_test2_AW_20130725_20131015
  write.csv(myData, file.path(myDir.BASE
                              , Selection.SUB[2]
                              , "test2_AW_20130725_20131015.csv"))
myData <- data_raw_test2_AW_20140901_20140930
  write.csv(myData, file.path(myDir.BASE
                              , Selection.SUB[2]
                              , "test2_AW_20140901_20140930.csv"))
myData <- data_raw_test4_AW_20160418_20160726
  write.csv(myData, file.path(myDir.BASE
                              , Selection.SUB[2]
                              , "test4_AW_20160418_20160726.csv"))
myFile <- "config.TZ.Central.R"
  file.copy(file.path(path.package("ContDataQC"), "extdata", myFile)
            , file.path(myDir.BASE, Selection.SUB[2], myFile))

# 01.A. Get Gage Data
myData.Operation       <- "GetGageData" #Selection.Operation[1]
myData.SiteID          <- "01187300" # Hubbard River near West Hartland, CT
myData.Type            <- Selection.Type[4] #"Gage"
myData.DateRange.Start <- "2013-01-01"
myData.DateRange.End   <- "2014-12-31"
myDir.import           <- ""
myDir.export           <- file.path(myDir.BASE, Selection.SUB[2])
ContDataQC(myData.Operation
           , myData.SiteID
           , myData.Type
           , myData.DateRange.Start
           , myData.DateRange.End
           , myDir.import
           , myDir.export)

# 01.B. Get Gage Data (central time zone)
myData.Operation       <- "GetGageData" #Selection.Operation[1]
myData.SiteID          <- "07032000" # Mississippi River at Memphis, TN
myData.Type            <- Selection.Type[4] #"Gage"
myData.DateRange.Start <- "2013-01-01"
myData.DateRange.End   <- "2014-12-31"
myDir.import           <- ""
myDir.export           <- file.path(myDir.BASE, Selection.SUB[2])
# include path if not in working directory
myConfig               <- file.path(myDir.BASE, Selection.SUB[2]
                                    , "config.TZ.Central.R")
ContDataQC(myData.Operation
           , myData.SiteID
           , myData.Type
           , myData.DateRange.Start
           , myData.DateRange.End
           , myDir.import
           , myDir.export
           , myConfig)

# 02.A. QC Raw Data
myData.Operation       <- "QCRaw" #Selection.Operation[2]
myData.SiteID          <- "test2"
myData.Type            <- Selection.Type[3] #"AW"
myData.DateRange.Start <- "2013-01-01"
myData.DateRange.End   <- "2014-12-31"
myDir.import           <- file.path(myDir.BASE, Selection.SUB[2]) #"Data1_RAW"
myDir.export           <- file.path(myDir.BASE, Selection.SUB[3]) #"Data2_QC"
myReport.format        <- "docx"
ContDataQC(myData.Operation
           , myData.SiteID
           , myData.Type
           , myData.DateRange.Start
           , myData.DateRange.End
           , myDir.import
           , myDir.export
           , fun.myReport.format = myReport.format)

# 02.B. QC Raw Data (offset collection times for air and water sensors)
myData.Operation       <- "QCRaw" #Selection.Operation[2]
myData.SiteID          <- "test4"
myData.Type            <- Selection.Type[3] #"AW"
myData.DateRange.Start <- "2016-04-18"
myData.DateRange.End   <- "2016-07-26"
myDir.import           <- file.path(myDir.BASE, Selection.SUB[2]) #"Data1_RAW"
myDir.export           <- file.path(myDir.BASE, Selection.SUB[3]) #"Data2_QC"
myReport.format        <- "html"
ContDataQC(myData.Operation
           , myData.SiteID
           , myData.Type
           , myData.DateRange.Start
           , myData.DateRange.End
           , myDir.import
           , myDir.export
           , fun.myReport.format = myReport.format)

# 03. Aggregate Data
myData.Operation       <- "Aggregate" #Selection.Operation[3]
myData.SiteID          <- "test2"
myData.Type            <- Selection.Type[3] #"AW"
myData.DateRange.Start <- "2013-01-01"
myData.DateRange.End   <- "2014-12-31"
myDir.import           <- file.path(myDir.BASE, Selection.SUB[3]) #"Data2_QC"
myDir.export           <- file.path(myDir.BASE, Selection.SUB[4]) #"Data3_Aggregated"
#Leave off myReport.format and get default (docx).
ContDataQC(myData.Operation
           , myData.SiteID
           , myData.Type
           , myData.DateRange.Start
           , myData.DateRange.End
           , myDir.import
           , myDir.export)

# 04. Summary Stats
myData.Operation       <- "SummaryStats" #Selection.Operation[4]
myData.SiteID          <- "test2"
myData.Type            <- Selection.Type[3] #"AW"
myData.DateRange.Start <- "2013-01-01"
myData.DateRange.End   <- "2014-12-31"
myDir.import           <- file.path(myDir.BASE, Selection.SUB[4]) #"Data3_Aggregated"
myDir.export           <- file.path(myDir.BASE, Selection.SUB[5]) #"Data4_Stats"
#Leave off myReport.format and get default (docx).
ContDataQC(myData.Operation
           , myData.SiteID
           , myData.Type
           , myData.DateRange.Start
           , myData.DateRange.End
           , myDir.import
           , myDir.export)

#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# File Versions
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# 02.Alt. QC Data
myData.Operation <- "QCRaw" #Selection.Operation[2]
#myFile <- "test2_AW_20130426_20130725.csv"
myFile <- c("test2_AW_20130426_20130725.csv"
           , "test2_AW_20130725_20131015.csv"
           , "test2_AW_20140901_20140930.csv")
myDir.import <- file.path(myDir.BASE, "Data1_RAW")
myDir.export <- file.path(myDir.BASE, "Data2_QC")
myReport.format <- "docx"
ContDataQC(myData.Operation
           , fun.myDir.import = myDir.import
           , fun.myDir.export = myDir.export
           , fun.myFile = myFile
           , fun.myReport.format = myReport.format)

# 03.Alt. Aggregate Data
myData.Operation <- "Aggregate" #Selection.Operation[3]
myFile <- c("QC_test2_AW_20130426_20130725.csv"
           , "QC_test2_AW_20130725_20131015.csv"
           , "QC_test2_AW_20140901_20140930.csv")
myDir.import <- file.path(myDir.BASE, "Data2_QC")
myDir.export <- file.path(myDir.BASE, "Data3_Aggregated")
myReport.format <- "html"
ContDataQC(myData.Operation
           , fun.myDir.import = myDir.import
           , fun.myDir.export = myDir.export
           , fun.myFile = myFile
           , fun.myReport.format = myReport.format)

# 04. Alt. Summary Stats
myData.Operation <- "SummaryStats" #Selection.Operation[4]
myFile <- "QC_test2_AW_20130426_20130725.csv"
#myFile <- c("QC_test2_AW_20130426_20130725.csv"
#            , "QC_test2_AW_20130725_20131015.csv"
#            , "QC_test2_AW_20140901_20140930.csv")
myDir.import <- file.path(myDir.BASE, "Data2_QC")
myDir.export <- file.path(myDir.BASE, "Data4_Stats")
#Leave off myReport.format and get default (docx).
ContDataQC(myData.Operation
           , fun.myDir.import = myDir.import
           , fun.myDir.export = myDir.export
           , fun.myFile = myFile)

#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Summary Stats from Other Data
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# 05. Gage Data
# Get Gage Data via the dataRetrieval package from USGS 01187300 2013
#  (~4 seconds)
data.gage <- dataRetrieval::readNWISuv("01187300"
                                       , "00060"
                                       , "2013-01-01"
                                       , "2014-12-31")
head(data.gage)
# Rename fields
myNames <- c("Agency"
             , "SiteID"
             , "Date.Time"
             , "Discharge.ft3.s"
             , "Code"
             , "TZ")
names(data.gage) <- myNames
# Add Date and Time
data.gage[,"Date"] <- as.Date(data.gage[,"Date.Time"])
data.gage[,"Time"] <-  strftime(data.gage[,"Date.Time"], format="%H:%M:%S")
# Add "flag" fields that are added by QC function.
Names.Flags <- paste0("Flag.",c("Date.Time", "Discharge.ft3.s"))
data.gage[,Names.Flags] <- "P"
# Save File
myFile <- "01187300_Gage_20130101_20141231.csv"
write.csv(data.gage, file.path(myDir.BASE, myFile), row.names=FALSE)
# Run Stats (File)
myData.Operation <- "SummaryStats"
myDir.import <- myDir.BASE
myDir.export <- myDir.BASE
ContDataQC(myData.Operation
           , fun.myDir.import = myDir.import
           , fun.myDir.export = myDir.export
           , fun.myFile = myFile)

#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Lake Data, Aggregate
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
(myDir.BASE <- tempdir()) # create and print temp directory for example data
# 06. Lake Data
# Save example data (assumes directory exists)
myFile <- c("QC_Ellis--1.0m_Water_20180524_20180918.csv"
           , "QC_Ellis--3.0m_Water_20180524_20180918.csv")
file.copy(file.path(system.file("extdata", package="ContDataQC"), myFile)
          , file.path(myDir.BASE, "Data2_QC", myFile))

# Aggregate Data
myData.Operation <- "Aggregate" #Selection.Operation[3]
myDir.import     <- file.path(myDir.BASE, "Data2_QC")
myDir.export     <- file.path(myDir.BASE, "Data3_Aggregated")
myReport.format  <- "html"
ContDataQC(myData.Operation
           , fun.myDir.import = myDir.import
           , fun.myDir.export = myDir.export
           , fun.myFile = myFile
           , fun.myReport.format = myReport.format)


leppott/ContDataQC documentation built on Jan. 5, 2025, 10:12 a.m.