dvget: Retrieve USGS Discharge Daily Values structured for...


Description

Retrieve U.S. Geological Survey (USGS) daily mean-streamflow values for a streamgage identification number. This function is a wrapper on dataRetrieval::readNWISdv() and thus provides an abstraction layer to the dataRetrieval package, but the wrapper also adds features suited to data-mining-scale study of daily values. The dvget function adds some information for the purposes of the akqdecay package. It creates the canonical daily-mean streamflow table (an R data.frame) that is picked up in turn by the akqdecay function of this package. The function fill_dvenv is a wrapper that can be used to fill an R environment with the output of the akqdecay function.

Usage

dvget(siteNumber, sdate="",   edate="", flowlo=NULL, flowhi=NULL,
                  date2s=NA,  date2e=NA, ignore.working=TRUE,
                  ignore.provisional=TRUE, silent=FALSE,
                  drsilent=TRUE, drget=FALSE, pCode="00060", sCode="00003",
                  message="", ...)

Arguments

siteNumber

USGS streamgage identification number; the nomenclature matches that of the dataRetrieval package. Multiple site numbers can be provided because the underlying dataRetrieval::readNWISdv() supports that, and the resulting behavior should be about the same between the two functions, although a warning message is shown. The warning is important because akqdecay is not vectorized to handle multiple streamgages and will trigger a stop() in execution. Multiple streamgages should be processed through fill_dvenv and fill_akqenv;

sdate

Start date (default is earliest) and nomenclature matches that of the dataRetrieval package with string format of “YYYY-MM-DD” for year (YYYY), month (MM), and day (DD) respectively padded by zeros as needed;

edate

Ending date (default is latest) and nomenclature matches that of the dataRetrieval package with string format of “YYYY-MM-DD” for year (YYYY), month (MM), and day (DD) respectively padded by zeros as needed;

flowlo

Optional lower streamflow threshold below which values are converted to NA (rather than deleted, to keep continuous day time stamps) through Q < flowlo → NA;

flowhi

Optional upper streamflow threshold above which values are converted to NA (rather than deleted, to keep continuous day time stamps) through Q > flowhi → NA;

date2s

An optional start date of record (greater than or equal to) in the same format as sdate. This option is only used after the readNWISdv() function has retrieved the data; it is provided to accommodate situations in which readNWISdv() has difficulties in pulling the correct data. This option is provided on an experimental basis because the author has become aware that sites with multiple daily-streamflow data descriptors do not behave similarly when sdate is used for the start of record. The issues that led to the inclusion of this option are otherwise difficult to explain; please contact the authors as needed.

date2e

An optional ending date of record (less than or equal to) in the same format as edate. This option is only used after the readNWISdv() function has retrieved the data; it is provided to accommodate situations in which readNWISdv() has difficulties in pulling the correct data. This option is provided on an experimental basis because the author has become aware that sites with multiple daily-streamflow data descriptors do not behave similarly when edate is used for the end of record. The issues that led to the inclusion of this option are otherwise difficult to explain; please contact the authors as needed.

ignore.working

The USGS identifies at least “Approved” (A), “Provisional” (P), and “Working” (W) record types. The default triggers the deletion of data rows flagged as working record;

ignore.provisional

The USGS identifies at least “Approved” (A), “Provisional” (P), and “Working” (W) record types. The default triggers the deletion of data rows flagged as provisional record;

silent

Suppress the informative calls to message();

drsilent

The value of this argument is passed as the “silent” argument of try(), which wraps dataRetrieval::readNWISdv() to provide a level of informative error trapping;

drget

If set, the retrieval from dataRetrieval::readNWISdv() is immediately returned following the internal call to dataRetrieval::renameNWISColumns();

pCode

Parameter code (default is discharge [streamflow]) and nomenclature almost matches that of the dataRetrieval package;

sCode

Statistic code (default is daily mean) and nomenclature almost matches that of the dataRetrieval package;

message

An optional string that if populated will trigger a message() call that a user might find useful in massive batch processing operations; and

...

Additional arguments to pass to function dataRetrieval::readNWISdv().
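
The post-retrieval arguments (date2s, date2e, ignore.working, and ignore.provisional) can be pictured with a small base-R sketch. It is not the package's internal code: the toy table, the inclusive date comparisons, and the exact matching of Flow_cd against single-letter codes are assumptions made for illustration only.

```r
# Toy daily-values table standing in for a retrieval; values are made up.
dv <- data.frame(
  Date    = as.Date("2019-06-01") + 0:5,
  Flow    = c(120, 115, 110, 108, 400, 390),
  Flow_cd = c("A", "A", "A", "P", "P", "W"),
  stringsAsFactors = FALSE
)

# Hypothetical settings mirroring the dvget() argument names.
date2s <- "2019-06-02"; date2e <- "2019-06-06"
ignore.working <- TRUE; ignore.provisional <- TRUE

# Inclusive date window applied after the data are in hand.
if(! is.na(date2s)) dv <- dv[dv$Date >= as.Date(date2s), ]
if(! is.na(date2e)) dv <- dv[dv$Date <= as.Date(date2e), ]

# Drop rows by record type (assumes plain "W" and "P" codes).
if(ignore.working)     dv <- dv[dv$Flow_cd != "W", ]
if(ignore.provisional) dv <- dv[dv$Flow_cd != "P", ]

print(dv)
```

Real Flow_cd values can carry additional qualifiers, so a production filter would likely need pattern matching rather than exact comparison.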

Value

An R data.frame is returned with these expected columns. Some streamgages can have multiple discharge descriptors typed as daily values. Only the Flow column is handled by akqdecay.

agency_cd

The agency code for the data;

site_no

The streamgage identification number;

Date

The date. Note that the capitalization is from the dataRetrieval package; elsewhere in akqdecay this will become lower case;

Flow

The streamflow in cubic feet per second. Note that the capitalization is from the dataRetrieval package; elsewhere in akqdecay this will become lower case or an alternative name for streamflow;

Flow_cd

A coding system for the streamflow (A, approved record; P, provisional record; and W, working record);

site

A character representation of the site_no, which likely is already a character. The reasoning for having another site column is in case a user needs some type of flexibility in later processing. The user could freely replace the contents of this column;

year

The calendar year of the date;

decade

The decade of the daily value. The decade is assigned by taking the year and replacing its trailing digit with zero. This is not a technique in which a “decade” is centered on an even step of 10; that is, 1996–2005 is not the “2000 decade”, but 01/01/2000–12/31/2009 is the “2000 decade”;

wyear

The water year; and

month

The month.
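
The date-derived columns can be reproduced with base R as a rough check. This is a sketch rather than the package's own code, and it assumes the customary USGS water-year convention (October 1 through September 30, labeled by the calendar year in which it ends), which this page does not spell out.

```r
# A few dates chosen to straddle water-year and decade boundaries.
dates <- as.Date(c("1999-09-30", "1999-10-01", "2000-01-01", "2009-12-31"))

year  <- as.integer(format(dates, "%Y"))
month <- as.integer(format(dates, "%m"))

# Strip the trailing digit of the year and replace it with zero.
decade <- (year %/% 10L) * 10L

# Assumed convention: October through December belong to the next water year.
wyear <- ifelse(month >= 10L, year + 1L, year)

derived <- data.frame(Date=dates, year=year, decade=decade,
                      wyear=wyear, month=month)
print(derived)
```

With these inputs, 2009-12-31 stays in the “2000 decade” while falling in water year 2010, which shows that the two groupings are independent of one another.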

Note

For the greater purposes of the akqdecay package, the arguments pCode and sCode are expected to be left at their defaults. The capitalization inconsistency in the returned R data.frame is left intact because it is consistent with the operation of the dataRetrieval::renameNWISColumns() function that is called internally. Lastly, massive-scale testing has found at least one streamgage (07040000 in 2015 [as of Nov. 2017 testing]) with -999999 for a daily flow. Such values are converted to NA.
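
The sentinel handling described in this note amounts to something like the following base-R sketch. The vector is made up, and the way dvget detects and reports such values internally is not shown here; only the conversion to NA is taken from the note.

```r
# Made-up flows containing the -999999 sentinel.
Flow <- c(250, -999999, 245, -999999, 240)

# Count the sentinels before replacing them (useful for a diagnostic message).
is_bad <- ! is.na(Flow) & Flow == -999999
n_bad  <- sum(is_bad)

# Convert to NA rather than deleting rows, so the daily time stamps stay continuous.
Flow[is_bad] <- NA

print(n_bad)
print(Flow)
```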

Author(s)

W.H. Asquith

References

Hirsch, R.M., and De Cicco, L.A., 2015, User guide to Exploration and Graphics for RivEr Trends (EGRET) and dataRetrieval: R packages for hydrologic data (version 2.0, February 2015): U.S. Geological Survey Techniques and Methods book 4, chap. A10, 93 p., http://doi.org/10.3133/tm4A10.

See Also

akqdecay, fill_dvenv, gsid2str

Examples

# USGS 14362000 Applegate River near Copper, Oregon
Copper <- dvget("14362000", sdate="1940-01-01", edate="1940-01-31")
print(Copper) # An inspection of the retrieved daily values.

## Not run: 
dv <- dvget("07040000") # 2019-05-05 testing. This is how we can have a look
# at the streamgage. The Internet indicates gage discontinued in 2011 but
# -999999 start showing up in May 2016---Is this related to the AQ database change?
attributes(dv)$akqdecay
# [1] "at least one -999999 discharge: first=2016-05-05 and last=2019-05-04"
# Test on June 29, 2020 shows this message or issue of -999999 appears gone. 
## End(Not run)

## Not run: 
# This is a big time sink so treated as a "dontrun."
# Get all of the sites in Alabama that have discharge (00060) and then
# just work on those that seem to have record after June 1st, 2019.
AL <- dataRetrieval::whatNWISsites(stateCd="AL", parameterCd="00060")
AL <- AL[AL$site_tp_cd == "ST",]; # isolate just the streamgages, then remove
AL <- AL[as.numeric(AL$site_no) <= 100000000000000,] # lat/long based site numbers
sites <- AL$site_no
DV <- new.env(); n <- length(sites); i <- 0
for(site in sites) {
  i <- i + 1
  message("working on site ",site, "  ",i,"(",n,")")
  dv <- NULL
  try(dv <- dvget(site, sdate="2019-06-01", ignore.provisional=FALSE))
  if(is.null(dv)) next               # retrieval failed inside try()
  if(length(dv$site_no) == 0) next   # nothing returned for the period
  assign(site, dv, envir=DV)
}
## End(Not run)

wasquith-usgs/akqdecay documentation built on Nov. 9, 2020, 1:13 p.m.