Introduction

The wateRuse package includes basic water-use data loading, plotting, and mapping functions that can be used to illustrate data, or perform QA/QC analysis on water-use data during compilation efforts. The package is designed to standarize and streamline USGS Water Science Center (WSC) and regional water use (WU) specialists' workflows.

WSC WU specialists can use current working AWUDS area files for their state (export file in .xlsx format) and national data files (.csv format), which includes historical approved datasets. wateRuse can be used to merge the working export and national approved data files. The merged file is the basis for subsequent plotting and analysis.

Regional water use specialists can import a national dump file that will include working and approved data.

Users can display water-use data for various data elements based on time (year) or areas (county, HUC8, aquifer, state). Time series plots us water-use values for a data element and multiple years. A data element would represent the amount of water that is used from a specific source (groundwater, surface water) for a given area (ie county) and purpose (public supply). Users can compare different data elements by area and year and produce scatter plots, bar charts, line graphs, and choropleth maps.

Importing data

A WSC user will navigate to a directory with a national best available dataset that contains all approved data that is viewable by all water-use users. In addition to this, a WSC user will execute an AWUDS export function to produce an excel file (using default AWUDS naming convention). These data files will be merged into one.

A Regional user will navigate to a directory with all national data, including best available and current working data for all States.

The ability to see data, whether national approved or state working data, will depend on the user type. WSC users will always see their own working unapproved data. Regional users will see a national dataset with approved and working data for review purposes.

Data analysis tools

This package provides basic data visualization tools that provide graphical displays of water-use data for different types of water uses by area and years.

Compare two years

The compare_two_years function enables the user to examine data for selected areas (counties, HUC8's, aquifers) for two years. This plot is produced as a scatterplot with a 1:1 line of equivalence to aid comparison. A comparison between two years for a single data element is graphed for an entire user-selected collection of areas (e.g., all counties in a state or all states in the US) or for a subset of areas (e.g., top twenty counties in a state or selected states in a region).

Users should be aware of what data elements exist for different years. For example, if a data element exists in one year but not both of the selected years, the plot will not be produced. FAQ: My compare two years plot doesn't have any data plotted, what's wrong? Answer: There may be data missing for the years selected.

[In shiny, to display the area identifier, hover the mouse-cursor over the plotted point.]

library(wateRuse)
w.use <- wUseSample
data.elements <- c("PS.TOPop", "PS.SWPop")
areas <- "15" # NA uses all areas
area.column <- "STATECODE"
year.x.y <- c(2005,2010)
compare_two_years(w.use, "PS.TOPop", year.x.y, area.column, areas)

Users can specify multiple data elements which results in a stacked group of plots, with one plot per data element. [In shiny, a single data element is selected to plot comparison of values for two different years.]

w.use <- wUseSample
data.elements <- c("PS.TOPop", "PS.SWPop")
areas <- "10" # NA uses all areas
area.column <- "STATECODE"
year.x.y <- c(2005,2010)
compare_two_years(w.use, data.elements, year.x.y, area.column, areas)

Multiple areas plot on the same plot.

w.use <- wUseSample
data.elements <- "PS.TOPop"
areas <- c("10","15") # NA uses all areas
area.column <- "STATECODE"
year.x.y <- c(2005,2010)
compare_two_years(w.use, data.elements, year.x.y, area.column, areas)

Compare two elements

This function produces a scatterplot for comparison of two data elements for selected areas in the same year. For example, the number of irrigated acres and total irrigated withdrawals for selected areas in a state for a given year can be plotted. Users can specify multiple years which results in a stacked group of plots, with one plot per year. [In shiny, users select a single year to make one scatterplot, and data-element values for all areas (counties) are shown.]

w.use <- wUseSample
data.elements.x.y <- c("TP.TotPop", "PS.WSWFr")
areas <- "15" # NA uses all areas
area.column <- "STATECODE"
years <- c(2000, 2005, 2010)
compare_two_elements(w.use, data.elements.x.y, years, area.column, areas)

Multiple areas are plotted together on the same plot.

w.use <- wUseSample
compare_two_elements(w.use, c("TP.TotPop","PS.WSWFr"), "2010", "STATECODE", c("10","15"))

Multi-Elements

Creates time series plots for one or more elements. Multiple data elements for a single area of interest (e.g. county) appear on the same plot. If more than one area is specified, multiple stacked plots are generated. [In shiny, only the first three areas are selected now to make three multi-data-element time-series plots. ??? Will an full areas pick list be implemented???]

w.use <- wUseSample
areas <- c("Kent County","Sussex County")
area.column = "COUNTYNAME"
data.elements <- c("PS.GWPop","TP.TotPop")
multi_element_data(w.use, data.elements, area.column = area.column,areas = areas)

Setting plot.points=FALSE will generate bar plots.

w.use <- wUseSample
areas <- c("Kent County","Sussex County")
area.column = "COUNTYNAME"
data.elements <- c("PS.GWPop","TP.TotPop")
multi_element_data(w.use, data.elements, plot.points = FALSE,
      area.column = area.column, areas = areas)

Box Plots

Creates box plots for one or more data elements by year. The values of data elements for a selected group of areas (counties, huc8's, aquifers) in a given year form each box. Multiple data elements are rendered as stacked plots (one plot per element). [In shiny, box plots are made for up to two selected data elements at a time. Users can select areas (counties) in a given state in the pick list on the left to include in the sample. Notches that do not overlap imply statistically significant differences in median data-element values between years. Limits on the time period plotted on the x-axis are not available in version 1.0.0 at present. ]

w.use <- wUseSample
areas <- "15"
area.column = "STATECODE"
data.elements <- c("PS.GWPop","TP.TotPop")
boxplot_wu(w.use, data.elements, area.column = area.column,areas = areas)

If areas = NA then all areas are included in the plotted data. If log = TRUE the axis is changed to a log scale.

w.use <- wUseSample
areas <- "15"
area.column = "STATECODE"
data.elements <- c("PS.GWPop","TP.TotPop")
boxplot_wu(w.use, data.elements, area.column, areas=NA, log=TRUE)

Time series plots

A time series plot shows the value of a data element for all areas (counties, huc8's, or aquifers) or a subset of areas over time. A user has the option to plot using a bar graph or line graph, or point plot. A single data element is displayed per plot, but multiple plots are generated when multiple data elements are specified. [In shiny, a single selected data element is plotted for areas (counties) selected from the pick list on left. (Deselect all, then select desired counties and update graph by clicking "Click Here to Swith Areas.") Limits on the time period plotted on the x-axis and pick lists for huc8's and aquifers are not available in version 1.0.0 at present. ]

df <- wUseSample
areas <- c("Kent County","Sussex County")
area.column = "COUNTYNAME"
data.elements <- c("PS.GWPop","TP.TotPop")
time_series_data(w.use, data.elements, area.column = area.column,areas = areas)

Setting plot.points = FALSE will change the format to a bar plot.

df <- wUseSample
areas <- c("Kent County","Sussex County")
area.column = "COUNTYNAME"
data.elements <- c("PS.GWPop","TP.TotPop")
time_series_data(w.use, data.elements, plot.points = FALSE, area.column = area.column,areas = areas)

If years are not specified, all years in the dataset are plotted. In the example below, years are specified to limit the range of the horizontal axis.

df <- wUseSample
areas <- c("Hawaii County","Honolulu County","Kauai County")
area.column = "COUNTYNAME"
data.elements <- c("PS.GWPop","TP.TotPop")
time_series_data(w.use, data.elements, plot.points = FALSE,
       area.column = area.column, areas = areas, legend=TRUE, years = c("1995", "2000", "2005"))

The default range of the y-axis can be over-ridden by a user-specified range.

df <- wUseSample
areas <- c("Hawaii County","Honolulu County","Kauai County")
area.column = "COUNTYNAME"
data.elements <- c("PS.GWPop","TP.TotPop")
time_series_data(w.use, data.elements, area.column, areas = areas, plot.points = FALSE, y.scale = c(0,1000))

Bar Sums

This function is similar to the time series function, as it creates clustered bar charts for one or more water-use withdrawal data elements and years, and it will generate stacked bar plots comprising all data elements specified for each year. Multiple stacked or clustered bar charts are generated if more than one area (e.g. county) is specified (one plot per area) in RStudio. [In shiny, annual sums of withdrawals of 8 major categories (public supply, domestic, industrial, thermoelectric, mining, livestock, aquaculture, and irrigation) and total groundwater and surface-water withdrawals are calculated for a state, after setting all the NA's (missing values) to zero in the AWUDS dump file from NWIS. The category and GW/SW total annual state withdrawals are plotted in clusetered or stacked bar plots by years. Only state annual category and GW/SW totals are plotted over time in shiny in version 1.0.0. (Totals by huc8 or aquifer not available in version 1.0.0.)]

w.use <- wUseSample
areas <- c("New Castle County", "Kent County")
area.column = "COUNTYNAME"
data.elements <- c("PS.WTotl","CO.WTotl","DO.WTotl","IN.WTotl","PF.WTotl")
barchart_sums(w.use, data.elements, area.column = area.column,areas = areas)

The data elements can be unstacked by setting plot.stack = FALSE.

w.use <- wUseSample
areas <- c("New Castle County", "Kent County")
area.column = "COUNTYNAME"
data.elements <- c("PS.WTotl","CO.WTotl","DO.WTotl","IN.WTotl","PF.WTotl")
barchart_sums(w.use, data.elements, plot.stack = FALSE, area.column = area.column,areas = areas)

The default range of the y-axis scale can be over-ridden by the user with y.scale.

w.use <- wUseSample
areas <- c("New Castle County", "Kent County")
area.column = "COUNTYNAME"
data.elements <- c("PS.WTotl","CO.WTotl","DO.WTotl","IN.WTotl","PF.WTotl")
barchart_sums(w.use, data.elements, area.column, areas=areas, y.scale = c(0,500))

Ranked data element tables

Users can rank data elements using the shiny application. Once data is loaded, users can selected areas, years and data element types to display tables of values, which can be displayed in various ranking forms by clicking on the column name.

Choropleth

This function generates a map of counties or huc8's shaded on a scaled color gradient representing the values of a data element and year of interest. Light shading corresponds to low values, dark shading corresponds to high values.

w.use <- wUseSample
data.elements <- "PS.WFrTo"
year <- 2010 
areas <- "Hawaii"
ch.plot <- choropleth_plot(w.use, data.elements, year, areas)

The primary data element can be normalized to a secondary data element. The example below normalizes fresh water public supply by total population.

w.use <- wUseSample
data.elements <- "PS.WFrTo"
norm.element <- "PS.TOPop"
year <- 2010 
areas <- "Hawaii" 
ch.plot <- choropleth_plot(w.use, data.elements, year, areas, norm.element)

Disclaimer

Software created by USGS employees along with contractors and grantees (unless specific stipulations are made in a contract or grant award) are to be released as Public Domain and free of copyright or license. Contributions of software components such as specific algorithms to existing software licensed through a third party are encouraged, but those contributions should be annotated as freely available in the Public Domain wherever possible. If USGS software uses existing licensed components, those licenses must be adhered to and redistributed.

This software has been approved for release by the U.S. Geological Survey (USGS). Although the software has been subjected to rigorous review, the USGS reserves the right to update the software as needed pursuant to further analysis and review. No warranty, expressed or implied, is made by the USGS or the U.S. Government as to the functionality of the software and related material nor shall the fact of release constitute any such warranty. Furthermore, the software is released on condition that neither the USGS nor the U.S. Government shall be held liable for any damages resulting from its authorized or unauthorized use.



USGS-R/AWUDS documentation built on May 9, 2019, 6:05 p.m.