library(nhdplusTools) local <- (Sys.getenv("BUILD_VIGNETTES") == "TRUE") if(local) { cache_path <- file.path(nhdplusTools_data_dir(), "data_v_cache") } else { cache_path <- tempdir() } knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.width=6, fig.height=4, eval=local, cache=local, cache.path=(cache_path) ) oldoption <- options(scipen = 9999)
Since it's original release in 2019, data access in nhdplusTools has evolved and grown considerably. This vignette, prepared for a workshop in Summer 2024 shows the diversity of data available with nhdplusTools in one holistic overview.
The function naming in nhdplusTools
related to data access uses three common terms:
1. "download" will generally download complete data sets for use locally.
2. "get" will pull a data subset from a web service.
3. "subset" will pull a subset from a local file and, when available, a web service.
The following summarizes specific data access functions available for the datasets that nhdplusTools
has supporting functionality for.
download_nhdplusv2()
- downloads a National seamless geodatabase get_nhdplus()
- access NHDPlusV2 from a web servicesubset_nhdplus()
- subsets the National geodatabase OR the web servicesubset_rpu()
- subsets loaded NHDPlusV2 by raster processing unitsubset_vpu()
- subsets loaded NHDPlusV2 by vector processing unitget_boundaries()
- retrieves NHDPlusV2 processing unit boundariesdownload_nhdplushr()
- downloads staged geodatabasesget_nhdplushr()
- assembles nhdplushr datasets from many staged databasesget_hr_data()
- helps pull data out of staged databases (used by get_nhdplushr()
)download_nhd()
- downloads staged geodatabasesget_3dhp
- access 3DHP from a web serviceget_geoconnex_reference()
- access geoconnex reference features from a web servicedownload_wbd()
- download a National WBD geodatabaseget_huc()
- get a particular Hydrologic Unit Code from a servicedownload_rf1()
- download a National RF1 geodatabaseThe Network Linked Data Index plays a key role in discovery nhdplusTools
. It provides access to an easy network navigation and basin boundary delineation tool. Behind the scenes, the NLDI is based on the network of the NHDPlusV2 -- so has some special functionality related directly to that dataset. Functions that utilize the NLDI include:
subset_nhdplus()
plot_nhdplus()
map_nhdplus()
discover_nhdplus_id()
get_nldi_basin()
get_nldi_feature()
get_nldi_index()
navigate_network()
navigate_nldi()
get_split_catchment()
get_raindrop_trace()
In the demo below, we'll choose a stream gage and show how to access data related to it from all the datasets that nhdplusTools works with as well as use some of the NLDI functionality in question.
For the sake of demonstration, we'll look at the Wolf River at Langlade, WI -- site ID "04074950".
nhdplusTools allows us to start from a stream gage to build a subset of NHDPlusV2 data. Below, we'll do just that and plot the results on a default map.
demo_dir <- nhdplusTools_data_dir() site_id <- "USGS-04074950" # use dataRetrieval::get_nldi_sources() to find other nldi sources site <- list(featureSource = "nwissite", featureID = "USGS-04074950") site_feature <- get_nldi_feature(site) upstream_network <- navigate_nldi(site, mode = "UT", distance_km = 9999) demo_data <- file.path(demo_dir, "data_demo.gpkg") dataset <- subset_nhdplus(as.integer(upstream_network$UT_flowlines$nhdplus_comid), nhdplus_data = "download", # download from a service output_file = demo_data, # write the data to disk return_data = TRUE, # return the data rather flowline_only = FALSE, overwrite = TRUE) names(dataset) sapply(dataset, nrow) old_par <- par(mar = c(0, 0, 0, 0)) plot_nhdplus(outlets = list(featureSource = "nwissite", featureID = "USGS-04074950"), nhdplus_data = demo_data, flowline_only = TRUE)
The above is the original way NHDPlusTools supported access NHDPlusV2 data. A dedicated web service subset utility is available in get_nhdplus
-- which is what subset_nhdplus()
calls behind the scenes.
Here we grab the basin for our site and request NHDPlus with its geometry as the Area of Interest.
basin <- get_nldi_basin(site) subset <- get_nhdplus(AOI = basin, realization = "flowline") par(mar = c(0, 0, 0, 0)) plot(sf::st_geometry(basin)) plot(sf::st_geometry(subset), col = "blue", add = TRUE)
For NHDPlusHR data, which is much denser than NHDPlusV2, nhdplusTools
supports downloading four-digit Hydrologic Unit Code staged geodatabases. The function get_huc()
is useful to discover the code needed here.
wolf_huc <- get_huc(basin, type = 'huc04') nrow(wolf_huc) # it straddles hucs? Not really. par(mar = c(0, 0, 0, 0)) plot(sf::st_geometry(basin), col = "grey") plot(sf::st_geometry(wolf_huc), add = TRUE) wolf_huc <- get_huc(site_feature, type = "huc04") nrow(wolf_huc) # better!! par(mar = c(0, 0, 0, 0)) plot(sf::st_geometry(wolf_huc)) plot(sf::st_geometry(basin), col = "grey", add = TRUE)
outdir <- file.path(nhdplusTools_data_dir(), "hr_access_demo") dir.create(outdir) download_dir <- download_nhdplushr(outdir, wolf_huc$huc4) list.files(download_dir)
If we had asked for more HUC4 codes, additional gdb files would be in the directory we specified.
With this, we can use one of the two functions for access NHDPlusHR data to load data from the directory. Here, we use the more complete get_nhdplus()
and set check_terminals=TRUE
which uses make_standalone()
to ensure that the nhdplus attributes are complete and self-consistent within the subset of data returned.
nhdplushr <- get_nhdplushr( download_dir, layers = c("NHDFlowline", "NHDPlusCatchment", "NHDWaterbody", "NHDArea", "NHDLine", "NHDPlusSink", "NHDPlusWall", "NHDPoint", "NHDPlusBurnWaterbody", "NHDPlusBurnLineEvent", "HYDRO_NET_Junctions", "WBDHU2", "WBDHU4","WBDHU6", "WBDHU8", "WBDHU10", "WBDHU12", "WBDLine"), check_terminals = TRUE) sapply(nhdplushr, nrow)
At a lower level, we can use get_hr_data()
to access particular layers. Here, rename=TRUE
causes the nhdplushr names to be normalized to nhdplusTools
conventions using align_nhdplus_names()
.
NOTE: "NHDPlusID" from nhdplushr is replaced with the name "COMID". This attribute is merely a unique integer identifier and should not be assumed to relate to anything outside the context of a given dataset.
nhdplushr <- get_hr_data(list.files(download_dir, pattern = ".gdb", full.names = TRUE), layer = "NHDFlowline", rename = TRUE) names(nhdplushr) # Great Lakes coast are flowlines -- remove for visuals gl_coast <- c(get_DM(nhdplushr, 60002700000331), get_DM(nhdplushr, 60002700049157)) plot_data <- dplyr::filter(nhdplushr, FCODE != 56600 & StreamOrde > 2 & !COMID %in% gl_coast) par(mar = c(0, 0, 0, 0)) plot(sf::st_geometry(wolf_huc)) plot(sf::st_geometry(basin), col = "grey", add = TRUE) plot(sf::st_geometry(plot_data), lwd = plot_data$StreamOrde / 3, col = "blue", add = TRUE) par(mar = c(0, 0, 0, 0)) plot(sf::st_geometry(basin), col = "grey") plot(sf::st_geometry(plot_data), lwd = plot_data$StreamOrde / 2, col = "blue", add = TRUE)
BONUS DEMO: Say we want to know where our stream gage is on the river in question...
get_flowline_index()
and disambiguate_flowline_indexes()
are your friends.
potential_matches <- get_flowline_index(nhdplushr, points = site_feature, search_radius = units::as_units(500, "m"), max_matches = 5) potential_matches site_meta <- dataRetrieval::readNWISsite(gsub("USGS-", "", site_feature$identifier)) sqmi_to_sqkm <- 2.59 da_df <- data.frame(id = 1, drainagearea = site_meta$drain_area_va * sqmi_to_sqkm) # uses nearest drainage area match disambiguate_flowline_indexes(potential_matches, flowpath = dplyr::select(nhdplushr, COMID, TotDASqKM), hydro_location = da_df)
If what you really need is the base NHD, which is now a static dataset, the pattern is just the same as with nhdplushr using download_nhd()
.
outdir <- file.path(nhdplusTools_data_dir(), "nhd_access_demo") dir.create(outdir) download_dir <- download_nhd(outdir, wolf_huc$huc4) list.files(download_dir) nhd_gdb <- list.files(download_dir, pattern = ".gdb", full.names = TRUE) sf::st_layers(nhd_gdb) nhd_fline <- sf::read_sf(nhd_gdb, "NHDFlowline")
We'll wait to plot this up until after we've done some work with 3DHP. As of writing, the 3DHP is more or less the same as the NHD but in a new database schema. The get_3dhp()
uses a web service to pull data for subsets much the same as the other get_*
functions. See vignette("get_3dhp_data.Rmd")
for more on how to work with 3DHP data.
sub_3dhp <- get_3dhp(basin, type = "flowline") plot(sub_3dhp) sub_3dhp <- st_compatibalize(sub_3dhp, nhd_fline) par(mar = c(0, 0, 0, 0)) plot(sf::st_geometry(basin), col = "grey") plot(sf::st_geometry(sf::st_zm(sub_3dhp)), col = "skyblue", lwd = 2.5, add = TRUE) plot(sf::st_geometry(sf::st_zm(nhd_fline)), col = "blue", lwd = 0.1, add = TRUE)
A brand new, May 2024, feature in nhdplusTools is access to the geoconnex.us reference feature server. The reference feature server provides easy access to representations of datasets that people use to cross reference other data. discover_geoconnex_reference()
provides access to a full table of what's available.
unique(discover_geoconnex_reference()[c("id", "title")])
Each of these sets of "reference features" includes a "uri" which is a persistent way to identify these features and will always lead you back to a representation of the feature when you look it up.
For example, let's get a subset of data for our Wolf River basin.
wolf_gages <- get_geoconnex_reference(basin, type = "gages") geoconnex_gage <- dplyr::filter(wolf_gages, provider_id == gsub("USGS-", "", site_feature$identifier)) wolf_mainstems <- get_geoconnex_reference(basin, type = "mainstems") wolf_mainstem <- dplyr::filter(wolf_mainstems, uri == geoconnex_gage$mainstem_uri) par(mar = c(0, 0, 0, 0)) plot(sf::st_geometry(basin), col = "grey") plot(sf::st_geometry(wolf_mainstems), lwd = 0.5, col = "blue", add = TRUE) plot(sf::st_geometry(wolf_mainstem), lwd = 3, col = "blue", add = TRUE) plot(sf::st_geometry(wolf_gages), pch = 2, cex = 0.75, add = TRUE) plot(sf::st_geometry(geoconnex_gage), pch = 17, cex = 1.5, add = TRUE)
The Watershed Boundary Dataset contains hydrologic units that are used as cataloging units for the country. It was developed and improved over the last twenty years and is nearing a complete static state. There are three prominent versions of it available -- a snapshot available as part of the NHDPlusV2, a snapshot available as part of \doi{10.5066/P92U7ZUT} and retrieved with get_huc()
, and the latest (soon to be final) snapshot available using download_wbd()
.
wbd_dir <- file.path(nhdplusTools_data_dir(), "wbd_access_demo") wbd_out <- download_wbd(wbd_dir) # zip::unzip doesn't always work if(length(wbd_out == 0)) { f <- list.files(wbd_dir, pattern = ".zip", full.names = TRUE) utils::unzip(f, exdir = wbd_dir) } wbd_gdb <- list.files(wbd_dir, pattern = ".gdb", full.names = TRUE) sf::st_layers(wbd_gdb)
options(oldoption) par(old_par) if(Sys.getenv("BUILD_VIGNETTES") != "TRUE") { unlink(work_dir, recursive = TRUE) }
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.