knitr::opts_chunk$set( collapse = TRUE, comment = "#>", echo = TRUE, warning=FALSE, message=FALSE ) devtools::load_all()
Here are some basic comparisons between the intersectr
and the proposed zonal
workflows. zonal
was written to support ngen
catchment characteristic creation. Where it excels in in the ability to (1) execute on larger data sets with more intersection units in less time using less memory, (2) provider a simpler workflow (3) the ability to handle categorical data and (4) the ability to handel tif data.
Both workflows follow the same 3 steps of:
library(dplyr) library(data.table) library(RNetCDF) library(sf) library(ncmeta) library(zonal) library(intersectr)
The gridded data and aggregate units we are working with can be seen below:
file = '/Users/mjohnson/Downloads/pet_1979.nc' (meta = nc_dims(file))
The grid we are working with holds daily PET values for CONUS for the year 1979. It has:
- r prettyNum(filter(meta, name == 'lon')$length, big.mark = ",")
X_cells
- r prettyNum(filter(meta, name == 'lat')$length, big.mark = ",")
Y_cells
- r prettyNum(filter(meta, name == 'day')$length, big.mark = ",")
time slices
for a total of r prettyNum(prod(meta$length), big.mark = ",")
values.
geom <- read_sf('/Users/mjohnson/github/hydrofabric/workflow/nhd_workflows/cache/ngen_01a-4.gpkg', "catchments") %>% st_make_valid() glimpse(geom)
In total we have r prettyNum(nrow(geom), big.mark = ",")
aggregation units to summarize over the r filter(meta, name == 'day')$length
time steps.
The cell geometry workflow for intersectr
is wrapped into a single function and compared to the analogous function build_grid()
.
bnch <- bench::mark( iterations = 1, check = FALSE, intersectr = intersectr_cell_geom(file, geom), zonal = build_grid(file, geom) )
cell_geometry = intersectr_cell_geom(file, geom) grid = build_grid(file, geom) bnch %>% dplyr::select(expression, median, mem_alloc) %>% mutate(expression = names(expression), median_rel = unclass(median/min(median)), mem_rel = unclass(mem_alloc/min(mem_alloc))) %>% formattable::formattable()
The areal weights workflow for intersectr
is wrapped into a single function and compared to the analogous function in build_weights
.
bnch2 <- bench::mark( iterations = 1, check = FALSE, intersectr = intersectr_weights(cell_geometry , geom), zonal = build_weights(grid, geom, "comid") )
area_weights = intersectr_weights(cell_geometry , geom) w = build_weights(grid, geom, "comid") bnch2 %>% dplyr::select(expression, median, mem_alloc) %>% mutate(expression = names(expression), median_rel = unclass(median/min(median)), mem_rel = unclass(mem_alloc/min(mem_alloc))) %>% formattable::formattable()
The execution workflow for intersectr
is wrapped into a single function and compared to the analogous function in execute
.
bnch3 <- bench::mark( iterations = 1, check = FALSE, intersectr = intersectr_execute(file, "potential_evapotranspiration", cell_geometry, area_weights), zonal = execute(file, w) )
intersectr = intersectr_execute(file, "potential_evapotranspiration", cell_geometry, area_weights) zonal = execute(file, w) bnch3 %>% dplyr::select(expression, median, mem_alloc) %>% mutate(expression = names(expression), median_rel = unclass(median/min(median)), mem_rel = unclass(mem_alloc/min(mem_alloc))) %>% formattable::formattable()
intersectr
and zonal
output data in a different structures, intersectr
places time steps as rows, and zonal favors units as rows. The latter is easier to work with when the number of units > then the time steps while the former favor's succinct time series plotting. To make sure we get the same results, we transpose the zonal data to match the structure of the intersectr
.
# Intersectr dim(intersectr) # Zonal dim(zonal) system.time({ zonal2 = transpose(zonal, keep.names = "time_stamp", make.names = "comid") }) dim(zonal2)
And test a random catchment...
{plot(intersectr$`857`, zonal2$`857`, pch = 16, cex = .75, xlab = "intersectr (857)", ylab = "zonal (857)", main = "Comparison") abline(0,1, col = "red") }
Great they match r emo::ji('tada')
Above we seeintersectr
retained 16 catchments that zonal
lost To isolate these we can identify those not shared in each set and then map and check the results.
library(leaflet) (n = names(intersectr)[which(!names(intersectr) %in% names(zonal2))]) leaflet() %>% addTiles() %>% addPolygons(data = st_transform(filter(geom, comid %in% n ), 4326)) formattable::formattable(intersectr[1:5,n])
So these catchments are outside the GridMet domain and thus return NaN. Losing them is perfectly OK....
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.