Geometry tables - read on-demand

Another vignette in this package sent us a file.

library(dplyr)
db <- src_sqlite("data-raw/maps_county.sqlite3", create = FALSE)
db

There are two tables in it, and luckily we know what to do.

tbl(db, "geom")

tbl(db, "metadata")

Let's choose "wyoming", with "big horn" and "johnson".

meta <- tbl(db, "metadata")

meta %>% filter(state == "wyoming", county %in% c("big horn", "johnson")) %>% 
  select(county, group) %>% inner_join(tbl(db, "geom"))

Now we have something we can work with.

Read on-demand

ggplot(
  meta %>% filter(state == "wyoming", county %in% c("big horn", "johnson")) %>% 
  select(county, group) %>% inner_join(tbl(db, "geom")) %>% 
    collect()

) + aes(x = long, y = lat, group = subregion, fill = subregion) + geom_polygon()

I'm interested now in the entire state.

ggplot(
  meta %>% filter(state == "wyoming") %>% 
  select(county, group) %>% inner_join(tbl(db, "geom")) %>% 
    collect() %>% group_by(subregion) %>% mutate(labx = mean(long), laby = mean(lat))

) + aes(x = long, y = lat, group = subregion, fill = subregion, label = subregion) + geom_polygon() + geom_path() + geom_text(aes(x = labx, y = laby)) +  
  scale_fill_grey(start = 0.3, guide = FALSE)

So, we have succesfully plotted our subset of polygons from a generic database containing two tables.

Can we go the full distance and create a Spatial object?

library(spbabel)
spwyoming <-   tbl(db, "metadata") %>% filter(state == "wyoming") %>% 
  select(county, group) %>% inner_join(tbl(db, "geom")) %>% 
    collect() %>% transmute(x_ = long, y_ = lat, branch_ = group, object_ = group, order_ = row_number(), island_ = TRUE, county = subregion, state = region) %>% sp(crs = "+proj=longlat +ellps=WGS84")

what did we get out the database?

geom <- tbl(db, "geom") %>% collect()
metadata <- tbl(db, "metadata") %>% collect()

nrow(geom)
nrow(metadata)

There are more than 85K rows for geometry, and 3000 rows for the metadata. We could save quite a bit by storing only the unique coordinates. There's also another opportunity, since we have to store the state many times for all out counties. Can't we store the states as an object, the county as a part, and store the name of the county on each county and the name of the state on each state? We ring our friend and tell them this great suggestion.

distinct(geom, long, lat)
distinct(metadata, state)

Obviously there's a lot of wasted coordinates here, so hopefully our friend figures out a more efficient storage in the db.

Simple features from normalized tables (mostly topology)



Try the spbabel package in your browser

Any scripts or data that you put into this service are public.

spbabel documentation built on March 31, 2023, 11:55 p.m.