ClockBoard: a zoning system for urban analysis

knitr::opts_chunk$set(
  eval = FALSE,
  echo = FALSE,
  collapse = TRUE,
  comment = "#>",
  message = FALSE
)
library(zonebuilder)
library(tmap)
library(dplyr)
# To uses josis template:
remotes::install_github("robinlovelace/rticles", ref = "josis")
refs = RefManageR::ReadZotero(group = "418217", .params = list(collection = "8S8LR8TK", limit = 100))
RefManageR::WriteBib(refs, "vignettes/references.bib")
# Set-up notes for pdf version
# convert .tex to .md
# system("pandoc -s -r latex paper.tex -o paper.md")
# # copy josis specific files and ignore them
# f = list.files("~/other-repos/rticles/inst/rmarkdown/templates/josis/skeleton", pattern = "josis", full.names = TRUE)
# file.copy(f, "vignettes")
rmarkdown::render(input = "vignettes/paper.Rmd", output_file = "../zonebuilder-paper.pdf")
browseURL("zonebuilder-paper.pdf")
piggyback::pb_upload("zonebuilder-paper.pdf")

Introduction

Zoning systems have long been used for practical purposes. They have been integral to land ownership, rents and urban policies for centuries, forming the basis of a range of social and economic practices. Historical examples highlighting the importance of zone layouts include 'tithe maps' determining land ownership and taxes in 18th Century England [@bryant_worcestershire_2007] and legally defined urban land use zones to tame chaotic urban growth expansion in the exploding US cities in the early 1900s [@baker_zoning_1925].

In the 19th Century, zoning systems became known for political reasons, with 'gerrymandering' entering public discourse and academic research following Elbridge Gerry's apparent attempt to gain political advantage by creating an electoral district in an odd shape that was said to resemble a salamander (hence the term's name) in 1812 [@orr_persistence_1969]. The gerrymandering problem has since been the topic of countless academic papers.

The gerrymandering problem (in itself is a manifestation of the modifiable area unit problem) can be described as a mathematical optimization problem: "$n$ units are grouped into $k$ zones such that some cost function is optimized, subject to constraints on the topology of the zones" [@chou_taming_2006]. In fact, this problem is concise definition of the broader "zoning problem" that starts from the assumption that zones are to be composed of one or more basic statistical units (BSUs). Although the range of outcomes is a finite combinatorial optimisation problem (which combination of BSU-zone aggregations satisfy/optimise some pre-determined criteria) the problem is still hard: "there are a tremendously large number of alternative partitions, a similar number of different results, and only a slightly smaller number of different interpretations" [@openshaw_optimal_1977].

The problem that we tackle in this paper is different, however: it is the division of geographic space into zones starting from a blank slate, without reference to pre-existing areal units. The focus of much preceding zoning research on BSU partitioning can be explained by the fact that much geographic data available to academics comes in 'pre-packaged' small areas and because creating zones from nothing is a harder problem. We disagree with the statement that "existence of individual or non-spatially aggregated data is rare in geography" [@openshaw_optimal_1977], pointing to car crashes, shop locations, species identification data and dozens of other phenomena that can be understood as 'point pattern processes'. And with advances in computer hardware and software, the 'starting from scratch' approach to zoning system is more feasible.

A number of approaches have tackled the question of how to best divide up geographical space for analysis and visualisation purposes, with a variety of applications. Functional zone classification is common in the field of remote sensing and associated sub-fields involved in analysing and classifying raster datasets [@ciglic_evaluating_2019; @hesselbarth_landscapemetrics_2019]. While such pixel-based approaches can yield complex and flexible results (depending on the geographic resolution of the input data), they are still constrained by the building blocks of the pixels, which can be seen as a particular type of areal unit, a uniformly sized and shaped BSU.

In this paper we are interested in the division of continuous space into completely new areal systems. This has been done using contour lines to represent lines of equal height, and the concept's generalisation to lines of equal journey time from locations (isochrones) [@long_modeling_2018], population density (isopleths) [@lin_cartographic_2017] and model parameters which continuous geographical space [@paez_exploring_2006]. The boundaries created by these various 'iso' maps are 'procedurally generated' areal units of the type that this paper focuses, but their variability and often irregular shapes make them impractical for many types of urban analysis.

Procedural generation, which involves the generation of data through a repeated and sometimes randomised computational process has long been used to represent physical phenomena [@onrust_ecologically_2017]. The approach has been used to generate spatial entities including roads [@galin_procedural_2010], indoor layouts of buildings [@anderson_augmented_2018] and urban layouts [@mustafa_procedural_2020]. Algorithms have also been developed to place linear features on a map, as illustrated by an algorithm that optimizes the placement of overlapping linear features for cartographic visualisation [@teulade-denantes_routes_2015]. However, no previous research has demonstrated the creation of zoning systems specifically for the purposes of urban analysis.

New visualisation techniques are needed to represent new (or newly quantifiable) concepts and emerging datasets (such as OpenStreetMap) in urban analysis analysis . The visualisation of direction has been driven by new navigational requirements and datasets, with circular compasses and displays common in land and sea navigational systems since the mid 1900s [@honick_pictorial_1967]. Circular visualisation techniques, in the form of rose diagrams, were used in a more recent study to indicate the most common road directions relative to North [@boeing_spatial_2021]. The resulting visualisations are attractive and easy to interpret, but are not geographical, in the sense that they cannot meaningfully be overlaid on mapped data. The approach we present in this paper is more closely analogous to 'grid sample' approaches used in ecological and population research [@hirzel_which_2002] . Historically, environmental researchers have used rectangular (and usually square) grids to divide up space and decide sampling strategies. Limitations associated with this simplistic strategy have been documented since at least the 1960s, with a prominent paper on geographic sampling strategies outlining advantages and disadvantages of simple random, systematic and stratified sampling techniques in 1967 [@holmes_problems_1967]. Starting with data at the level of raster grid cells and BSUs, a related approach is to sample from within available 'pixels' to generate a representative sample [@thomson_gridsample_2017].

Unlike BSU based zoning systems, grid sampling strategies require no prior zones. Unlike 'procedurally generated' areas, grid-based strategies generate areal units of consistent sizes and shapes. However, grid-based strategies are limited in their applicability to urban research because they seldom generate geographically contiguous results and do not account for the strong tendency of human settlements to have a (more-or-less clearly demarcated) central location with higher levels of activity.

Pre-existing zoning systems are often based on administrative regions. Although those zoning systems are usually in line with the hierarchical organization structure of governmental organizations, and therefore may work well for policy making, there are a couple of downsides to using such zoning systems. First of all, since a city and its politics change over time, the administrative regions often change accordingly. This make it harder to do time series analysis. Since the administrative regions have heterogeneous characteristics, for instance population size, area size, proximity to the city centre, comparing different administrative regions within a city is not straightforward. Moreover, comparing administrative regions across cities is even more challenging since average scale of an administrative region may vary a lot across cities.

Grid tiles are popular in spatial statistics for a number of reasons. Most importantly the tiles have a constant area size, which makes comparably possible. Moreover, the grid tiles will not change over time like administrative regions. However, one downside is that a grid requires a coordinate reference system (CRS), enforcing (approximately) equal area size. For continents or large countries, a CRS is always a compromise. Therefore, the areas of the tiles may vary, or the shape of the tiles may be sheared or warped.

Another downside from a statistical point of view is that population densities are not uniform within a urban area, but concentrated around a centre. As a consequence, high resolution statistics is preferable in the dense areas, i.e. the centre, and lower resolution statistics in other parts of the city. That is the reason why administrative regions are often smaller in dense areas.

The approach presented in this paper aims to minimise input data requirements, generate consistent zones comparable between widely varying urban systems, and provide geographically contiguous areal units. The motivations for generating a new zoning system and use cases envisioned include:

The paper is structured as follows. The next section outlines the approach, which requires only 2 inputs: the coordinates of the central place in the urban system under investigation, and the minimum radius from that central point that the zoning system should extend. Section 3 describes a number of potential applications, ranging from rudimentary navigation and location identification to mobility analysis. Finally, in Section 4, we discuss limitations of the approach and possible directions of research and development to generate additional zoning systems for urban analysis.

The ClockBoard zoning system

The aim of the ClockBoard zoning system outlined in this paper is to tackle the issues outlined above, with an approach that is free, open, reproducible and easy to extend. Specifically, we developed the system to considering urban analysis research and visualisation requirements, leading to the following high-level criteria. Zoning systems for urban analysis should:

After a process of iteration in which we considered many zoning options (some of which are illustrated in Figure \@ref(fig:options)) we settled on a system that we have called 'ClockBoard' (for reasons that will become apparent) and which has the following characteristics. The zoning system is based on concentric rings --- formally called 'concentric annuli' --- which emphasise central locations and have been used to explore the relationships between the characteristics of 'focal trees' and surrounding trees in ecological research [@wills_persistence_2016], as shown in Figure \@ref(fig:options) (a). The zoning system is based on segments defined by radial lines emanating from the central point of the settlement (or other geographic entity) to be divided into zones, as shown in Figure \@ref(fig:options) (b). From that point, we experimented with a range of ways of dividing the concentric annuli into different zones by modifying the distances between rings (the annuli borders) and the number of segments per annulus (see Figure \@ref(fig:options) c). It became apparent that zoning systems based on the two organising principles (and modifiable parameters) of concentric annuli and segments held promise, but selecting appropriate settings for each was key to the development of a zoning system that would meet the criteria outlined above.

# z1 = zb_zone(x = london_c(), n_segments = 1)
# m1 = qtm(z1, title = "(A) Concentric Annuli")
# sf::sf_use_s2(use_s2 = FALSE)
z1 = zb_zone(london_c(), n_segments = 1, distance_growth = 0)
z1_areas = sf::st_area(z1)
z1_areas_relative = as.numeric(z1_areas / z1_areas[1])
zb_plot(z1, title = "(A) Concentric Annuli")
zb_plot(zb_segment(london_c(), n_segments = 12), title = "(B) Clock segments")
qtm(zb_zone(london_c(), n_segments = z1_areas_relative, labeling = "clock", distance_growth = 0), title = "(C) Equal area zones", fill = NULL)

Annuli distances

The radius of each annuli in the zoning system can be incremented by a fixed amount, as shown in previous figures. In cases where high geographic resolution is important near the centre of the study region, such as when designing transport systems into the central zone of a city planning, increasing distances between each radius may be desirable. We experimented with various ways of incrementing the annuli width and suggest linear increases in width as a sensible default for a simple zoning system. This linear growth leads to distances between each annuli boundary increasing in line with the steps in the triangular number sequence [@ross_dicuil_2019], as outlined in Table \@ref(tab:t1).

txt = "Number of rings,Diameter across (km),Area (sq km)
1,1,2,3.14
2,2,6,28.27
3,3,12,113.10
4,4,20,314.16
5,5,30,706.86
6,6,42,1385.44
7,7,56,2463.01
8,8,72,4071.50
9,9,90,6361.73"
t1 = read.csv(text = txt, check.names = FALSE)
knitr::kable(t1, booktabs = TRUE, caption = "Key attributes of first 9 rings used in the ClockBoard zoning system.")

Number of segments

What it looks like with 4 segments. The ClockBoard zoning system has 12 segments, representing a compromise between specificity of zone identification and ease of comprehension (imagine a system with 256 segments and saying "I'm in zone E173"!) and understanding: the 12 segments of a clock face are well understood.

City extents

Applications

Navigation and location

Exploring city scale data

univariate description

- Population density in London - Social (e.g. religion) and demographic distributions

Inter-city statistical comparison

# download preprocessed data (processing script /data-raw/crashes.R)
df = readRDS(gzcon(url("https://github.com/zonebuilders/zonebuilder/releases/download/0.0.1/ksi_bkm_zone.rds")))
uk = readRDS(gzcon(url("https://github.com/zonebuilders/zonebuilder/releases/download/0.0.1/uk.rds")))
thames = readRDS(gzcon(url("https://github.com/zonebuilders/zonebuilder/releases/download/0.0.1/thames.rds")))

# filter: set zones with less than 10,000 km of cycling per yer to NA
df_filtered = df %>% 
  mutate(ksi_bkm = ifelse((bkm_yr * 1e09) < 2e04, NA, ksi_bkm))

tmap_mode("plot")
tm_shape(uk) +
 tm_fill(col = "white") +
tm_shape(df_filtered, is.master = TRUE) +
  tm_polygons("ksi_bkm", breaks = c(0, 1000, 2500, 5000, 7500, 12500), textNA = "Too little cycling", title = "Killed and seriously injured cyclists\nper billion cycled kilometers") +
  tm_facets(by = "city", ncol=4) +
  tm_shape(uk) +
  tm_borders(lwd = 1, col = "black", lty = 3) +
  tm_shape(thames) +
  tm_lines(lwd = 1, col = "black", lty = 3) +
  tm_layout(bg.color = "lightblue")

Mobility analysis

Discussion and conclusions

Pros:

Cons:

References



Try the zonebuilder package in your browser

Any scripts or data that you put into this service are public.

zonebuilder documentation built on July 13, 2021, 1:06 a.m.