knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.path = "README-" )
This is an unofficial client for the GOV.UK Registers API.
Registers are authoritative lists of things, built and maintained by the UK
government. For example, the country register is a list of countries.
It doesn't really wrap the API. Instead, it downloads the 'raw' registers in RSF (Register Serialisation Format -- not yet documented publicly), and parses that.
# install.packages("devtools") # if you don't already have devtools installed
devtools::install_github("nacnudus/registr")
To install a very early version of the package for running old scripts:
devtools::install_github("nacnudus/registr", ref = "08c42c95bc65a0cb8131100416a372660b8a1bd5")
library(registr)
Download a single register.
country <- rr_register("country")
Download all registers.
registers <- rr_registers(quiet = TRUE)
names(registers)
By default, the 'beta' ('ready to use') versions of registers are downloaded.
If you need alpha ('open for feedback') registers, use phase = "alpha".
The schema and data are in $schema and $data.
country <- registers$country
country$schema
country$data
You probably want to take a snapshot first. A snapshot keeps the latest version of the schema and the latest version of each record (e.g. the most recent name of a country).
country$schema$custodian
rr_snapshot(country)$schema$custodian
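To illustrate what "the latest version of each record" means, here is a minimal base-R sketch. It does not use the package's internals: the `entries` data frame, its column names (`entry_number`, `key`, `name`) and the helper `latest_entries()` are all hypothetical, chosen only to show the idea of keeping the newest entry per key.

```r
# Hypothetical entries: record "CZ" has two versions, record "GB" has one.
entries <- data.frame(
  entry_number = c(1L, 2L, 3L),
  key          = c("CZ", "GB", "CZ"),
  name         = c("Czech Republic", "United Kingdom", "Czechia"),
  stringsAsFactors = FALSE
)

# Keep, for each key, the entry with the highest entry number.
latest_entries <- function(x) {
  x <- x[order(x$key, -x$entry_number), ]  # newest entry first within each key
  x <- x[!duplicated(x$key), ]             # drop the older entries
  rownames(x) <- NULL
  x
}

snapshot <- latest_entries(entries)
snapshot$name
```

In this toy data the older "Czech Republic" entry is dropped, which is the kind of change a snapshot hides from you.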
Each field of each entry can contain more than one value, if the field has the
property cardinality = 'n'. In this case, the field is a list-column, where
each value is a vector of values.
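For readers unfamiliar with list-columns, the following base-R sketch shows the shape of such a field. The data frame `raw`, its column names, and the use of ";" as the separator between values are assumptions made for illustration, not a description of how the package parses registers.

```r
# Hypothetical raw data: the 'tags_raw' field may hold several values.
raw <- data.frame(
  key      = c("a", "b", "c"),
  tags_raw = c("x;y", "z", ""),
  stringsAsFactors = FALSE
)

# Split each string into a character vector; the result is a list-column
# with one vector per row (possibly empty).
raw$tags <- strsplit(raw$tags_raw, ";", fixed = TRUE)

is.list(raw$tags)   # the column is a list, not an atomic vector
lengths(raw$tags)   # number of values held by each row
```

Each element of `raw$tags` can then be processed with `lapply()` or unnested with a package such as tidyr.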
Registers link in two ways.
First, a field can have the "register" property set in the schema. This is like a
foreign/primary key relationship in a relational database.
Second, a value can be written in the CURIE-like form "prefix:reference", where
prefix is the name of a register, and reference is a value in the primary field
of that register (the field with the same name as the register).
rr_links(registers$`statistical-geography`)
rr_key_links(registers$`statistical-geography`)
rr_snapshot(registers$`statistical-geography`)$schema$names
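As a rough illustration of the "prefix:reference" form, the base-R sketch below splits such values into their two parts. The helper `parse_curie()` and the example values are hypothetical and are not part of the package.

```r
# Split "prefix:reference" strings at the first colon.
# Values without a colon get an NA prefix and are kept whole as the reference.
parse_curie <- function(x) {
  pos <- as.integer(regexpr(":", x, fixed = TRUE))  # -1 when there is no colon
  has_prefix <- pos > 0L
  data.frame(
    prefix    = ifelse(has_prefix, substr(x, 1L, pos - 1L), NA_character_),
    reference = ifelse(has_prefix, substr(x, pos + 1L, nchar(x)), x),
    stringsAsFactors = FALSE
  )
}

links <- parse_curie(c("country:GB", "territory:GI", "E92"))
links
```

The prefix tells you which register to look in, and the reference is the key to look up there.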
Resolve links with the rr_resolve_*() family of functions. Because links
refer to whole records, whole records are returned in a list-column of data
frames.
If a matching record has multiple entries, every entry is returned in a multi-row data frame.
If a linking field is cardinality = 'n', a list of data frames is returned.
rr_resolve_links(registers$`statistical-geography`, registers)$data
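The shape of a resolved link can be sketched in base R as follows. This is only an illustration of "a list-column of data frames", not the package's implementation; the two data frames and their column names are hypothetical.

```r
# Hypothetical target register: record "CZ" has two entries.
country <- data.frame(
  country = c("CZ", "CZ", "GB"),
  name    = c("Czech Republic", "Czechia", "United Kingdom"),
  stringsAsFactors = FALSE
)

# Hypothetical linking register: each row refers to a country record.
places <- data.frame(
  place   = c("Prague", "London", "Atlantis"),
  country = c("CZ", "GB", "XX"),
  stringsAsFactors = FALSE
)

# For each reference, collect every matching entry as its own data frame.
places$country_resolved <- lapply(
  places$country,
  function(ref) country[country$country == ref, , drop = FALSE]
)

# Number of entries found for each link.
vapply(places$country_resolved, nrow, integer(1))
```

A record with several entries yields a multi-row data frame, and a reference with no match yields a zero-row one.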
You can resolve to only the latest entry of each record by creating a
registers object with snapshots.
registers_snapshot <- purrr::map(registers, rr_snapshot)
rr_resolve_links(registers$`statistical-geography`, registers_snapshot)$data
Plot the links between registers with something like the ggraph package.
library(tidygraph)
library(ggraph)

registers$`statistical-geography` %>%
  rr_links() %>%
  as_tbl_graph() %>%
  ggraph(layout = "nicely") +
  geom_edge_fan(aes(alpha = ..index..), show.legend = FALSE) +
  geom_edge_loop() +
  geom_node_label(aes(label = name)) +
  theme_void()

edge_arrow <- arrow(length = unit(4, "mm"), type = "closed")

registers %>%
  rr_links() %>%
  dplyr::distinct(from, to, type) %>%
  as_tbl_graph() %>%
  ggraph(layout = "nicely") +
  geom_node_point() +
  geom_edge_fan(aes(colour = type), arrow = edge_arrow, end_cap = circle(2, 'mm')) +
  geom_edge_loop(aes(colour = type), arrow = edge_arrow, end_cap = circle(2, 'mm')) +
  geom_node_label(aes(label = name), repel = TRUE, alpha = .5) +
  theme_void()
You can index registers by any column, using CURIE-like syntax.
country <- registers$country
rr_index(country, "start-date")
rr_index(country, "end-date")
rr_index(country)
rr_index(registers$`local-authority-eng`, "local-authority-type")