registr

This is an unofficial client for the GOV.UK Registers API.

Registers are authoritative lists of things, built and maintained by the UK government. For example, the country register is a list of countries.

It doesn't really wrap the API. Instead, it downloads the 'raw' registers in RSF (Register Serialisation Format -- not yet documented publicly), and parses that.

Installation

# install.packages("devtools") # if you don't already have devtools installed
devtools::install_github("nacnudus/registr")

To install a very early version of the package for running old scripts:

devtools::install_github("nacnudus/registr",
                         ref = "08c42c95bc65a0cb8131100416a372660b8a1bd5")

Examples

library(registr)

Download registers

Download a single register.

country <- rr_register("country")

Download all registers.

registers <- rr_registers(quiet = TRUE)
names(registers)

By default, the 'beta' ('ready to use') versions of registers are downloaded. If you need the alpha ('open for feedback') versions, use phase = "alpha".

Explore register schema and data

The schema and data are in $schema and $data.

country <- registers$country
country$schema
country$data

You probably want to take a snapshot first. A snapshot keeps the latest version of the schema and the latest entry of each record (e.g. the most recent name of a country).

country$schema$custodian
rr_snapshot(country)$schema$custodian
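
To make the idea of a snapshot concrete, here is a minimal base R sketch of "keep the latest entry of each record". It is not how rr_snapshot() is implemented; the data frame, its column names (entry_number, key, name) and the helper latest_entries() are all invented for illustration.

```r
# Toy entries: hypothetical columns and values, not the real register layout.
entries <- data.frame(
  entry_number = 1:4,
  key          = c("CZ", "DE", "CZ", "FR"),
  name         = c("Czech Republic", "Germany", "Czechia", "France"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep, for each key, the row with the highest
# entry number.
latest_entries <- function(x) {
  ordered <- x[order(x$key, -x$entry_number), ]
  snapshot <- ordered[!duplicated(ordered$key), ]
  rownames(snapshot) <- NULL
  snapshot
}

snapshot <- latest_entries(entries)
snapshot$name
```

In this toy data the key "CZ" has two entries, and only the later one survives in the snapshot.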

A field of an entry can contain more than one value if the field has the property cardinality = 'n'. In that case the field is a list-column, where each element is a vector of values.
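
If list-columns are unfamiliar, the following base R sketch shows what one looks like and one way to flatten it. The data frame and its column names (key, organisations) are made up for this example and do not come from any real register.

```r
# Toy data frame with a list-column; names and values are invented.
areas <- data.frame(key = c("A", "B", "C"), stringsAsFactors = FALSE)
areas$organisations <- list(
  c("org-1", "org-2"),  # two values
  "org-3",              # one value
  character(0)          # no values
)

# Each element of the list-column is itself a vector.
lengths(areas$organisations)

# Flatten to one row per (key, value) pair using base R.
long <- data.frame(
  key          = rep(areas$key, lengths(areas$organisations)),
  organisation = unlist(areas$organisations),
  stringsAsFactors = FALSE
)
long
```

Rows whose list element is empty (key "C" here) drop out of the flattened result, which may or may not be what you want.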

Linked registers

Registers link in two ways.

rr_links(registers$`statistical-geography`)
rr_key_links(registers$`statistical-geography`)
rr_snapshot(registers$`statistical-geography`)$schema$names

Resolve links with the rr_resolve_*() family of functions. Because links refer to whole records, whole records are returned in a list-column of data frames.

If a matching record has multiple entries, every entry is returned in a multi-row data frame.

If a linking field has cardinality = 'n', a list of data frames is returned.

rr_resolve_links(registers$`statistical-geography`, registers)$data

You can resolve to only the latest entry of each record by creating a registers object with snapshots.

registers_snapshot <- purrr::map(registers, rr_snapshot)
rr_resolve_links(registers$`statistical-geography`, registers_snapshot)$data
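
The shape of a resolved link, a list-column whose elements are data frames, can be sketched in base R as follows. This only illustrates the output structure described above; it is not the rr_resolve_*() implementation, and the toy registers, column names and values are invented.

```r
# Toy registers; all names and values here are invented for illustration.
country_entries <- data.frame(
  key  = c("CZ", "CZ", "DE"),
  name = c("Czech Republic", "Czechia", "Germany"),
  stringsAsFactors = FALSE
)
territory <- data.frame(
  key     = c("T1", "T2"),
  country = c("CZ", "DE"),
  stringsAsFactors = FALSE
)

# For each linking value, collect every matching entry as a data frame.
territory$country_resolved <- lapply(
  territory$country,
  function(k) country_entries[country_entries$key == k, , drop = FALSE]
)

# One data frame per row; "CZ" matches two entries, "DE" matches one.
n_matches <- vapply(territory$country_resolved, nrow, integer(1))
n_matches
```

Resolving against snapshots instead would, in this toy setting, leave exactly one row in each nested data frame.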

Plot the links between registers with something like the ggraph package.

library(tidygraph)
library(ggraph)

registers$`statistical-geography` %>%
  rr_links() %>%
  as_tbl_graph() %>%
  ggraph(layout = "nicely") +
    geom_edge_fan(aes(alpha = after_stat(index)), show.legend = FALSE) +
    geom_edge_loop() +
    geom_node_label(aes(label = name)) +
    theme_void()

edge_arrow <- arrow(length = unit(4, "mm"), type = "closed")
registers %>%
  rr_links() %>%
  dplyr::distinct(from, to, type) %>%
  as_tbl_graph() %>%
  ggraph(layout = "nicely") +
    geom_node_point() +
    geom_edge_fan(aes(colour = type),
                  arrow = edge_arrow,
                  end_cap = circle(2, 'mm')) +
    geom_edge_loop(aes(colour = type),
                   arrow = edge_arrow,
                   end_cap = circle(2, 'mm')) +
    geom_node_label(aes(label = name), repel = TRUE, alpha = .5) +
    theme_void()

Index registers

You can index registers by any column, using CURIE-like syntax.

country <- registers$country
rr_index(country, "start-date")
rr_index(country, "end-date")
rr_index(country)
rr_index(registers$`local-authority-eng`, "local-authority-type")
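
For readers unfamiliar with CURIEs: a CURIE is a compact identifier of the form prefix:reference. The base R sketch below only shows that general shape. It says nothing about how rr_index() builds or interprets its index, parse_curie() is a hypothetical helper that is not part of registr, and the example strings are made up.

```r
# Split CURIE-like strings of the form "prefix:reference".
# parse_curie() is a hypothetical helper, not part of registr.
parse_curie <- function(x) {
  data.frame(
    prefix    = sub(":.*$", "", x),      # everything before the first colon
    reference = sub("^[^:]*:", "", x),   # everything after the first colon
    stringsAsFactors = FALSE
  )
}

parsed <- parse_curie(c("country:GB", "statistical-geography:E09"))
parsed$prefix
parsed$reference
```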


nacnudus/registr documentation built on May 5, 2019, 12:31 p.m.