Why make an R package for FHRS data?

The FHRS is currently published as open data, in the form of individual XML files for each local authority, or available through our API. Neither of those are a totally straightforward way to get the data you want.

Enter fhrsdata. This package wraps useful API calls in R functions, which can be customised via the functions' parameters, to provide data extracts for a range of scenarios.

Installing the package

#install devtools if necessary
#install.packages("devtools")
devtools::install_github("helen-food/fhrsdata", quiet = TRUE)

Load it up

library(fhrsdata)

Fetching establishments

At the heart of the package is the get_establishments() function. This calls upon the API to return a tibble of establishments from the FHRS. You could call this function with no arguments and it would return all the establishments in the data, although it would take a very long time, as there are over half a million of them. Not recommended!

The chances are, you do not want everything in the FHRS. You may only want places people go to eat (restaurants, cafes, mobile caterers, etc.) and not all the schools and hospitals. Maybe you are only interested in retailers, or a particular retailer. Maybe you only want to know about a particular place. Maybe you only want to know how many pubs there are called "The Red Lion". All of these are possible with the get_establishments() function.

The name argument (NULL by default) is used to search for establishments by name. For example: Show me all the establishments with "red lion" in the name:

get_establishments(name = "red lion")

Almost 600 red lions! However, are these all pubs?

library(dplyr)
get_establishments("red lion") %>% count(BusinessType)

Some are not. What if you only want pubs? You would use the type argument. But first, you need to know the code for the type you want to select. You find this out using the get_business_types() function.

get_business_types()

From this we can see that the code for pubs is 7843. So let's add that to our API call:

get_establishments(name = "red lion", type = 7843)

Now only the pubs have been returned. The final customisation option is to filter the search by local authority. Out of interest, which local authorities have the most Red Lions?

get_establishments(name = "red lion", type = 7843) %>% 
  count(LocalAuthorityName) %>% 
  arrange(desc(n)) %>% 
  head()

And the winner is Powys. Let's say we wanted to further restrict our search for red lions just to Powys for some reason. First we need to find out the ID number for Powys in the data. We can do that using the get_authorities() function, which returns, you've guessed it, a list of all local authorities in the FHRS. That list is a bit longer than the list of business types, though. Fortunately we know that Powys is in Wales, and can search only for authorities in Wales using the country argument.

get_authorities(country = "Wales")

Or perhaps even better, just filter the authority search down to Powys.

get_authorities() %>% 
  filter(Name == "Powys")

To complete our slightly implausible quest for Red Lion pubs in Powys, the get_establishments() function is called as follows:

get_establishments(name="red lion", type = 7843, la = 349)

Finally, it is worth noting that the tibble returned by default from get_establishments() is a condensed version that includes only a selection of the (arguably most useful) fields. But if you want to make that decision for yourself, you can return everything by setting the condensed argument to FALSE.

get_establishments(name="red lion", type = 7843, la = 349, condensed = FALSE)


helen-food/fhrsdata documentation built on Aug. 21, 2019, 12:22 p.m.