knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  tidy = TRUE,
  fig.path = "man/figures/README-",
  out.width = "100%"
)

uktrade

lifecycle R-CMD-check

The goal of uktrade is to provide convenient wrapper functions for loading HMRC data in R.

Installation

You can install the latest version of uktrade from this GitHub repository using devtools:

``` {r install, eval = FALSE} devtools::install_github("pvdmeulen/uktrade")

## HMRC data

His Majesty's Revenue & Customs (HMRC) is the United Kingdom's customs authority. Data on UK trade is available on their [uktradeinfo.com](https://www.uktradeinfo.com/) website, which is collected through a combination of customs declarations and Intrastat surveys (in the case of [some EU trade](https://www.gov.uk/intrastat)).

In the beginning of 2021, the [uktradeinfo.com](https://www.uktradeinfo.com/) website was updated. This update also introduced an Application Programming Interface (API) for accessing bulk trade (and trader) data. For smaller trade data extracts (< 30,000 rows), the [online tool](https://www.uktradeinfo.com/trade-data/) may be sufficient.

For more information on the various API endpoints and options, see [HMRC's API documentation](https://www.uktradeinfo.com/api-documentation/). Note that this service and the website itself were in beta at the time of developing this package, and even now features may still be added or changed (e.g. the [maximum number of rows per page was increased](https://www.uktradeinfo.com/news/enhancement-made-to-uktradeinfo-api-service/) from 30,000 to 40,000 in December 2021).

## This package

This package contains four functions:

|Function         |Description                                            |Status                                         |
|-----------------|-------------------------------------------------------|-----------------------------------------------|
|`load_ots()`     |a function for loading Overseas Trade Statistics data  |:yellow_circle: experimental (use at own risk) |
|`load_rts()`     |a function for loading Regional Trade Statistics data  |:yellow_circle: experimental (use at own risk) |
|`load_trader()`  |a function for loading trader data                     |:red_circle: *planned*                         |
|`load_custom()`  |a function for loading a custom URL                    |:yellow_circle: experimental (use at own risk) |

All of these functions will output a `dataframe` object (or `tibble`) with the desired data, and are able to keep track of paginated results as well as the API request limit of 60 requests per minute.

The first three are convenient wrapper functions which should make loading basic datasets easier (with the assumption that further data manipulation will be done in R after loading the data). Of course, the API allows for extensive customisation in what data is obtained. For this purpose, `load_custom()` allows you to specify a custom URL instead. This allows you to be more specific in your request and/or summarise the data before it's loaded (and in the case of trader data, expand specific columns to get more information).

## Example using OTS and RTS data

These functions are convenient wrappers for loading trade data - either detailed UK trade data using OTS (by CN8 code), or regional UK trade data using RTS (by SITC2 code). These functions will load the raw data and optionally join results with lookups obtained from the API (using, for example, the `/Commodity` and `/Country` endpoints). This makes data easy to read by humans, but also larger (more columns containing text).

### OTS

Loading all UK trade of single malt Scotch whisky (CN8 code 22083030) and bottled gin (22085011) for 2019 is done like so:

```r
library(uktrade)
data <- load_ots(month = c(201901, 201912), commodity = c(22083030, 22085011), join_lookup = FALSE, output = "tibble")

# Results are now in a tibble (set output to "df" to obtain a dataframe):
data

Note that the month argument specifies a range in the form of c(min, max), while the commodity argument specifies a collection of commodities (not a range).

As you can see, results are not easily interpretable as they stand. The HMRC API also has lookups which can be loaded separately using load_custom(), or joined automatically:

# Load specific lookups separately:
commodity_lookup <- load_custom(endpoint = "Commodity")

# Or join automatically with the `join_lookup = TRUE` option:
data <- load_ots(month = c(201901, 201912), commodity = c(22083030, 22085011), join_lookup = TRUE)

data

Loading aggregate data (such as all spirits, HS4 code 2208) is possible too (in the background, this is loading commodity codes greater than or equal to 22080000 and less than or equal to 22089999). You can also see what URL the code is using by specifying print_url = TRUE:

data <- load_ots(month = c(201901, 201912), commodity = 2208, join_lookup = TRUE, print_url = TRUE)

data

When loading an HS2 code, or a SITC1 or SITC2 code, so-called Below Threshold Trade Allocation estimates are also loaded (for EU trade). These are, roughly, estimated values for those trades which fell below the Intrastat Survey threshold. At more detailed commodity levels, these estimates are excluded. BTTA trade estimates have different commodity codes (9-digit CN codes ending in 9999999, or 7-digit SITC codes ending in 99999):

data <- load_ots(month = c(202101, 202103), commodity = "03", join_lookup = TRUE, print_url = TRUE)

library(dplyr)
library(stringr)

data %>% 
  filter(stringr::str_detect(Sitc4Code, "-"))

Specifying commodity = NULL and SITC = NULL will load all commodities (this may take considerable time). This will also include so-called non-response estimates, which have negative commodity codes (and currently cannot be split by e.g. SITC2 or HS2). For example, we can load all exports to Australia in January 2019:

data <- load_ots(month = 201901, country = "AU", flow = 4, commodity = NULL, join_lookup = TRUE)

data

We can also use SITC codes - here, we load all beverage (SITC2 code 11) exports to Australia in January 2019:

data <- load_ots(month = 201901, country = "AU", flow = 4, sitc = "11", join_lookup = TRUE)

data

Note that SITC codes need to be in character format, and include any leading zeros. This is because, unlike HS or CN codes, odd SITC codes exist (e.g. SITC1). We can also select a selection of codes in a similar way to HS/CN codes:

data <- load_ots(month = 201901, country = "AU", flow = 4, sitc = c("0", "11"), join_lookup = TRUE)

data

RTS

Loading all UK regional trade of 2-digit SITC Divisions '00 - Live animals other than animals of division 03' to '11 - Beverages' for 2019 is done by specifying sitc = c(00, 11) (leading zeros are removed). This again specifies a range, similar to the month argument. This is done to avoid URLs being too long.

data <- load_rts(month = c(201901, 201912), sitc = c(00, 11), join_lookup = TRUE)

data

Note: where relevant, BTTA data is included in RTS data as well.

Example using trader data

Trader code is a work in progress...

Example using a custom URL

This example uses the load_custom() function to replicate the example given in the API documentation for finding all traders that exported ‘Live horses (excl. pure-bred for breeding)’ from the ‘CB8’ post code in 2019. This way of loading data is possible, but will require more data manipulation after obtaining the results. This function is also the workhorse (no pun intended) behind the other functions.

data <- load_custom(endpoint = "Commodity", custom_search = "?$filter=Hs6Code eq '010129'&$expand=Exports($filter=MonthId ge 201901 and MonthId le 201912 and startswith(Trader/PostCode, 'CB8'); $expand=Trader)", output = "tibble")

data

Note that the variables expanded in the API query, Exports and Trader, both need to be expanded as they are contained in a nested column. When unnested using tidyr::unnest(), we can see the final results. The second expanded variable (Trader) itself contains 8 columns (which are TraderId, CompanyName, five Address columns, and PostCode).

tidyr::unnest(data, Exports, names_repair = "unique")

MIT License

You're free to use this code and its associated documentation without restriction. This includes both personal and commercial use. However, there is no warranty provided or any liability on my part. See the LICENSE file for more information.



pvdmeulen/uktrade documentation built on Feb. 18, 2025, 7:54 p.m.