knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
The UK Department for Work and Pensions (DWP) operates the Stat-Xplore database, containing official statistics on benefit claims, caseloads, and other relevant data. The DWP also operates a public API service, which dwpstat
wraps. Full documentation for the DWP's API is available (here)[https://stat-xplore.dwp.gov.uk/webapi/online-help/Open-Data-API.html].
Users need to sign up for a Stat-Xplore account and create an API key to access the API. The API key can be set using dwp_api_key()
, which checks for an DWP_API_KEY
environmental variable, and allows users to set the key for a given session.
The API is rate limited, a limit that resets every hour. You can use dwp_rate_limit()
to check current usage rates.
library(dwpstat) dwp_rate_limit() # A tibble: 1 x 3 remaining reset limit <int> <dttm> <int> 9997 2018-10-01 14:18:42 10000
Use dwp_schema()
to identify available datasets, and the variables within those datasets. dwp_schema()
accepts ID variables at any level and returns all metadata at the level below. If dwp_schema()
is empty, returns all available data folders at the root level of the API.
Get a tibble
of all available folders in the API:
library(tibble) a <- dwp_schema() glimpse(a) Observations: 23 Variables: 4 $ id <chr> "str:folder:faa", "str:folder:fbc", "str:folder:fbencom", "str:folder:fbb", "str:folder:fca... $ label <chr> "Attendance Allowance", "Benefit Cap", "Benefit Combinations", "Bereavement Benefits", "Car... $ location <chr> "http://stat-xplore.dwp.gov.uk/webapi/rest/v1/schema/str:folder:faa", "http://stat-xplore.d... $ type <chr> "FOLDER", "FOLDER", "FOLDER", "FOLDER", "FOLDER", "FOLDER", "FOLDER", "FOLDER", "FOLDER", "...
Get a tibble
of metadata on all databases in the ESA folder
b <- dwp_schema("str:folder:fesa") glimpse(b) Observations: 1 Variables: 4 $ id <chr> "str:database:ESA_Caseload" $ label <chr> "ESA Caseload" $ location <chr> "http://stat-xplore.dwp.gov.uk/webapi/rest/v1/schema/str:database:ESA_Caseload" $ type <chr> "DATABASE"
Get a tibble
of all variables on all databases in the ESA folder:
c <- dwp_schema("str:database:ESA_Caseload") glimpse(c) Observations: 15 Variables: 4 $ id <chr> "str:count:ESA_Caseload:V_F_ESA", "str:measure:ESA_Caseload:V_F_ESA:CAWKLYAMT", "str:group:... $ label <chr> "Employment and Support Allowance Caseload", "Weekly Award Amount", "Admin-use only", "Quar... $ location <chr> "http://stat-xplore.dwp.gov.uk/webapi/rest/v1/schema/str:count:ESA_Caseload:V_F_ESA", "http... $ type <chr> "COUNT", "MEASURE", "GROUP", "FIELD", "GROUP", "FIELD", "FIELD", "FIELD", "FIELD", "FIELD",...
Given their ID, you can use dwp_scheme
to return the names of levels in group and field variables
d <- dwp_schema("str:field:ESA_Caseload:V_F_ESA:CTDURTN") glimpse(d) Observations: 1 Variables: 4 $ id <chr> "str:valueset:ESA_Caseload:V_F_ESA:CTDURTN:C_ESA_DURATION" $ label <chr> "Duration of current claim" $ location <chr> "http://stat-xplore.dwp.gov.uk/webapi/rest/v1/schema/str:valueset:ESA_Caseload:V_F_ESA:CTDU... $ type <chr> "VALUESET"
Returns a tibble
of levels for the duration options for ESA caseloads
e <- dwp_schema("str:valueset:ESA_Caseload:V_F_ESA:CTDURTN:C_ESA_DURATION") glimpse(e) Observations: 7 Variables: 4 $ id <chr> "str:value:ESA_Caseload:V_F_ESA:CTDURTN:C_ESA_DURATION:1", "str:value:ESA_Caseload:V_F_ESA:... $ label <chr> "Up to 3 months", "3 months up to 6 months", "6 months up to 1 year", "1 year and up to 2 y... $ location <chr> "http://stat-xplore.dwp.gov.uk/webapi/rest/v1/schema/str:value:ESA_Caseload:V_F_ESA:CTDURTN... $ type <chr> "VALUE", "VALUE", "VALUE", "VALUE", "VALUE", "VALUE", "VALUE"
Due to the structure of the API, actual data is returned in a list with six levels. The process for converting that data to a data frame or tibble varies considerably depending on the column
, row
and wafer
fields used, and so no generic function for conversion is provided at this time.
The data structure can be difficult to work with in R; it is often unclear how to match together data labels, in the fields
list level, with the actual data, in the cubes
level. The example below shows a relatively simple process of assigning names to a query.
library(purrr) pip <- dwp_get_data(database = "str:database:PIP_Monthly", measures = "str:count:PIP_Monthly:V_F_PIP_MONTHLY", column = "str:field:PIP_Monthly:V_F_PIP_MONTHLY:SEX", row = "str:field:PIP_Monthly:F_PIP_DATE:DATE2") pip2 <- as.data.frame(map(pip$cubes, "values")) pip_names <- pip$fields$items names(pip2) <- pip_names[[2]]$labels pip2$date <- pip_names[[1]]$labels glimpse(pip2) Observations: 64 Variables: 4 $ Male <dbl> 12, 124, 304, 920, 1710, 2714, 4291, 6580, 8922, 12354, 15710, 20238, 25162, 30... $ Female <dbl> 17, 118, 303, 940, 1746, 2756, 4579, 7280, 10087, 14275, 18384, 23902, 30006, 3... $ `Unknown or missing` <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0... $ date <list> ["201304 (Apr-13)", "201305 (May-13)", "201306 (Jun-13)", "201307 (Jul-13)", "...
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.