Downloading data from SCB (Statistics Sweden) using JSON is rather simple in R. This package provides a function that creates a data frame from downloaded data in JSON-stat format, as generated by the SCB site.
On the SCB website, save the data as a JSON-stat file (json) in the Save table as listbox. See illustration below.
Note: Do not select the similarly named item JSON file (json) at the end of the same list.
After downloading the file, it can be converted into a data frame using:
library(scbapi)
df <- scb_data(filename)
where filename is the name given to the downloaded file.
To use, download the function in R/get_scb.R, or install the package using devtools:
install.packages("devtools")
devtools::install_github("bjornerstedt/scbapi")
Alternatively, the query can be saved as a JSON file and the dataset downloaded later. The query and its URL are available under API for this table, located below the resulting table output. Store the URL, for example url <- "http://api.scb.se/ ...", and pass it to get_scb().
library(scbapi)
popurl <- "http://api.scb.se/OV0104/v1/doris/sv/ssd/START/BE/BE0101/BE0101A/BefolkningNy"
df <- get_scb("cpi", url = popurl)
The data is returned in long format. Use tidyr::spread to reshape the data.
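As a rough illustration of the reshaping step, the sketch below spreads a small long-format data frame into wide format. The column names region, year and value, and all the numbers, are made up for the example. The columns actually returned by scb_data() or get_scb() depend on the table that was downloaded.

```r
library(tidyr)

# Toy long-format data (made-up values, not SCB data):
# one row per region/year combination.
long <- data.frame(
  region = c("A", "A", "B", "B"),
  year   = c("2015", "2016", "2015", "2016"),
  value  = c(10, 11, 5, 6),
  stringsAsFactors = FALSE
)

# One column per year, one row per region.
wide <- spread(long, key = year, value = value)
```

Here wide has one row per region and a separate column for each year.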
The response format in the query has been changed from "px" to "json-stat", as shown in the example below. The URLs can also be stored in the downloaded JSON file by adding them, comma separated, as fields at the bottom:
{
  "query": [
    ...
    ...
  ],
  "response": {
    "format": "json-stat"
  },
  "url": "http://api.scb.se/OV0104/v1/doris/sv/ssd/START/BE/BE0101/BE0101A/BefolkningNy",
  "weburl": "http://www.statistikdatabasen.scb.se/pxweb/sv/ssd/START__BE__BE0101__BE0101A/BefolkningNy"
}
I have also added the field url for the JSON search. The advantage of this approach is that the JSON file now contains all the information necessary to retrieve the data:
library(scbapi)
df <- get_scb("cpi")
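The fields in the JSON example above can be added by hand in a text editor, but the edit could also be scripted. The following is a minimal sketch using the jsonlite package. The helper add_query_url() is hypothetical and not part of scbapi, and the demo runs on a throwaway file with an empty placeholder query rather than a real saved query.

```r
library(jsonlite)

# Hypothetical helper (not part of scbapi): set the response format to
# "json-stat" and append the url (and optionally weburl) fields to a
# saved query file.
add_query_url <- function(path, url, weburl = NULL) {
  # simplifyVector = FALSE keeps the nested query structure untouched
  query <- read_json(path, simplifyVector = FALSE)
  query$response$format <- "json-stat"
  query$url <- url
  if (!is.null(weburl)) query$weburl <- weburl
  # auto_unbox = TRUE writes length-one values as scalars, not arrays
  write_json(query, path, auto_unbox = TRUE, pretty = TRUE)
  invisible(query)
}

# Demo on a temporary file with an empty placeholder query
path <- tempfile(fileext = ".json")
writeLines('{"query": [], "response": {"format": "px"}}', path)
add_query_url(
  path,
  url = "http://api.scb.se/OV0104/v1/doris/sv/ssd/START/BE/BE0101/BE0101A/BefolkningNy"
)
updated <- read_json(path)
```

After the call, the file holds the original query plus the new fields, so it would only need to be edited once per saved query.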
It is a little unfortunate that SCB does not put the url in the json-query. If they did, one could just download a file and execute the short command above to get the data as a data frame.
The current version has minimal error checking of the input and of the response from api.scb.se, and it makes some specific assumptions about the structure of the SCB output. The program has worked for the downloads I have needed to do, but it may well fail with other output. Specifically, I have assumed that there is a time dimension in the returned data, and that this is the lowest index in the data vector returned as query$dataset$value in the returned JSON data.
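For context, the JSON-stat format stores all observations in one flat value vector in row-major order, so the last listed dimension varies fastest. The sketch below is a generic illustration of how such a vector maps onto dimension labels. It is not the package's implementation, and the dimension names region and year and all values are made up.

```r
# Toy dimensions in the order they would be listed (made-up, not SCB data)
dims <- list(
  region = c("A", "B"),
  year   = c("2015", "2016", "2017")
)

# Flat value vector in row-major order: year (the last dimension) varies fastest
values <- c(1, 2, 3, 4, 5, 6)

# expand.grid() varies its FIRST variable fastest, so the dimensions are
# passed in reverse order and the columns are put back in order afterwards.
grid <- expand.grid(rev(dims), stringsAsFactors = FALSE)
long <- cbind(grid[rev(names(grid))], value = values)
```

Each row of long pairs one value with its region and year. A layout where the time dimension is not in the expected position would pair values with the wrong labels, which is why the assumption above matters.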
When I have some free time I will check how general the API is. But since it works for me, it might be a while before I get to it.