knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
The French official open data portal offers a huge quantity of information. They also provide a well structured API. The BARIS package allows you to exploit this API in order to get the required data from the portal.
Within the portal there is the concept of a data set which contains one or several data frames or resources. So, if I use the resource term, you need to apprehend it as the data frame inside a data set.
The package is available on CRAN, you can also install the development version from Github:
install.packages("BARIS")
Too much talking, let's dive into a reproducible example.
The BARIS_search()
function allows you to search for a specified data set. A quick tip: within your query, use plain Nouns and avoid prepositions and determinants: le, la, de, des, en, à ... and so on :
library(BARIS) BARIS_search(query = "Monuments Historiques Marseille")
Cool we have our data set ... but wait it would be better to get some explanation about it.
The BARIS_explain()
function provides a description of a data set. The function takes one argument which is the ID of the data set:
BARIS_explain(datasetId = "5cebfa8306e3e77ffdb31ef5")
Don't panic if you're not a french speaker. You can always use the great googleLanguageR.
Now, it's time to list the resources contained within this data set !!!
The BARIS_resources
function displays the available resources or data frames within a data set. The function takes as argument the ID of the data set:
BARIS_resources(datasetId = "5cebfa8306e3e77ffdb31ef5")
You can see from above that the data set has two resources, a csv and a pdf. Now, we've reached the interesting part: extracting the data frame that you'll work on !
Using BARIS_extract()
you can extract directly into your R session the needed data set. Currently, “only” theses formats are supported: json, csv, xls, xlsx, xml, geojson and shp, nevertheless you can always rely on the url of the resource to download it manually.
In order to use the function you'll have to specify two arguments: The ID of the resource and its format.
You can visually catch the structure difference between the ID of a data set and the ID of a resource.
data <- BARIS_extract(resourceId = "59ea7bba-f38a-4d75-b85f-2d1955050e53", format = "csv") head(data)
End of the vignette.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.