here_path <- here::here() library(extrafont) loadfonts() library(ggplot2) theme_set(theme_minimal(base_family = "PT Sans"))
Data in the Eurostat database is stored in tables. Each table has an identifier, a short table_code, and a description (e.g. tps00199 - Total fertility rate).
Key eurostat functions allow to find the table_code, download the eurostat table and polish labels in the table.
The search_eurostat(pattern, ...) function scans the directory of Eurostat tables and returns codes and descriptions of tables that match pattern.
library("eurostat") query <- search_eurostat(pattern = "fertility rate", type = "table", fixed = FALSE) query[,1:2] ## title code ## <chr> <chr> ## Total fertility rate by NUTS 2 region tgs00100 ## Total fertility rate tps00199 ## Total fertility rate by NUTS 2 region tgs00100
The get_eurostat(id, time_format = "date", filters = "none", type = "code", cache = TRUE, ...) function downloads the requested table from the Eurostat bulk download facility or from The Eurostat Web Services JSON API (if filters are defined). Downloaded data is cached (if cache=TRUE). Additional arguments define how to read the time column (time_format) and if table dimensions shall be kept as codes or converted to labels (type).
ct <- c("AT","BE","BG","CH","CY","CZ","DE","DK","EE","EL","ES", "FI","FR","HR","HU","IE","IS","IT","LI","LT","LU","LV", "MT","NL","NO","PL","PT","RO","SE","SI","SK","UK") dat <- get_eurostat(id="tps00199", time_format="num", filters = list(geo = ct)) dat[1:2,] ## indic_de geo time values ## TOTFERRT AT 2006 1.41 ## TOTFERRT AT 2007 1.38
The label_eurostat(x, lang = "en", ...) gets definitions for Eurostat codes and replace them with labels in given language ("en", "fr" or "de")
dat <- label_eurostat(dat) dat[1:3,] ## indic_de geo time values ## <fct> <fct> <dbl> <dbl> ## Total fertility rate Austria 2006 1.41 ## Total fertility rate Austria 2007 1.38 ## Total fertility rate Austria 2008 1.42
The get_eurostat() function returns tibbles in the long format. Packages dplyr and tidyr are well suited to transform these objects. The ggplot2 package is well suited to plot these objects.
dat <- get_eurostat(id="tps00199", filters = list(geo = ct)) library(ggplot2) library(dplyr) ggplot(dat, aes(x = time, y = values, color = geo, label = geo)) + geom_line(alpha = .5) + geom_text(data = dat %>% group_by(geo) %>% filter(time == max(time)), size = 2.6) + theme(legend.position = "none") + labs(title = "Total fertility rate, 2006-2017", x = "Year", y = "%") -> p # save plot ggsave(filename = glue::glue("{here_path}/inst/extras/cheatsheet/lineplot.pdf"),plot = p, width = 4.84, height = 3.26, device = cairo_pdf)
dat_2015 <- dat %>% filter(time == "2015-01-01") ggplot(dat_2015, aes(x = reorder(geo, values), y = values)) + geom_col(color = "white", fill = "grey80") + theme(axis.text.x = element_text(size = 6)) + labs(title = "Total fertility rate, 2015", y = "%", x = NULL) -> p # save plot ggsave(filename = glue::glue("{here_path}/inst/extras/cheatsheet/barplot.pdf"),plot = p, width = 4.84, height = 3.26, device = cairo_pdf)
There are three function to work with geospatial data from GISCO. The get_eurostat_geospatial() returns spatial data as sf-object. Object can me merged with data.frames using dplyr::*_join()-functions. The cut_to_classes() is a wrapper for cut() - function and is used for categorizing data for maps with tidy labels.
mapdata <- get_eurostat_geospatial(nuts_level = 0) %>% right_join(dat_2015) %>% mutate(cat = cut_to_classes(values, n=4, decimals=1)) head(select(mapdata,geo,values,cat), 3) ## geo values cat geometry ## AT 1.49 1.5 ~< 1.6 MULTIPOLYGON (((15.54245 48... ## BE 1.70 1.6 ~< 1.8 MULTIPOLYGON (((5.10218 51.... ## BG 1.53 1.5 ~< 1.6 MULTIPOLYGON (((22.99717 43...
The sf-object returned are ready to be plotted with ggplot::geom_sf()-function.
ggplot(mapdata, aes(fill = cat)) + scale_fill_brewer(palette = "RdYlBu") + geom_sf(color = alpha("white",1/3), alpha = .6) + xlim(c(-12,37)) + ylim(c(35,70)) + labs(title = "Total fertility rate, 2015", subtitle = "Avg. number of life births per woman", fill = "%") -> p # save plot ggsave(filename = glue::glue("{here_path}/inst/extras/cheatsheet/mapplot.pdf"),plot = p, width = 5.14, height = 4.26, device = cairo_pdf)
This onepager presents the eurostat package 2014-2019 Leo Lahti, Janne Huovari, Markus Kainu, Przemyslaw Biecek package version 3.3.55 URL: https://github.com/rOpenGov/eurostat
Retrieval and Analysis of Eurostat Open Data with the eurostat Package. Leo Lahti, Janne Huovari, Markus Kainu, and Przemysław Biecek. The R Journal, 9(1):385–392, 2017.
CC BY Przemyslaw Biecek & Markus Kainu https://creativecommons.org/licenses/by/4.0/
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.