get_eurostat_toc: Download Table of Contents of Eurostat Data Sets

View source: R/get_eurostat_toc.R

get_eurostat_toc R Documentation

Download Table of Contents of Eurostat Data Sets

Description

Downloads the table of contents (TOC) of Eurostat datasets.

Usage

get_eurostat_toc(lang = "en")

Arguments

lang

Two-letter language code. The default is "en" (English); the other options are "fr" (French) and "de" (German). Used for labeling datasets.

Details

In the downloaded Eurostat Table of Contents, the values in the 'code' column correspond to the 'id' argument that certain functions use when downloading datasets.
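The relationship between 'code' and 'id' can be illustrated with a minimal sketch. The data frame below is a hypothetical stand-in for the tibble returned by get_eurostat_toc(); its titles and codes are placeholders, not real TOC entries, and the choice to skip folders is an assumption (folders are taken here to be navigation nodes rather than downloadable items).

```r
# Hypothetical three-row stand-in for the TOC; titles and codes are placeholders.
toc <- data.frame(
  title = c("Example folder", "Example dataset", "Example table"),
  code  = c("folder_x", "dataset_x", "table_x"),
  type  = c("folder", "dataset", "table"),
  stringsAsFactors = FALSE
)

# Assumption: only items of type "dataset" or "table" are worth passing on as ids.
downloadable <- toc[toc$type %in% c("dataset", "table"), ]
ids <- downloadable$code

# With the eurostat package installed and network access available,
# a real code from the TOC would then be used as the `id` argument:
# dat <- eurostat::get_eurostat(id = ids[1])
```

With a real TOC, the same subsetting applies to the tibble returned by get_eurostat_toc().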

Value

A tibble with nine columns:

title

Dataset title, in English by default

code

Each item (dataset, table and folder) of the TOC has a unique code which allows it to be identified in the API. Used in the get_eurostat() and get_eurostat_raw() functions to retrieve datasets.

type

dataset, folder or table

last.update.of.data

Date, indicates the last time the dataset/table was updated (format DD.MM.YYYY or %d.%m.%Y)

last.table.structure.change

Date, indicates the last time the dataset/table structure was modified (format DD.MM.YYYY or %d.%m.%Y)

data.start

Date of the oldest value included in the dataset (if available) (format usually YYYY or %Y but can also be YYYY-MM, YYYY-MM-DD, YYYY-SN, YYYY-QN etc.)

data.end

Date of the most recent value included in the dataset (if available) (format usually YYYY or %Y but can also be YYYY-MM, YYYY-MM-DD, YYYY-SN, YYYY-QN etc.)

values

Number of actual values included in the dataset

hierarchy

Hierarchy level of the item in the data navigation tree, represented in the original txt file by a four-space indentation prefix per level in the title
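As a rough illustration of how a hierarchy level could be recovered from the indentation described above, the sketch below counts leading spaces in a few hypothetical titles and divides by four. This is not necessarily how the package computes the column, and whether the package numbers levels from zero is an assumption.

```r
# Hypothetical titles as they might appear in the raw txt file, where each
# nesting level adds a prefix of four spaces.
raw_titles <- c(
  "Top-level folder",
  "    Nested folder",
  "        Dataset inside nested folder"
)

# Number of leading spaces = full length minus length with the prefix removed.
leading_spaces <- nchar(raw_titles) - nchar(sub("^ +", "", raw_titles))

# Four spaces per level; integer division gives the depth (zero-based here).
hierarchy <- leading_spaces %/% 4L

# Titles with the indentation prefix stripped.
clean_titles <- trimws(raw_titles, which = "left")
```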

Data source: Eurostat Table of Contents

The Eurostat Table of Contents (TOC) is downloaded from https://ec.europa.eu/eurostat/api/dissemination/catalogue/toc/txt?lang=en (default) or from the French or German language variants:

https://ec.europa.eu/eurostat/api/dissemination/catalogue/toc/txt?lang=fr

https://ec.europa.eu/eurostat/api/dissemination/catalogue/toc/txt?lang=de

See Eurostat documentation on TOC items: https://wikis.ec.europa.eu/display/EUROSTATHELP/API+-+Detailed+guidelines+-+Catalogue+API+-+TOC
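The three URLs above differ only in the lang query parameter, which can be sketched as follows. The helper toc_url() is hypothetical and not part of the eurostat package; it only shows how a language code might map to the download address.

```r
# Hypothetical helper (not part of the eurostat package): build the TOC URL
# for one of the three language codes listed in this section.
toc_url <- function(lang = "en") {
  # match.arg() stops with an error for anything other than "en", "fr" or "de"
  lang <- match.arg(lang, c("en", "fr", "de"))
  paste0(
    "https://ec.europa.eu/eurostat/api/dissemination/catalogue/toc/txt?lang=",
    lang
  )
}

toc_url()      # English (default) variant
toc_url("de")  # German variant

# With network access, the first lines of the raw file could be inspected with:
# head(readLines(toc_url("fr")), 5)
```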

Author(s)

Przemyslaw Biecek, Leo Lahti and Pyry Kantanen ropengov-forum@googlegroups.com

References

See citation("eurostat"):

Kindly cite the eurostat R package as follows:

  Lahti L., Huovari J., Kainu M., and Biecek P. (2017). Retrieval and
  analysis of Eurostat open data with the eurostat package. The R
  Journal 9(1), pp. 385-392. doi: 10.32614/RJ-2017-019

A BibTeX entry for LaTeX users is

  @Article{10.32614/RJ-2017-019,
    title = {Retrieval and Analysis of Eurostat Open Data with the eurostat Package},
    author = {Leo Lahti and Janne Huovari and Markus Kainu and Przemyslaw Biecek},
    journal = {The R Journal},
    volume = {9},
    number = {1},
    pages = {385--392},
    year = {2017},
    doi = {10.32614/RJ-2017-019},
    url = {https://doi.org/10.32614/RJ-2017-019},
  }

  Lahti, L., Huovari J., Kainu M., Biecek P., Hernangomez D., Antal D.,
  and Kantanen P. (2023). eurostat: Tools for Eurostat Open Data
  [Computer software]. R package version 4.0.0.
  https://github.com/rOpenGov/eurostat

A BibTeX entry for LaTeX users is

  @Misc{eurostat,
    title = {eurostat: Tools for Eurostat Open Data},
    author = {Leo Lahti and Janne Huovari and Markus Kainu and Przemyslaw Biecek and Diego Hernangomez and Daniel Antal and Pyry Kantanen},
    url = {https://github.com/rOpenGov/eurostat},
    type = {Computer software},
    year = {2023},
    note = {R package version 4.0.0},
  }

When citing data downloaded from Eurostat, see section "Citing Eurostat data" in get_eurostat() documentation.

See Also

get_eurostat(), search_eurostat()

Examples



library(eurostat)

# Download the table of contents (requires an internet connection)
tmp <- get_eurostat_toc()
head(tmp)

# Convert columns containing dates as character into Date class
# Last update of data
tmp$last.update.of.data <- as.Date(
 tmp$last.update.of.data,
 format = "%d.%m.%Y"
 )
# Last table structure change
tmp$last.table.structure.change <- as.Date(
 tmp$last.table.structure.change,
 format = "%d.%m.%Y"
 )
# Data start, contains several formats (date, week, month, quarter, semester)
# Unfortunately semesters are not directly supported, so second semesters
# ("S2") are rewritten as third quarters ("Q3") before parsing
tmp$data.start <- gsub("S2", "Q3", tmp$data.start)
tmp$data.start <- lubridate::as_date(
 x = tmp$data.start,
 format = c("%Y", "%Y-Q%q", "%Y-W%W", "%Y-S%q", "%Y-%m-%d", "%Y-%m")
 )
# Data end, same treatment as data start
tmp$data.end <- gsub("S2", "Q3", tmp$data.end)
tmp$data.end <- lubridate::as_date(
 x = tmp$data.end,
 format = c("%Y", "%Y-Q%q", "%Y-W%W", "%Y-S%q", "%Y-%m-%d", "%Y-%m")
 )



rOpenGov/eurostat documentation built on Jan. 19, 2024, 11:45 a.m.