How to use this package

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
library(eudata)
library(dplyr)
library(purrr)
library(ggplot2)

The data on GISCO is divided into topics.

get_topics()

Select the topics that you are interested in. Within each topic there are numerous files. These may differ in the year they are associated with, spatial resolution, coordinate reference system, and data format, among other things.

This package provides easy access to the latest files. The example below selects the highest-resolution files (01M, that is, 1:1 million scale) whose coordinate reference system is the usual latitude/longitude one (EPSG:4326).

api <- get_topic("NUTS")

file_list <- get_latest_files(api)$gpkg |> 
  grep(pattern = "01M_.*_4326_", value = TRUE)

file_list
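To see what the pattern does without touching the network, here is a small offline sketch. The file names below are made up for illustration and only mimic the naming scheme implied by the patterns used in this vignette; the real list comes from get_latest_files().

```r
# Hypothetical file names, for illustration only.
example_files <- c(
  "NUTS_RG_01M_2021_4326_LEVL_3.gpkg",
  "NUTS_RG_01M_2021_3035_LEVL_3.gpkg",
  "NUTS_RG_60M_2021_4326_LEVL_3.gpkg",
  "NUTS_BN_01M_2021_4326_LEVL_0.gpkg"
)

# Keep files at resolution 01M whose name also carries the EPSG code 4326.
selected <- grep(pattern = "01M_.*_4326_", example_files, value = TRUE)
selected
```

Files at a coarser resolution (60M) or in another coordinate reference system (3035) are dropped, while every theme and level that matches both parts of the pattern is kept.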

Be aware that these files can be huge. The get_content_length function returns the size of a file without downloading it. It is not vectorized, so you need a map-like construct, such as purrr::map_int, if you have a list of files.

# Turn a named vector into a two-column tibble: the names and the values.
to_tibble <- function(x, column_name = "value") {
  tibble::tibble(names = names(x), !!column_name := x)
}

file_sizes <-
  map_int(file_list, get_content_length, api = api) |>
  to_tibble(column_name = "size")
  # tibble::as_tibble_col()


file_sizes |>
  knitr::kable(
    format.args = list(big.mark = "_", scientific = FALSE)
  )
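Raw counts are hard to compare at a glance. Assuming the sizes are in bytes (an assumption here, based on how an HTTP Content-Length header is defined), a hypothetical helper such as format_size below, which is not part of eudata, could make the table easier to read.

```r
# Hypothetical helper, not part of eudata: format byte counts with binary units.
format_size <- function(bytes) {
  units <- c("B", "KiB", "MiB", "GiB", "TiB")
  # Index of the largest unit not exceeding the value (0 bytes stays in "B").
  idx <- pmin(floor(log(pmax(bytes, 1), base = 1024)) + 1, length(units))
  paste(round(bytes / 1024^(idx - 1), 1), units[idx])
}

format_size(c(512, 1536, 5 * 1024^2))
```

It could then be applied to the table above with, for example, dplyr::mutate(file_sizes, size = format_size(size)).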

Suppose you have selected a file to download. You can save it to a local file with the get_content function. It also saves a copy in a cache under your user cache folder. The location of this folder is OS dependent; use rappdirs::user_cache_dir("eudata") to locate it.

If you do not specify a destination file, the data is downloaded into a temporary file. The path to this file is the body element of the result of the call.
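To check what has accumulated in the cache, one option is to list the folder's contents. The sketch below relies only on rappdirs and base R; how eudata organizes files inside the folder is not described here, so the listing is simply recursive.

```r
# Locate the per-user cache folder that the text refers to.
cache_dir <- rappdirs::user_cache_dir("eudata")

# List cached files, if any; the folder may not exist before the first download.
cached <- if (dir.exists(cache_dir)) {
  list.files(cache_dir, recursive = TRUE, full.names = TRUE)
} else {
  character(0)
}

# File sizes in bytes, largest first.
sort(stats::setNames(file.size(cached), basename(cached)), decreasing = TRUE)
```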

file_to_download <- grep(pattern = "RG.*LEVL_3", file_list, value = TRUE)
file_to_download

result <- get_content(
  api = api,
  end_point = file_to_download,
  save_to_file = TRUE
)

result

The selected data format, gpkg (GeoPackage), can be read into memory with the sf package. To begin with, only the first five records are read.

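db_file <- result$body
# st_layers() describes every layer in the file; show that description,
# then keep only the layer name for use in the queries below.
layer_info <- sf::st_layers(db_file)
layer_info
layer <- layer_info$name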

sample <- sf::st_read(
  db_file,
  query = glue::glue("select * from \"{layer}\" limit 5")
)

sample

Once you have the structure of the database, it is easy to filter, for example, for Hungarian data only.

hu_data <- sf::st_read(
  db_file,
  query = glue::glue("select * from \"{layer}\" where CNTR_CODE = 'HU'")
)

hu_data |>
  knitr::kable()
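The same read-with-a-query pattern can be tried without downloading anything. The sketch below writes a tiny, made-up point layer to a temporary GeoPackage and reads back only the matching rows; the data and the names toy, toy_file and toy_layer are purely illustrative.

```r
library(sf)

# Toy point data standing in for a GISCO layer (values are made up).
toy <- st_as_sf(
  data.frame(
    CNTR_CODE = c("HU", "AT", "HU"),
    NAME = c("a", "b", "c"),
    lon = c(19.0, 16.4, 21.6),
    lat = c(47.5, 48.2, 47.5)
  ),
  coords = c("lon", "lat"),
  crs = 4326
)

# Write it to a temporary GeoPackage with a known layer name.
toy_file <- tempfile(fileext = ".gpkg")
st_write(toy, toy_file, layer = "toy", quiet = TRUE)

# The layer name can be recovered from the file itself.
toy_layer <- st_layers(toy_file)$name

# Only matching rows are read: 'HU' is a string literal, "toy" an identifier.
hu_toy <- st_read(
  toy_file,
  query = sprintf("select * from \"%s\" where CNTR_CODE = 'HU'", toy_layer),
  quiet = TRUE
)
hu_toy
```

Because the filter runs inside the data source, rows for other countries are never loaded into memory, which matters for the large files discussed earlier.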

The result can be drawn as a map with ggplot2.

hu_data |>
  ggplot() +
  geom_sf()

Another example, now for postal codes.

api <- get_topic("Postal")

file_to_download <- grep("_4326", get_latest_files(api)$gpkg, value = TRUE) 

result <- get_content(api, file_to_download, save_to_file = TRUE)
result
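db_file <- result$body
# As before: show the layer description, then keep only the layer name.
layer_info <- sf::st_layers(db_file)
layer_info
layer <- layer_info$name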

sample <- sf::st_read(
  db_file,
  query = glue::glue("select * from \"{layer}\" limit 5")
)

sample


hu_data <- sf::st_read(
  db_file,
  query = glue::glue("select * from \"{layer}\" where CNTR_ID = 'HU'")
)

hu_data |> 
  select(POSTCODE, LAU_NAT)

# EOV is the Hungarian national grid (EPSG:23700).
EOV <- "EPSG:23700"

hu_data |>
  filter(grepl("^Gyöngyös$", LAU_NAT)) |>
  ggplot() +
  geom_sf() + 
  coord_sf(crs = EOV, datum = EOV)
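coord_sf() reprojects only for drawing; the data itself stays in its original coordinate reference system. If projected coordinates are needed for computation, for instance to work in metres, sf::st_transform() does the conversion explicitly. A minimal sketch with a single made-up point near Budapest:

```r
library(sf)

# One point in longitude/latitude (EPSG:4326), roughly central Budapest.
pt <- st_sfc(st_point(c(19.04, 47.50)), crs = 4326)

# Reproject to EOV (EPSG:23700); the coordinates are now in metres.
pt_eov <- st_transform(pt, "EPSG:23700")
st_coordinates(pt_eov)
```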

Cities with more than one associated postal code, most codes first:

hu_data |> 
  sf::st_drop_geometry() |>
  count(LAU_NAT) |>
  arrange(-n) |>
  filter(n > 1)


eudata documentation built on Aug. 8, 2025, 7:22 p.m.