oe_vectortranslate: Translate a .osm.pbf file into .gpkg format

View source: R/vectortranslate.R

Description

This function translates a .osm.pbf file into .gpkg format. The conversion is performed using ogr2ogr via the vectortranslate utility in sf::gdal_utils(). It was created following the suggestions of the maintainers of GDAL. See Details and Examples to understand the basic usage, and check the introductory vignette for more complex use cases.

Usage

oe_vectortranslate(
  file_path,
  layer = "lines",
  vectortranslate_options = NULL,
  osmconf_ini = NULL,
  extra_tags = NULL,
  force_vectortranslate = FALSE,
  never_skip_vectortranslate = FALSE,
  boundary = NULL,
  boundary_type = c("spat", "clipsrc"),
  quiet = FALSE
)

Arguments

file_path

Character string representing the path of the input .pbf or .osm.pbf file.

layer

Which layer should be read in? Typically points, lines (the default), multilinestrings, multipolygons or other_relations. If you specify an ad-hoc query using the argument query (see introductory vignette and examples), then oe_get() and oe_read() will read the layer specified in the query and ignore the layer argument. See also #122.

vectortranslate_options

Options passed to the options argument of sf::gdal_utils(). NULL by default, in which case a default set of options is used (see Details). Check the introductory vignette for more information.

osmconf_ini

The configuration file. See documentation at gdal.org. NULL by default, in which case the standard osmconf.ini file defined by GDAL is used (see Details). Check the introductory vignette for an example.

extra_tags

Which additional columns, corresponding to OSM tags, should be in the resulting dataset? NULL by default. Check the introductory vignette, the Details section, and the help page of oe_get_keys(). Ignored when osmconf_ini is not NULL.

force_vectortranslate

Boolean. Should the original .pbf file be translated into a .gpkg file even if a .gpkg file with the same name already exists? FALSE by default. If the tags in extra_tags match the data in a previously translated .gpkg file, no translation occurs (see #173 for details). Check the introductory vignette and the Details section.

never_skip_vectortranslate

Boolean. This is used when the user passes their own .ini file or vectortranslate options, since in those cases it is too difficult to determine whether an existing .gpkg file was generated with the same options.

boundary

An sf/sfc/bbox object that will be used to create a spatial filter during the vectortranslate operations. The type of filter can be chosen using the argument boundary_type.

boundary_type

A character vector of length 1 specifying the type of spatial filter. The spat filter selects only those features that intersect a given area, while clipsrc also clips the geometries. Check the examples and also here for more details.

quiet

Boolean. If FALSE, the function prints informative messages. Starting from sf version 0.9.6, if quiet is equal to FALSE, then vectortranslate operations will display a progress bar.

Details

The new .gpkg file is created in the same directory as the input .osm.pbf file. The translation process is performed using the vectortranslate utility in sf::gdal_utils(). This operation can be customized in several ways by modifying the parameters layer, extra_tags, osmconf_ini, vectortranslate_options, boundary and boundary_type.

The .osm.pbf files processed by GDAL are usually categorized into 5 layers, named points, lines, multilinestrings, multipolygons and other_relations. Check the first paragraphs here for more details. This function can convert only one layer at a time, and the parameter layer is used to specify which layer of the .osm.pbf file should be converted. Several layers with different names can be stored in the same .gpkg file. By default, the function will convert the lines layer (which is the most common one in our experience).

The arguments osmconf_ini and extra_tags are used to modify how GDAL reads and processes a .osm.pbf file. More precisely, several operations that GDAL performs on the input .osm.pbf file are governed by a CONFIG file, which can be checked at the following link. The basic components of OSM data are called elements, and they are divided into nodes, ways and relations. For example, the code at line 7 of that file is used to determine which ways are assumed to be polygons (according to the simple-feature definition of polygon) if they are closed. Moreover, OSM data is usually described using several tags, i.e. pairs of two items: a key and a value. The code at lines 33, 53, 85, 103, and 121 is used to determine, for each layer, which tags should be explicitly reported as fields (while all the other tags are stored in the other_tags column). The parameter extra_tags is used to determine which extra tags (i.e. key/value pairs) should be added to the .gpkg file (other than the default ones).

By default, the vectortranslate operations are skipped if the function detects a file that has the same path as the input file, a .gpkg extension, a layer with the same name as the parameter layer, and all extra_tags. In that case, the function simply returns the path of the .gpkg file. This behaviour can be overridden by setting force_vectortranslate = TRUE. The vectortranslate operations are never skipped if the osmconf_ini, vectortranslate_options, boundary or boundary_type arguments are not NULL.
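The skipping behaviour described above can be checked with a short sketch. It assumes that the osmextract package is installed and reuses the example file shipped with the package (the same file used in the Examples section); the object names first_run, second_run and forced_run are only illustrative.

```r
library(osmextract)

# Copy the example .osm.pbf file shipped with the package to tempdir()
its_pbf = file.path(tempdir(), "test_its-example.osm.pbf")
file.copy(
  from = system.file("its-example.osm.pbf", package = "osmextract"),
  to = its_pbf,
  overwrite = TRUE
)

# First call: the lines layer is translated into a new .gpkg file
first_run = oe_vectortranslate(its_pbf, quiet = TRUE)

# Second call with the same arguments: a .gpkg file with the lines layer
# already exists, so the translation should be skipped and the same path
# returned
second_run = oe_vectortranslate(its_pbf, quiet = TRUE)

# Ask for a new translation even though the .gpkg file already exists
forced_run = oe_vectortranslate(
  its_pbf,
  force_vectortranslate = TRUE,
  quiet = TRUE
)

# All three calls point to the same .gpkg file
identical(first_run, second_run)
identical(first_run, forced_run)
```

Setting quiet = FALSE in the calls above should make it easier to see, from the printed messages, which calls actually perform the translation.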

The parameter osmconf_ini is used to pass your own CONFIG file in case you need more control over the GDAL operations. Check the package introductory vignette for an example. If osmconf_ini is equal to NULL (the default value), then the function uses the standard osmconf.ini file defined by GDAL (except for the extra tags).

The parameter vectortranslate_options is used to control the options that are passed to ogr2ogr via sf::gdal_utils() when converting between .osm.pbf and .gpkg formats. ogr2ogr can perform various operations during the conversion process, such as spatial filters or SQL queries. These operations can be tuned using the vectortranslate_options argument. If NULL (the default value), then vectortranslate_options is set equal to

c("-f", "GPKG", "-overwrite", "-oo", paste0("CONFIG_FILE=", osmconf_ini), "-lco", "GEOMETRY_NAME=geometry", layer)

Explanation:

  • "-f", "GPKG" says that the output format is GPKG;

  • "-overwrite" is used to delete an existing layer and recreate it empty;

  • "-oo", paste0("CONFIG_FILE=", osmconf_ini) is used to set the Open Options for the .osm.pbf file and change the CONFIG file (in case the user asks for any extra tag or a totally different CONFIG file);

  • "-lco", "GEOMETRY_NAME=geometry" is used to change the layer creation options for the .gpkg file and modify the name of the geometry column;

  • layer indicates which layer should be converted.
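As a minimal sketch of the default options listed above, the following base R code rebuilds the same character vector for a given layer and CONFIG file. The helper default_options() and the path "path/to/osmconf.ini" are hypothetical and used only for illustration; they are not part of osmextract.

```r
# Hypothetical helper (not part of osmextract): rebuilds the default
# options vector described in this section
default_options = function(layer, osmconf_ini) {
  c(
    "-f", "GPKG", # output format
    "-overwrite", # delete an existing layer and recreate it
    "-oo", paste0("CONFIG_FILE=", osmconf_ini), # open options (CONFIG file)
    "-lco", "GEOMETRY_NAME=geometry", # layer creation options
    layer # layer to convert
  )
}

# Placeholder path, used only to show the structure of the vector
opts = default_options("lines", "path/to/osmconf.ini")
print(opts)
```

Each flag and its value are separate elements of the vector, which is the form expected by the options argument of sf::gdal_utils().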

If vectortranslate_options is not NULL, then the options c("-f", "GPKG", "-overwrite", "-oo", paste0("CONFIG_FILE=", path-to-config-file), "-lco", "GEOMETRY_NAME=geometry", layer) are always appended unless the user explicitly sets different values for the arguments -f, -oo, -lco, and layer.
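For instance, a user-defined option can be combined with the defaults as in the sketch below, which passes the ogr2ogr flag -where to keep only some features of the lines layer. It assumes that the osmextract and sf packages are installed, that the highway field is available in the lines layer (as in the standard osmconf.ini file), and that the value 'footway' is only an illustrative choice.

```r
library(osmextract)

# Copy the example .osm.pbf file shipped with the package to tempdir()
its_pbf = file.path(tempdir(), "test_its-example.osm.pbf")
file.copy(
  from = system.file("its-example.osm.pbf", package = "osmextract"),
  to = its_pbf,
  overwrite = TRUE
)

# Only the -where option is specified here; according to the text above,
# the remaining default options (-f, -overwrite, -oo, -lco and the layer)
# are appended by the function
its_gpkg = oe_vectortranslate(
  its_pbf,
  vectortranslate_options = c("-where", "highway = 'footway'"),
  quiet = TRUE
)

# Inspect the values of the highway field in the translated layer
footways = sf::st_read(its_gpkg, layer = "lines", quiet = TRUE)
table(footways$highway)
```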

The arguments boundary and boundary_type can be used to set up a spatial filter during the vectortranslate operations (and speed up the process) using an sf or sfc object (POLYGON or MULTIPOLYGON). The default arguments create a rectangular spatial filter which selects all features that intersect the area. Setting boundary_type = "clipsrc" clips the geometries. In both cases, the appropriate options are automatically added to the vectortranslate_options (unless a user explicitly sets different default options). Check Examples in oe_get() and the introductory vignette.
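A spatial filter could be set up as in the following sketch. It assumes that the osmextract and sf packages are installed; the bounding box coordinates are an illustrative guess for a small area covered by the example file and should be replaced with your own area of interest.

```r
library(osmextract)

# Copy the example .osm.pbf file shipped with the package to tempdir()
its_pbf = file.path(tempdir(), "test_its-example.osm.pbf")
file.copy(
  from = system.file("its-example.osm.pbf", package = "osmextract"),
  to = its_pbf,
  overwrite = TRUE
)

# Illustrative bounding box (the coordinates are an assumption),
# converted into an sfc polygon in EPSG:4326
its_bbox = sf::st_bbox(
  c(xmin = -1.560, ymin = 53.807, xmax = -1.556, ymax = 53.809),
  crs = sf::st_crs(4326)
)
its_boundary = sf::st_as_sfc(its_bbox)

# Keep the features that intersect the boundary and clip their geometries
its_gpkg = oe_vectortranslate(
  its_pbf,
  boundary = its_boundary,
  boundary_type = "clipsrc",
  quiet = TRUE
)

# Check the layers of the resulting .gpkg file
sf::st_layers(its_gpkg, do_count = TRUE)
```

Using boundary_type = "spat" instead would keep the selected features with their original (unclipped) geometries.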

See also the help page of sf::gdal_utils() and ogr2ogr for more examples and extensive documentation on all available options that can be tuned during the vectortranslate process.

Value

Character string representing the path of the .gpkg file.

See Also

oe_get_keys()

Examples

# First we need to match an input zone with a .osm.pbf file
(its_match = oe_match("ITS Leeds"))

# Copy ITS file to tempdir so that the examples do not require internet
# connection. You can skip the next 3 lines (and start directly with
# oe_download()) when running the examples locally.

file.copy(
  from = system.file("its-example.osm.pbf", package = "osmextract"),
  to = file.path(tempdir(), "test_its-example.osm.pbf"),
  overwrite = TRUE
)

# Then we can download the .osm.pbf file (if it was not already downloaded)
its_pbf = oe_download(
  file_url = its_match$url,
  file_size = its_match$file_size,
  download_directory = tempdir(),
  provider = "test"
)

# Check that the file was downloaded
list.files(tempdir(), pattern = "pbf|gpkg")

# Convert to gpkg format
its_gpkg = oe_vectortranslate(its_pbf)

# Now there is an extra .gpkg file
list.files(tempdir(), pattern = "pbf|gpkg")

# Check the layers of the .gpkg file
sf::st_layers(its_gpkg, do_count = TRUE)

# Add points layer
its_gpkg = oe_vectortranslate(its_pbf, layer = "points")
sf::st_layers(its_gpkg, do_count = TRUE)

# Add extra tags to the lines layer
names(sf::st_read(its_gpkg, layer = "lines", quiet = TRUE))
its_gpkg = oe_vectortranslate(
  its_pbf,
  extra_tags = c("oneway", "maxspeed")
)
names(sf::st_read(its_gpkg, layer = "lines", quiet = TRUE))

# Adjust vectortranslate options and convert only 10 features
# for the lines layer
oe_vectortranslate(
  its_pbf,
  vectortranslate_options = c("-limit", 10)
)
sf::st_layers(its_gpkg, do_count = TRUE)

# Remove .pbf and .gpkg files in tempdir
oe_clean(tempdir())

ITSLeeds/osmextract documentation built on Nov. 27, 2024, 3:39 a.m.