normGeometry: Normalise geometries


normGeometry R Documentation

Normalise geometries

Description

Harmonise and integrate geometries into a standardised format

Usage

normGeometry(
  input = NULL,
  pattern = NULL,
  query = NULL,
  thresh = 10,
  outType = "gpkg",
  priority = "ontology",
  beep = NULL,
  simplify = FALSE,
  update = FALSE,
  verbose = FALSE
)

Arguments

input

[character(1)]
path of the file to normalise. If this is left empty, all files at stage two that match pattern are processed.

pattern

[character(1)]
an optional regular expression. Only dataset names which match the regular expression will be processed.

query

[character(1)]
part of the SQL query (starting from WHERE) used to subset the input geometries, for example where NAME_0 = 'France'. The first part of the query (where the layer is defined) is derived from the meta-data of the currently handled geometry.

thresh

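[integerish(1)]
the overlap threshold used to separate new geometries that match already harmonised geometries from those that do not (see step 4 of the comparison under Details).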

outType

[character(1)]
the output file-type, see st_drivers for a list. If a file-type supports layers, they are stored in the same file, otherwise the different layers are provided separately. For an R-based workflow, "rds" could be an efficient option.

priority

[character(1)]
how to match the new geometries with the already harmonised database. This can be one of

  • "spatial": where all territories are intersected spatially,

  • "ontology": where territories are matched by comparing their name with the ontology, and those that do not match are intersected spatially, or

  • "both": where territories are matched both with the ontology and spatially, and conflicts are indicated.

beep

[integerish(1)]
number specifying which sound is played to signal that the program has reached a point of interaction, see beep.

simplify

[logical(1)]
whether or not to simplify geometries.

update

[logical(1)]
whether the physical files should be updated (TRUE) or the function should merely return the geometry inventory of the handled files (FALSE, default). The latter helps to check whether the metadata and the provided file(s) are specified properly.

verbose

[logical(1)]
whether or not to print details about what is happening (default FALSE). To make this function completely silent, wrap the call in suppressMessages.
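
The query argument supplies only the WHERE part of an SQL statement. As a rough illustration of how such a fragment can be combined with a layer name, the sketch below reads the example shapefile that ships with sf. How normGeometry assembles the statement internally is not documented here, so the use of paste0 and of the query argument of st_read, as well as the file, layer and column names, are assumptions made for this example only.

```r
library(sf)

# example file shipped with sf; its single layer is called "nc"
dsn <- system.file("shape/nc.shp", package = "sf")
layer <- "nc"  # assumption: normGeometry() would derive this from the metadata

# the user-supplied part, in the same form as the 'query' argument
query <- "where NAME = 'Ashe'"

# assemble the full statement and let GDAL apply it while reading
full_query <- paste0("select * from ", layer, " ", query)
subset_geom <- st_read(dsn, query = full_query, quiet = TRUE)

nrow(subset_geom)
```

Filtering at read time this way avoids loading geometries that would be discarded afterwards.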

Details

To normalise geometries, this function proceeds as follows:

  1. Read in input and extract initial metadata from the file name.

  2. In case filters are set, the new geometry is filtered by those.

  3. The territorial names are matched with the gazetteer to harmonise new territorial names (at this step, the function might ask the user to edit the file 'matching.csv' to align new names with already harmonised names).

  4. Loop through every nation potentially included in the file that shall be processed and carry out the following steps:

    • In case the geometries are provided as a list of simple feature POLYGONS, they are dissolved into a single MULTIPOLYGON per main polygon.

    • In case the nation to which a geometry belongs has not yet been created at stage three, the following steps are carried out:

      1. Store the current geometry as basis of the respective level (the user needs to make sure that all following levels of the same dataseries are perfectly nested into those parent territories, for example by using the GADM dataset)

    • In case the nation to which the geometry belongs has already been created, the following steps are carried out:

      1. Check whether the new geometries have the same coordinate reference system as the already existing database and re-project the new geometries if this is not the case.

      2. Check whether all new geometries are already exactly matched spatially and stop if that is the case.

      3. Check whether the new geometries are all within the already defined parents, and save those that are not as a new geometry.

      4. Calculate spatial overlap and separate the geometries into those that overlap by more than thresh and those that overlap by less.

      5. For all units that did match, copy gazID from the geometries they overlap.

      6. For all units that did not match, rebuild metadata and a new gazID.

    • If update = TRUE, store the processed geometry at stage three.

  5. Move the geometry to the folder '/processed', if it is fully processed.
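
The overlap comparison in step 4 of the branch for already existing nations can be sketched with sf as follows. This is not the package's implementation: the helper make_rect, the toy rectangles, and the reading of thresh as a percentage of the new unit's area are all assumptions made for illustration.

```r
library(sf)

# hypothetical helper that builds an axis-aligned rectangle
make_rect <- function(xmin, xmax, ymin = 0, ymax = 1) {
  st_polygon(list(rbind(
    c(xmin, ymin), c(xmax, ymin), c(xmax, ymax), c(xmin, ymax), c(xmin, ymin)
  )))
}

base <- st_sfc(make_rect(0, 1))          # already harmonised unit
new  <- st_sfc(make_rect(0.05, 1.05),    # nearly identical to 'base'
               make_rect(0.95, 1.95))    # only a thin sliver overlaps

thresh <- 10  # assumption: interpreted as percent of the new unit's area

# share of each new unit (in percent) that is covered by 'base'
overlap <- vapply(seq_along(new), function(i) {
  inter <- st_intersection(new[i], base)
  if (length(inter) == 0) return(0)
  as.numeric(sum(st_area(inter)) / st_area(new[i])) * 100
}, numeric(1))

# units at or above the threshold would inherit an existing gazID (step 5),
# the others would receive new metadata and a new gazID (step 6)
matched <- overlap >= thresh
```

The geometries carry no coordinate reference system here, so st_area returns plain numbers. With real data, the re-projection in step 1 would have to happen before areas are compared.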

Value

This function harmonises and integrates so far unprocessed geometries at stage two into stage three of the geospatial database. It produces for each main polygon (e.g. nation) in the registered geometries a spatial file of the specified file-type.

See Also

Other normalise functions: normTable()

Examples

if(dev.interactive()){
  library(sf)

  # build the example database
  makeExampleDB(until = "regGeometry", path = tempdir())

  # normalise all geometries ...
  normGeometry(update = TRUE)

  # ... and check the result
  st_layers(paste0(tempdir(), "/adb_geometries/stage3/Estonia.gpkg"))
  output <- st_read(paste0(tempdir(), "/adb_geometries/stage3/Estonia.gpkg"))
}

arealDB documentation built on July 9, 2023, 6:09 p.m.