knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/mregions2-",
  out.width = "100%",
  cache = TRUE
)

options(rdf_print_format = "turtle")
options(rmarkdown.html_vignette.check_title = FALSE)

mregions2 offers a streamlined interface to access data from Marine Regions in R for researchers, marine scientists, and geospatial analysts seeking marine geographical information. Marine Regions is part of the LifeWatch project. Its activities focus on two main outputs:

mregions2 allows to retrieve marine geospatial information from the Marine Regions Gazetteer and the Marine Regions Data Products in R.

The Marine Regions Gazetteer is a standard list of marine georeferenced place names.

Gazetteer: a dictionary of geographical names.

https://www.thefreedictionary.com/gazetteer

The entries in the Marine Regions Gazetteer are identified with an unique and persistent identifier, named the ?MRGID (Marine Regions Gazetteer Unique Identifier)

Syntax http://marineregions.org/mrgid/<number>

E.g. https://marineregions.org/mrgid/3293

In addition to the Marine Regions Gazetteer, the Marine Regions Team creates and hosts geographical Data Products, being the most popular one the Marine Regions Maritime Boundaries.

The mregions2 package aims to difference these two domains as clearly as possible, starting with the naming of its functions:

This naming is not trivial, as mregions2 uses different methods to access the Marine Regions Gazetteer than the Marine Regions Data Products: A RESTful API for the Marine Regions Gazetteer, and OGC Web Services to get the Marine Regions Data Products. These are hosted at the Flanders Marine Institute (VLIZ) geoserver:

http://geo.vliz.be/

The data products can be visualized via Web Map Services (WMS) with mrp_view() or downloaded by Web Feature Services (WFS) with mrp_get().

This package is developed for scientific, educational and research purposes. It is not meant to be used for legal, economical (in the sense of exploration of natural resources) or navigational purposes. See the Marine Regions disclaimer.

Wasn't there a package to read Marine Regions already? mregions?

mregions2 supersedes mregions. The need of mregions2 is further explained in the article: Why mregions and mregions2?

NOTE

The naming mrp_*() instead of mr_() was decided to avoid overlapping with the functions from mregions

Marine Regions Gazetteer

library(mregions2)

# To use the pipe operator `%>%`
library(magrittr) 

# For illustrative purposes
library(sf) 
library(dplyr) 

Search by free text

You can look up for marine places the Marine Regions Gazetteer with a free text search. Some examples:

Find any term with the words 'North Sea'

gaz_search("North Sea")

Search names in different languages (provide ISO 2c code)

gaz_search("Noordzee", language = "nl")

Restrict your search to only exact matches

gaz_search("Noordzee", language = "nl", like = FALSE, fuzzy = FALSE)

Search by MRGID

Pass a ?MRGID to gaz_search() to get only that record. E.g. in the previous example, the Belgian Exclusive Economic Zone has the ?MRGID 3293:

gaz_search(3293)

Search by longitude and latitude

Reverse geocoding is possible with mregions2: pass geographical coordinates in the WGS84 projection will return all records that intersect with the point.

e.g. the coordinates longitude 2.927 - latitude 51.21551 lay on the city of Ostend at the Belgian coast

gaz_search(x = 2.927, y = 51.21551)

Passing an sfg geometry object from sf::st_point() is also allowed:

pt <- st_point(c(x = 2.927, y = 51.21551))
pt

gaz_search(pt)

Search by place type

The records of the Marine Regions Gazetteer have place types assigned, either physical such as sea mounts or banks, or administrative like Territorial Sea or Exclusive Economic Zone.

You can find the full list, descriptions and its identifier in the database with gaz_types():

gaz_types()

You can restrict your search to a certain type ID in the argument typeid

gaz_search("Oostende", typeid = 1)

Or you can look up all the records with a certain place type with gaz_search_by_type()

# With text
gaz_search_by_type("Town")

# With the place type ID
gaz_search_by_type(1)

Search by source document

The entries in the Marine Regions Gazetteer are always based on a source. This is either a document, a web site or any sort of authority.

The list of sources is available with gaz_sources()

# List sources
gaz_sources()

Search by source is possible with the function gaz_search_by_source(), either with the source as text or passing a sourceID

# With text
gaz_search_by_source("Flanders Marine Institute (VLIZ)")

# With source ID
gaz_search_by_source(695) 

Search geometries: Geocoding

Geocoding marine places is possible with mregions2. As a bare minimum, the entries of the Marine Regions Gazetteer should always have a the centroid of the feature, in the fields latitude and longitude. The bounding box of the feature may also be available via the fields minLatitude, maxLatitude, minLongitude and maxLongitude.

Moreover, many of the entries of the Marine Regions Gazetteer have actual geometries associated. These are typically Polygons or Linestrings. To avoid overloading the server, the geometries are not returned by default with gaz_search(). Adding the geometries is possible with gaz_geometry(): passing the result of gaz_search() will turn the data frame into a sf::sf object with geometry in the field the_geom, of class sf::sfc.

gaz_search(3293) %>% gaz_geometry()

This is possible cause the result of gaz_search is a data frame including the class mr_df, unique of the mregions2 R package. The geometry function has a S3 method defined for this class: gaz_geometry.mr_df().

You can also use the argument with_geometry as gaz_search(3293, with_geometry = TRUE) to load the geometry directly.

In addition, fetching only the geometries is possible using an ?MRGID on gaz_geometry()

# By MRGID
gaz_geometry(3293)

By default, this is a simple feature geometry list column, an object of class sf::sfc. But other formats are possible:

As Well-Known Text (WKT):

gaz_geometry(3293, format = "wkt")

As a rdflib::rdf object:

gaz_geometry(3293, format = "rdf")

As a sf::sf object:

gaz_geometry(3293, format = "sf")

Learn more about using the mregions2 to access the Marine Regions Gazetteer as RDF article.

Search by relations

The Marine Regions Gazetteer is organized hierarchically: All the records got relations with each other. You can either pass the result of gaz_search() to gaz_relations() or an ?MRGID:

# With gaz_search()
gaz_search(3293) %>% gaz_relations()

# With MRGID
gaz_relations(3293)

The relations must be one of c("partof", "partlypartof", "adjacentto", "similarto", "administrativepartof", "influencedby") or "all" to get all related records (default). You can also decide if you want records that are upper or lower on the hierarchy of the records, or both (default)

For instance, the Belgian Exclusive Economic Zone with ?MRGID 3293 is part of the North Sea and part of Belgium.

gaz_relations(3293, direction = "upper", type = "partof")

RESTful services

There is a set of functions in mregions2 named as gaz_rest that read the Marine Regions Gazetteer REST API via HTTP requests. These are closer to the definition of the web services. All the gazetteer functions shown above, such as gaz_search() or gaz_relations(), make use of this set of functions.

| Main function | REST function | REST web service | |---------------------|---------------------------|-------------------------| | gaz_search() | gaz_rest_records_by_name() | getGazetteerRecordsByName | | gaz_search() | gaz_rest_records_by_names() | getGazetteerRecordsByNames | | gaz_search() | gaz_rest_record_by_mrgid() | getGazetteerRecordByMRGID | | gaz_search() | gaz_rest_records_by_lat_long() | getGazetteerRecordsByLatLong | | gaz_search_by_type() | gaz_rest_records_by_type() | getGazetteerRecordsByType | | gaz_search_by_source() | gaz_rest_records_by_source() | getGazetteerRecordsBySource | | gaz_search_by_source() | gaz_rest_source_by_sourceid() | getGazetteerSourceBySourceID | | gaz_relations() | gaz_rest_relations_by_mrgid() | getGazetteerRelationsByMRGID | | gaz_types() | gaz_rest_types() | getGazetteerTypes | | gaz_sources() | gaz_rest_sources() | getGazetteerSources | | gaz_geometry() | gaz_rest_geometries() | getGazetteerGeometries | | NA | gaz_rest_names_by_mrgid() | getGazetteerNamesByMRGID | | NA | gaz_rest_wmses() | getGazetteerWMSes | | NA | NA | getFeed |

Linked Open Data

You can get a gazetteer item or their geometries as Linked Open Data, in the form of an object of class rdflib::rdf:

gaz_search(3293, rdf = TRUE)

gaz_geometry(3293, format = "rdf")

See the Marine Regions Gazetteer as RDF article for more information.

Marine Regions Data Products

The main function to load a Marine Regions Data Product into R is mrp_get(). To select a data product, you have to pass the layer argument.

# See the full list of data products
mrp_list

# Get the Extended Continental Shelves
mrp_get("ecs")

However, this is an expensive request that may take some time to complete. mregions2 aims to tackle this with three different strategies:

WMS viewer

Instead of downloading the full product directly, it is a better approach to first have a quick look. mrp_view() renders an interactive visualization:

mrp_view("ecs")

It returns a leaflet::leaflet interactive map with the data product requested added via Web Map Services (WMS).

Note leaflet::leaflet and leaflet.extras2::leaflet.extras2 are required by mrp_view() but listed only as "Suggests". Install these packages to use mrp_view().

Caching

If you request again a layer that you requested before, you will see a message:

# One more time! Get the Extended Continental Shelves
ecs <- mrp_get("ecs")

mrp_get() performs a WFS request to get the layer as a zipped Shapefile and writes the response to disk. This saves both bandwith and memory. After downloading, the file is unzipped and read with sf::st_read().

By default, the caching directory is a temporal directory base::tempdir(). To override this behaviour, you can pass a directory to the path argument:

ecs <- mrp_get("ecs", path = "path/to/data")

Or you can set a default directory with options("mregions2.download_path"):

options("mregions2.download_path" = "path/to/data")
ecs <- mrp_get("ecs")

The cache is considered fresh during one month. After that, the requests will be performed again. The Marine Regions products are typically updated every 2 years, so it is unlikely you will be looking at an old product. In case of doubt you can just delete the layers downloaded in the cache path.

OGC and CQL filtering

In some cases, you may be interested in retrieving only a certain amount of records or filtering out some features. You could download the product with mrp_get() and then apply a filter with R base or with the Tidyverse (tidyverse::tidyverse). But getting the full product is slow:

ecs <- mrp_get("ecs") %>% filter(sovereign1 == "Portugal")

Instead, you can write a filter that is applied on the server side:

ecs <- mrp_get("ecs", cql_filter = "sovereign1 = 'Portugal'")

This filter is written in Contextual Query Language (CQL). This is a relatively simple syntax close to SQL.

Writing these filters can be challenging without knowing in advance the names of the columns in a data product, or the unique values that these columns have. If you have to retrieve the full product to get to know the columns and their values then, well, the filter misses a bit its point.

mregions2 has two helpers that help on this mission: mrp_colnames() gets a data frame with the columns and data type of a data product. mrp_col_unique() (or mrp_col_distinct(), they are synonyms) returns the unique values of a column in a data product.

# Columns of the Extended Continental Shelves data product
mrp_colnames("ecs")

# Unique values of the column pol_type
mrp_col_unique("ecs", "pol_type")

# Get all the Overlapping claims
mrp_get("ecs", cql_filter = "pol_type = 'ECS Overlapping claim'")

An exhaustive ontology defining the attributes of each data product is available in the data frame mrp_ontology included in this package, or read the article online for further reasoning.

mrp_ontology

There are many types of filter allowed via standard OGC or CQL filters. mrp_get() and mrp_view() use OGC WFS and WMS respectively. You can pass filter to both of them using the same syntax. You can also limit the number of features requested with the count parameter.

# Get only one feature
mrp_get("ecs", count = 1)

mrp_view() uses leaflet.extras2::addWMS(), which enables passing the filters together with the base URL in the argument baseURL.

# View all Joint submissions, recommendations of deposit to DOALOS with an area larger than 1M square kilometers
verbose <- "
  pol_type IN (
    'ECS Joint CLCS Recommendation', 
    'ECS Joint CLCS Submission', 
    'ECS Joint DOALOS Deposit'
  ) 
  AND area_km2 > 1000000
"
mrp_view("ecs", cql_filter = verbose)

You can also apply a standard OGC filter specification. These are however more verbose:

# View the Extended Continental Shelf of Portugal
verbose <- "
  <Filter>
    <PropertyIsEqualTo>
        <PropertyName>sovereign1</PropertyName>
        <Literal>Portugal</Literal>
    </PropertyIsEqualTo>
  </Filter>
"
mrp_view("ecs", filter = verbose)

Note that all the data products of Marine Regions are served via the VLIZ geoserver. Geoserver implements a more powerful extension of CQL called ECQL (Extended CQL). which allows expressing the full range of filters that OGC Filter 1.1 can encode. ECQL is accepted in many places in GeoServer.

Whenever the documentation refers to CQL, ECQL syntax can be used as well.

I recommend the vignette("ecql_filtering", package = "emodnet.wfs") of the emodnet.wfs R package to know more. The principles explained there are useful also for mregions2.



lifewatch/mregions2 documentation built on April 17, 2025, 10:40 a.m.