ShapeToMask: Convert Shapefile to Mask Array


ShapeToMask R Documentation

Convert Shapefile to Mask Array

Description

This function reads a shapefile (.shp) containing information about polygonal regions. It then transfers the shapefile data into an array and subsets the output based on the requested region names or IDs. The accepted shapefile databases are 'NUTS', 'LAU', and 'GADM', each with its own format. The function can also use shapefiles from other databases if the column that identifies the categories is specified with the parameter 'id_shape_col'.

Usage

ShapeToMask(
  shp_file,
  ref_grid,
  compute_area_coverage = FALSE,
  shp_system = "NUTS",
  reg_names = NULL,
  reg_ids = NULL,
  reg_level = 3,
  lat_dim = NULL,
  lon_dim = NULL,
  region = FALSE,
  check_valid = FALSE,
  find_min_dist = FALSE,
  max_dist = 50,
  ncores = NULL,
  fileout = NULL,
  units = "degrees",
  id_shape_col = NULL,
  name_shape_col = NULL,
  ...
)

Arguments

shp_file

A character string indicating the shp file path.

ref_grid

The reference grid, given either as (1) a character string indicating the path to a NetCDF file or (2) a list with the elements 'lon' and 'lat' that provide the reference grid points.

compute_area_coverage

A logical value indicating the method used to find the intersection of the reference grid and the shapefile. If it is TRUE, the function calculates the fraction of each grid cell's area covered by the intersection. If it is FALSE, the function checks whether the centroid of each grid cell falls inside the shapefile. It is FALSE by default.

shp_system

A character string containing the name of the shapefile database system used to subset the shapefile into regions with the parameters 'reg_ids' or 'reg_names'. The accepted systems are: 'NUTS', 'LAU', and 'GADM'. When it is used, you must specify either 'reg_ids' or 'reg_names'; if you don't need to subset different regions, set it to NULL. It is set to "NUTS" by default (optional).

reg_names

A named list of character vectors indicating the country and the region names. Each list name is a country code, and the corresponding character vector contains the region names for that country. It is NULL by default (optional).

reg_ids

A character string indicating the unique ID in the shapefile. It is NULL by default (optional).

reg_level

An integer from 1 to 3 indicating the 'NUTS' dataset level. For other datasets this parameter is not used. A single mask can contain only one level. It is set to 3 by default (optional).

lat_dim

A character string indicating the latitudinal dimension. If it is NULL, the latitudinal name will be searched using an internal function with the following possible names: 'lat', 'latitude', 'y', 'j' and 'nav_lat'. It is set to NULL by default.

lon_dim

A character string indicating the longitudinal dimension. If it is NULL, the longitudinal name will be searched using an internal function with the following possible names: 'lon', 'longitude', 'x', 'i' and 'nav_lon'. It is set to NULL by default.

region

A logical value indicating if we want a dimension for the regions in the resulting mask array. It is FALSE by default.

check_valid

A logical value indicating whether to apply the function 'sf::st_make_valid' to the shapefile and to the coordinates. It is FALSE by default.

find_min_dist

A logical value indicating if we want to look for the nearest coordinate between the shapefile region and the reference grid when there is no intersection between the shapefile and the reference grid. It is FALSE by default.

max_dist

A numeric value indicating the maximum accepted distance to the closest grid point when there is no intersection between the shapefile and the reference grid. It is 50 by default.

ncores

The number of parallel processes to spawn for computation on multiple cores. It is NULL by default.

fileout

A character string of the path to save the NetCDF mask. If not specified (default), only the mask array will be returned.

units

A character string indicating whether the grid in your GIS files is in degrees or meters. The possible values are 'degrees' and 'meters'. If it is NULL, the units will be set to "meters". It is "degrees" by default.

id_shape_col

A character string indicating the name of the column in the shapefile that contains the IDs of the different polygons. It is NULL by default.

name_shape_col

A character string indicating the name of the column in the shapefile that contains the names of the different polygons. It is NULL by default.

...

Arguments passed on to 's2_options' in function 'st_intersection'. See 's2 package'.
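
The name search described for 'lat_dim' and 'lon_dim' can be pictured with a minimal sketch. The helper 'find_dim_name' is hypothetical and is not the package's internal function; the case-insensitive matching is also an assumption. Only the candidate names are taken from the argument descriptions.

```r
# Hypothetical helper (not part of esviz): return the first dimension name
# that matches one of the candidate names. Ignoring case is an assumption.
find_dim_name <- function(dim_names, candidates) {
  hit <- dim_names[tolower(dim_names) %in% candidates]
  if (length(hit) == 0) {
    stop("No dimension matches: ", paste(candidates, collapse = ", "))
  }
  hit[1]
}

# Candidate names as listed for 'lat_dim' and 'lon_dim'
lat_candidates <- c("lat", "latitude", "y", "j", "nav_lat")
lon_candidates <- c("lon", "longitude", "x", "i", "nav_lon")

# Dimension names of an illustrative reference dataset
dims <- c("time", "Latitude", "lon")
lat_name <- find_dim_name(dims, lat_candidates)
lon_name <- find_dim_name(dims, lon_candidates)
```

Passing 'lat_dim' and 'lon_dim' explicitly avoids relying on any such search when the reference file uses other names.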

Details

To ensure accurate comparison with the shapefile, the function loads a reference dataset that provides longitude and latitude information. By intersecting each subset of the shapefile with the reference coordinates, the function selects only the desired regions. The final step involves creating a mask array. Depending on the chosen option, the mask array is either returned as the function's output or saved into a NetCDF format in the specified directory.
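
The difference between the two intersection methods selected by 'compute_area_coverage' can be illustrated with a small base R sketch. It is not the package's implementation: the region is assumed to be an axis-aligned rectangle instead of a real polygon, areas are planar, and all names ('region', 'overlap', 'half') are chosen for illustration only.

```r
# Assumption: an axis-aligned rectangle stands in for a shapefile polygon.
region <- c(xmin = 0.25, xmax = 1.75, ymin = 0.25, ymax = 1.25)

# Cell centres of a 1-degree grid; each cell spans centre +/- 'half'.
lon <- c(0.5, 1.5)
lat <- c(0.5, 1.5)
half <- 0.5
grid <- expand.grid(lon = lon, lat = lat)  # lon varies fastest

# Length of the overlap between the intervals [a1, a2] and [b1, b2]
overlap <- function(a1, a2, b1, b2) pmax(0, pmin(a2, b2) - pmax(a1, b1))

# Centroid method: 1 if the cell centre falls inside the region, 0 otherwise
centroid_mask <- as.integer(
  grid$lon >= region[["xmin"]] & grid$lon <= region[["xmax"]] &
    grid$lat >= region[["ymin"]] & grid$lat <= region[["ymax"]]
)

# Area coverage method: fraction of each cell's area covered by the region
ox <- overlap(grid$lon - half, grid$lon + half, region[["xmin"]], region[["xmax"]])
oy <- overlap(grid$lat - half, grid$lat + half, region[["ymin"]], region[["ymax"]])
coverage <- ox * oy / (2 * half)^2

# Arrange both results as lon x lat arrays
centroid_mask <- array(centroid_mask, dim = c(lon = 2, lat = 2))
coverage <- array(coverage, dim = c(lon = 2, lat = 2))
```

In this sketch the cells in the second latitude row are excluded by the centroid method but still receive a non-zero coverage fraction, which is the kind of partial overlap the area coverage method is meant to capture.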

Note: Modules GDAL, PROJ and GEOS are required.
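
The fallback controlled by 'find_min_dist' and 'max_dist' can be sketched in the same spirit. The helper 'nearest_grid_point' is hypothetical, and the distance used here is a plain Euclidean distance in coordinate units; the distance metric and the units of 'max_dist' used by the function itself are not stated in this page, so both are assumptions.

```r
# Hypothetical helper (not part of esviz): find the grid point closest to a
# location and accept it only if it lies within 'max_dist'.
# Assumption: Euclidean distance in the units of the coordinates.
nearest_grid_point <- function(pt, lon, lat, max_dist) {
  grid <- expand.grid(lon = lon, lat = lat)
  d <- sqrt((grid$lon - pt[["lon"]])^2 + (grid$lat - pt[["lat"]])^2)
  i <- which.min(d)
  if (d[i] > max_dist) {
    return(NULL)  # no grid point is close enough
  }
  c(lon = grid$lon[i], lat = grid$lat[i], dist = d[i])
}

lon <- seq(10, 12, 0.5)
lat <- seq(40, 42, 0.5)

# A location just outside the grid: the closest grid point is accepted
near <- nearest_grid_point(c(lon = 10.2, lat = 39.9), lon, lat, max_dist = 1)

# A location far from the grid: nothing is returned
far <- nearest_grid_point(c(lon = 0, lat = 0), lon, lat, max_dist = 1)
```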

Value

A multidimensional array containing a mask array with longitude and latitude dimensions. If 'region' is TRUE, there will be a dimension for the region.
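
As a rough picture of the two output shapes, the sketch below builds a mask with a region dimension by hand and collapses it to longitude and latitude only. The dimension names, their order, and the 0/1 encoding are assumptions made for illustration; inspect 'dim()' on the actual output to see the layout the function returns.

```r
# Assumption: a mask with one layer per region, using 1 for grid points that
# belong to the region and 0 elsewhere.
mask_by_region <- array(0, dim = c(region = 2, lon = 3, lat = 2))
mask_by_region[1, 1:2, 1] <- 1  # first region covers two grid points
mask_by_region[2, 3, 2] <- 1    # second region covers one grid point

# Collapse the region dimension: a grid point is kept if any region covers it
combined <- apply(mask_by_region, c(2, 3), max)

dim(mask_by_region)
dim(combined)
```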

Examples

## Not run: 
# Example using an external shapefile not distributed with the package
shp_file <- paste0('/esarchive/shapefiles/NUTS3/NUTS_RG_60M_2021_4326.shp/', 
                   'NUTS_RG_60M_2021_4326.shp')
ref_grid <- list(lon = seq(10, 40, 0.5), lat = seq(40, 85, 0.5))
NUTS_name <- list(FI = c('Lappi', 'Kainuu'), SI = c('Pomurska', 'Podravska'))
mask <- ShapeToMask(shp_file = shp_file, ref_grid = ref_grid, 
                     reg_names = NUTS_name)

## End(Not run)

esviz documentation built on Feb. 4, 2026, 5:13 p.m.
