create_netCDF: Create a netCDF file (with or without data)

View source: R/functions_netCDF.R

create_netCDF    R Documentation

Description

The user describes a data array and specifies spatial, vertical, and time information as well as metadata to create a netCDF file in netCDF-4 format that follows the CF-1.8 conventions.

Usage

create_netCDF(
  filename,
  xyspace,
  data = NULL,
  data_str = c("xyzt", "xyt", "xyz", "xy", "szt", "st", "sz", "s"),
  data_dims = get_data_dims(data_str, dim(data)),
  data_type = c("double", "float", "integer", "short", "byte", "char"),
  var_attributes = list(name = "", standard_name = "", long_name = "", units = "",
    grid_mapping = "crs", cell_methods = "", cell_measures = ""),
  xy_attributes = list(name = c("lon", "lat"), standard_name = c("longitude",
    "latitude"), long_name = c("Longitude", "Latitude"), units = c("degrees_east",
    "degrees_north")),
  crs_attributes = list(crs_wkt = sf::st_crs("OGC:CRS84")$Wkt, grid_mapping_name =
    "latitude_longitude", longitude_of_prime_meridian = 0, semi_major_axis = 6378137,
    inverse_flattening = 298.257223563),
  check_crs = TRUE,
  time_values = NULL,
  type_timeaxis = c("timeseries", "climatology"),
  time_attributes = list(units = "days since 1900-01-01", calendar = "standard", unlim =
    FALSE),
  time_bounds = matrix(NA, nrow = length(time_values), ncol = 2),
  vertical_values = NULL,
  vertical_attributes = list(units = "", positive = "down"),
  vertical_bounds = matrix(NA, nrow = length(vertical_values), ncol = 2),
  global_attributes = list(title = "Title"),
  overwrite = FALSE,
  nc_compression = FALSE,
  nc_shuffle = TRUE,
  nc_deflate = 5,
  nc_chunks = "by_zt",
  verbose = FALSE
)

Arguments

filename

A character string. The name of the netCDF file.

xyspace

An object that describes the xy-space of the netCDF file. If xyspace does not contain a crs, then the crs is assumed to be crs_attributes[["crs_wkt"]]. If data are gridded, then xyspace is passed to get_xyspace; if data are non-gridded, then xyspace is passed to as_points.

data

A numeric array or vector (optional). A vector is converted to a one-column matrix.

data_str

A character string describing the dimensions of data, where "xy" stands for the x and y spatial dimensions if the spatial structure is gridded, "s" stands for site if the spatial structure consists of discrete points, "z" stands for a vertical dimension, and "t" stands for a temporal dimension.

data_dims

A list as returned by get_data_dims. If NULL and data is not missing, then calculated from data.

data_type

A character string. The netCDF data type.

var_attributes

A list of named character strings defining the netCDF variable(s). Elements name and units are required.

xy_attributes

A list of named character strings defining the netCDF dimensions for the xy-space. Elements name, standard_name, long_name, and units are required.

crs_attributes

A list of named character strings defining the netCDF crs of the xy-space. Elements crs_wkt and grid_mapping_name are required.

check_crs

A logical value. If TRUE then check that the crs provided via crs_attributes matches the ones from locations and grid if available.

time_values

A numeric vector or NULL. The values along the time dimension (if present), in units as described by time_attributes.

type_timeaxis

A character string describing whether the time dimension represents a time series or a climatological time.

time_attributes

A list of named character strings defining the netCDF time dimension. Elements calendar, units, and unlim are required.

time_bounds

A numeric vector or two-dimensional matrix. The start and end of each time (or climatological) unit.

vertical_values

A numeric vector or NULL. The values along the vertical dimension (if present), in units as described by vertical_attributes.

vertical_attributes

A list of named character strings defining the netCDF vertical dimension, e.g., soil depth. Elements units and positive are required.

vertical_bounds

A numeric vector or two-dimensional matrix. The upper/lower limits of each vertical unit.

global_attributes

A list of named character strings defining the global attributes of the netCDF.

overwrite

A logical value. If TRUE, file will be overwritten if it already exists.

nc_compression

A logical value. If TRUE, then the netCDF is created using compression arguments nc_shuffle, nc_deflate, and nc_chunks. Compression is turned off by default.

nc_shuffle

A logical value. If TRUE, then the shuffle filter is turned on, which can improve compression. Used only if nc_compression is TRUE.

nc_deflate

An integer between 1 and 9 (with increasing compression) or NA to turn off compression. Used only if nc_compression is TRUE.

nc_chunks

A character string, NA, or an integer vector. See details. The default "by_zt" creates chunks for the entire xy-space and for each vertical and each time step. Used only if nc_compression is TRUE.

verbose

A logical value.
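
As a minimal sketch of how time_values and time_bounds relate to time_attributes, the following base R code prepares monthly time steps for a hypothetical year. The choice of 2020, the use of the midpoint of each month as its time value, and all object names are assumptions made for illustration; they are not prescribed by the package.

```r
# Origin must agree with time_attributes[["units"]]
origin <- as.Date("1900-01-01")

# First day of each month in 2020 plus the first day of January 2021
month_starts <- seq(as.Date("2020-01-01"), by = "month", length.out = 13)

# One row per time step; columns are start and end in days since origin
time_bounds <- cbind(
  as.numeric(month_starts[-13] - origin),
  as.numeric(month_starts[-1] - origin)
)

# Assumption: represent each month by the midpoint of its bounds
time_values <- rowMeans(time_bounds)

time_attributes <- list(
  units = paste("days since", format(origin)),
  calendar = "standard",
  unlim = FALSE
)
```

The resulting objects have one row (or element) per time step, which matches the default shape of time_bounds, i.e., a matrix with length(time_values) rows and two columns.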

Value

This function is used for the side-effect of creating a netCDF file. Data values are written to the file if provided as argument data.

Details

Values can be written to the file at a later time using function populate_netCDF.

The created netCDF is suitable for three data situations:

  1. one variable and xy-space, time and vertical dimensions

  2. one variable and xy-space and time or vertical dimensions

  3. one or multiple variables and xy-space dimensions without time/vertical dimensions

Spatial setup

Spatial information about the xy-space is derived from the arguments xyspace, xy_attributes, crs_attributes, data_str, and data_dims.

The xy-space is either gridded (determined by the first two characters of data_str equal to "xy"), or a list of discrete points/sites (determined by the first character of data_str equal to "s").

  • The gridded situation creates x and y dimensions and associated variables in the netCDF file. The size of the xy-space must agree with the elements "n_x" and "n_y" of data_dims and, thus, with the two first dimensions of data, if available.

  • The discrete point/site situation creates a site dimension and associated variable as well as x and y variables for the spatial coordinate values of the sites; see CF point-data. The code will add a coordinates attribute to the variable(s) and a featureType = "point" global attribute. The size of the xy-space, i.e., the number of sites, must agree with the element "n_s" of data_dims and, thus, with the first dimension of data, if available.
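
The rule for distinguishing the two spatial situations can be sketched in base R as follows. The helper describe_xyspace and the example array are hypothetical and only illustrate the description above; they are not part of the package.

```r
# Hypothetical helper: classify the spatial structure implied by data_str
describe_xyspace <- function(data_str) {
  if (startsWith(data_str, "xy")) {
    "gridded"
  } else if (startsWith(data_str, "s")) {
    "discrete"
  } else {
    stop("Unsupported data_str: ", data_str)
  }
}

data_strs <- c("xyzt", "xyt", "xyz", "xy", "szt", "st", "sz", "s")
xy_types <- vapply(data_strs, describe_xyspace, character(1))

# Hypothetical gridded data: 4 cells along x, 3 cells along y, 5 time steps
n_x <- 4
n_y <- 3
data_xyt <- array(0, dim = c(n_x, n_y, 5))

# The first two dimensions of data must agree with the size of the xy-space
stopifnot(dim(data_xyt)[1:2] == c(n_x, n_y))
```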

The crs are checked by default (see argument check_crs) for consistency among crs_attributes, locations, and/or grid. However, this check may fail when locations and/or grid use a PROJ.4 representation that doesn't compare well with a WKT2 representation provided by crs_attributes even if they are the same. Turn off these checks in such cases.

Spatial dimensions of data

For the gridded situation, the data array must be arranged in "expanded" spatial format, i.e., the first two dimensions of the data array span the xy-space. The first dimension, i.e., X, matches gridcells along longitude or a projected x coordinate and the second dimension, i.e., Y, matches gridcells along latitude or a projected y coordinate. The names of the x and y dimensions/variables are defined by xy_attributes[["name"]][1:2].

However, "gridded" data objects are frequently organized by "collapsed" x and y dimensions, e.g., to achieve a sparse representation. Use the function convert_xyspace to expand sparse data arrays before their use by function create_netCDF.

For the discrete point/site situation, data array must be arranged in "collapsed" spatial format, i.e., the first dimension (rows) of data corresponds to the number of points/sites.
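
To make the difference between the "collapsed" and "expanded" formats concrete, here is a small base R sketch with a hypothetical 3 x 2 grid for which only four gridcells carry values. It illustrates the idea only; it is not the implementation of convert_xyspace, and all object names are assumptions.

```r
# Coordinates of the full grid: 3 cells along x, 2 cells along y
x <- c(10, 20, 30)
y <- c(1, 2)

# Collapsed (sparse) format: one row per gridcell that has a value
collapsed <- data.frame(
  x = c(10, 30, 20, 30),
  y = c(1, 1, 2, 2),
  value = c(1.5, 2.5, 3.5, 4.5)
)

# Expanded format: first dimension is X, second dimension is Y;
# gridcells without a value remain NA
expanded <- matrix(NA_real_, nrow = length(x), ncol = length(y))
ids <- cbind(match(collapsed[["x"]], x), match(collapsed[["y"]], y))
expanded[ids] <- collapsed[["value"]]
```

The same collapsed table would already be in a suitable layout for the discrete point/site situation, where rows correspond to sites.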

Non-spatial dimensions of data

The first non-spatial dimension of a data array, if present, corresponds to

  • multiple variables, if data_str is "xy" or "s"

  • time dimension, if data_str is "xyt" or "st"

  • vertical dimension, if data_str is "xyzt", "xyz", "szt", or "sz"

The second non-spatial dimension of the data array, if present, corresponds to the time dimension; this situation arises only in the presence of both a time and vertical dimension, i.e., data_str is "xyzt" or "szt".
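
The following base R sketch lists one array layout for each value of data_str, following the dimension order described above. The dimension sizes are hypothetical and the arrays are filled with zeros for illustration only.

```r
# Hypothetical sizes: grid, sites, vertical layers, time steps, variables
n_x <- 4
n_y <- 3
n_s <- 6
n_z <- 2
n_t <- 5
n_vars <- 3

data_examples <- list(
  # gridded: two spatial dimensions come first
  xyzt = array(0, dim = c(n_x, n_y, n_z, n_t)),
  xyt = array(0, dim = c(n_x, n_y, n_t)),
  xyz = array(0, dim = c(n_x, n_y, n_z)),
  xy = array(0, dim = c(n_x, n_y, n_vars)),
  # discrete sites: one spatial dimension comes first
  szt = array(0, dim = c(n_s, n_z, n_t)),
  st = array(0, dim = c(n_s, n_t)),
  sz = array(0, dim = c(n_s, n_z)),
  s = array(0, dim = c(n_s, n_vars))
)

# Inspect the dimensions of each layout
lapply(data_examples, dim)
```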

Variables

Use CMIP6 standard variable names, units, etc., where available. Standardized variable names can be searched in the CMIP6-cmor-tables.

Chunking

The argument nc_chunks offers two auto-determined chunking schemes:

"by_zt"

create chunks for the entire xy-space and for each vertical and each time step

"by_t"

create chunks for the entire xy-space and all vertical steps (if present) and for each time step

Alternatively, the user can provide an integer vector with a length equal to the number of the dimensions according to data_dims.
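
As a rough illustration of the two schemes, the chunk sizes they describe can be written out by hand for a hypothetical gridded "xyzt" array. The dimension sizes, the object names, and the ordering of the elements (x, y, vertical, time) are assumptions for this sketch; the package determines the actual chunk sizes internally.

```r
# Hypothetical dimension sizes of a gridded "xyzt" data array
dims <- c(n_x = 100L, n_y = 50L, n_z = 4L, n_t = 365L)

# "by_zt": entire xy-space, one vertical step and one time step per chunk
chunks_by_zt <- c(dims[["n_x"]], dims[["n_y"]], 1L, 1L)

# "by_t": entire xy-space and all vertical steps, one time step per chunk
chunks_by_t <- c(dims[["n_x"]], dims[["n_y"]], dims[["n_z"]], 1L)

# Number of chunks implied by each scheme
n_chunks_by_zt <- prod(ceiling(dims / chunks_by_zt))
n_chunks_by_t <- prod(ceiling(dims / chunks_by_t))
```

With these sizes, "by_zt" yields one chunk per combination of vertical and time step, whereas "by_t" yields one chunk per time step. An integer vector of this form, with one element per dimension, is the kind of value a user could supply instead of the two named schemes.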

References

CF conventions

See Also

populate_netCDF, read_netCDF

Examples

# Prepare data for examples
tmp_nc <- create_example_netCDFs(tempdir(), c("xyt", "szt"), "timeseries")
data_xyt <- read_netCDF(tmp_nc[["xyt"]], "array", xy_names = c("x", "y"))
data_szt <- read_netCDF(tmp_nc[["szt"]], "array", xy_names = c("x", "y"))

# Prepare attribute lists
nc_att_global <- list(
  title = "Example netCDF of package rSW2st",
  version = paste0("v", format(Sys.Date(), "%Y%m%d")),
  source_id = "SOILWAT2",
  further_info_url = "https://github.com/DrylandEcology/",
  source_type = "LAND",
  realm = "land",
  product = "model-output",
  grid = "native Albers projection grid with NAD83 datum",
  grid_label = "gn",
  nominal_resolution = "1 m"
)

nc_att_crs <- list(
  crs_wkt = sf::st_crs("EPSG:6350")$Wkt,
  grid_mapping_name = "albers_conical_equal_area",
  standard_parallel = c(29.5, 45.5),
  longitude_of_central_meridian = -96.0,
  latitude_of_projection_origin = 23.0,
  false_easting = 0.0,
  false_northing = 0.0,
  # GRS 1980 ellipsoid
  longitude_of_prime_meridian = 0,
  semi_major_axis = 6378137.0,
  inverse_flattening = 298.257222101
)

nc_att_xy <- list(
  name = c("x", "y"),
  standard_name = c("projection_x_coordinate", "projection_y_coordinate"),
  long_name = c("x coordinate of projection", "y coordinate of projection"),
  units = c("m", "m")
)

## Write netCDF for gridded data
# Match the file extension literally (an unescaped "." matches any character)
tmp_nc[["xyt2"]] <- sub(".nc", "2.nc", tmp_nc[["xyt"]], fixed = TRUE)

create_netCDF(
  filename = tmp_nc[["xyt2"]],
  xyspace = data_xyt[["xyspace"]],
  data = data_xyt[["data"]],
  data_str = "xyt",
  var_attributes = list(name = "sine", units = "1"),
  xy_attributes = nc_att_xy,
  crs_attributes = nc_att_crs,
  time_values = data_xyt[["time_values"]],
  type_timeaxis = "timeseries",
  global_attributes = nc_att_global
)

data_xyt2 <- read_netCDF(tmp_nc[["xyt2"]], "array", xy_names = c("x", "y"))
all.equal(data_xyt2[["data"]], data_xyt[["data"]])


## Write netCDF for discrete data
# Match the file extension literally (an unescaped "." matches any character)
tmp_nc[["szt2"]] <- sub(".nc", "2.nc", tmp_nc[["szt"]], fixed = TRUE)

create_netCDF(
  filename = tmp_nc[["szt2"]],
  xyspace = as.data.frame(data_szt[["xyspace"]][1:2]),
  data = data_szt[["data"]],
  data_str = "szt",
  var_attributes = list(name = "sine", units = "1"),
  xy_attributes = nc_att_xy,
  crs_attributes = nc_att_crs,
  time_values = data_szt[["time_values"]],
  type_timeaxis = "timeseries",
  vertical_values = data_szt[["vertical_values"]],
  vertical_attributes = list(units = "m", positive = "down"),
  global_attributes = nc_att_global
)


data_szt2 <- read_netCDF(tmp_nc[["szt2"]], "array", xy_names = c("x", "y"))
all.equal(data_szt2[["data"]], data_szt[["data"]])

# Cleanup
unlink(unlist(tmp_nc))


DrylandEcology/rSW2st documentation built on Jan. 10, 2024, 6:22 p.m.