
NetCDF for Water Forecasting Conventions v2.0

Foreword

The present document in the efts repository is a possibly temporary copy, kept for convenient reference. The point of truth is at a separate location, which is not yet public as of March 2018.

Credits for the original document go to James Bennett (CSIRO).

Purpose

Plain text files are not well suited to storing the large volumes of data generated for and by ensemble streamflow forecasts with numerical weather prediction models. netCDF is a binary file format developed primarily for climate, ocean and meteorological data. Detailed, formalised descriptions of the data (metadata) can be included inside the netCDF file, and netCDF can store highly compressed data, making the format suitable for the STF project. However, netCDF has traditionally been used to store time slices of gridded data, rather than complete time series of point data. This document describes the conventions we have developed for storing complete time series data used in ensemble streamflow forecasting in netCDF.

NetCDF Introduction and Terms

NetCDF is a binary format, so its files cannot be read in a text editor. The binary encoding also significantly reduces data size, removes ambiguity in the format, and gives platform independence and implementation independence. The implementation independence arises from the use of standard libraries for reading and writing the NetCDF format. All tools in this project that use the netCDF format, including SWIFT, use these libraries.

The netCDF format uses dimensions and variables to store data. Data is stored in variables. Each variable can be considered as an array and is independent of every other variable. The data space of these variables is defined by the dimensions. For example, for a gridded rainfall data set, the dimensions may be latitude and longitude, and the variable may be rainfall in millimetres per day.

Metadata is stored in netCDF format as attributes. Attributes can be defined as global, applying to the whole data set, or defined as specific to a variable. For instance, the origin of a variable (e.g. Rain gauge) may be stored specifically for that variable, whereas the agency responsible for the data set may be stored as a global attribute.
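
The relationship between dimensions, variables and attributes can be illustrated with a small in-memory model. This is only a sketch in plain Python, not the API of any netCDF library: the `Dataset` and `Variable` classes, the helper `shape_of`, and the attribute names `institution` and `source` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Variable:
    """Hypothetical stand-in for a netCDF variable: an array over named dimensions."""
    dimensions: tuple                                 # names of the dimensions, in order
    data: list                                        # nested lists, one level per dimension
    attributes: dict = field(default_factory=dict)    # variable-specific metadata


@dataclass
class Dataset:
    """Hypothetical stand-in for a netCDF file."""
    dimensions: dict                                  # dimension name -> length
    variables: dict = field(default_factory=dict)     # variable name -> Variable
    attributes: dict = field(default_factory=dict)    # global metadata

    def expected_shape(self, name):
        """Shape implied by the dimensions that the named variable is defined over."""
        return tuple(self.dimensions[d] for d in self.variables[name].dimensions)


def shape_of(nested):
    """Shape of a regular nested list."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        nested = nested[0]
    return tuple(shape)


# Gridded rainfall example: two latitudes by three longitudes.
ds = Dataset(
    dimensions={"latitude": 2, "longitude": 3},
    attributes={"institution": "example agency"},     # global attribute
)
ds.variables["rain"] = Variable(
    dimensions=("latitude", "longitude"),
    data=[[0.0, 1.2, 3.4], [0.0, 0.5, 2.1]],
    attributes={"units": "mm/day", "source": "rain gauge"},  # variable attributes
)

# The data array must fill the space defined by its dimensions.
assert shape_of(ds.variables["rain"].data) == ds.expected_shape("rain")
```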

Version

This is the second version of this specification.

Schematic

The netCDF specification has been inspired by the Deltares NETCDF-CF_TIMESERIES structure for compatibility purposes.

Dimensions

The netCDF files have five required dimensions:

Global Attributes

STF NetCDF files have the following global attributes:

List of Variables

The following abbreviations are used to construct variable names:

Mandatory Variables

The data set requires the following variables (dimensions are in brackets):

Optional Variables

Optional variables (dimensions given in brackets):

Geolocation:

Observations and simulations:

Quality codes:

Description of Variables

Dimensions and attributes of mandatory and optional variables are described below.

time

Description: Time vector (int32)

Dimensions:

  1. time

Attributes

Description | Name | Type | Example
--- | --- | --- | ---
The short name for the variable | standard_name | String | time
The long name for the variable | long_name | String | time
Time units | units | String | hours since 1970-01-01 00:00:00.0 +0000
Time standard from which times are offset | time_standard | String | UTC
Axis label | axis | String | t

The units can also be days or months since 1970-01-01. They are in UTC by default.
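
As a minimal sketch of how a reader might turn the integer time vector into time stamps, the code below parses a units string of the form shown in the example and applies the offsets. The function names `parse_time_units` and `decode_times` are hypothetical, and the sketch assumes the origin is always written exactly as `YYYY-MM-DD HH:MM:SS.f +ZZZZ`. Only hours and days are handled here, because months need the special treatment described in NB#1.

```python
from datetime import datetime, timedelta, timezone

# Fixed-length steps only; months are deliberately left out.
_STEPS = {"hours": timedelta(hours=1), "days": timedelta(days=1)}


def parse_time_units(units):
    """Split e.g. 'hours since 1970-01-01 00:00:00.0 +0000' into (step name, origin)."""
    step_name, origin_text = units.split(" since ", 1)
    origin = datetime.strptime(origin_text, "%Y-%m-%d %H:%M:%S.%f %z")
    return step_name, origin


def decode_times(values, units):
    """Convert integer offsets into timezone-aware datetimes."""
    step_name, origin = parse_time_units(units)
    if step_name not in _STEPS:
        raise ValueError(f"unsupported time step in this sketch: {step_name!r}")
    return [origin + value * _STEPS[step_name] for value in values]


stamps = decode_times([0, 24], "hours since 1970-01-01 00:00:00.0 +0000")
assert stamps[1] == datetime(1970, 1, 2, tzinfo=timezone.utc)
```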

NB#1: Using units of months requires special treatment. When adding months to a given time, the addition method depends on the day of the month of the time units, as follows:

  1. If the day of the month specified in the time units is less than 24, simply add months. E.g. time units are 'months since 1970-02-15 00:00:00.0 +0000'. Adding one month yields a time stamp of 1970-03-15 00:00:00.0 +0000
  2. If the day of month is greater than or equal to 24, the time stamp is calculated by counting back from the end of a given month. E.g. time units are 'months since 1970-02-26 00:00:00.0 +0000'. Adding one month yields a time stamp of 1970-03-29 00:00:00.0 +0000
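
The two rules above can be sketched as follows. This is one reading of the rule, assuming that "counting back from the end" means keeping the same number of days before the end of the month. The function name `add_months` is hypothetical.

```python
import calendar
from datetime import datetime


def add_months(origin, n_months):
    """Add whole months to `origin` following the two rules for monthly time units."""
    # Work out the target year and month.
    month_index = origin.year * 12 + (origin.month - 1) + n_months
    year, month = divmod(month_index, 12)
    month += 1

    if origin.day < 24:
        # Rule 1: keep the same day of the month.
        day = origin.day
    else:
        # Rule 2: keep the same distance from the end of the month.
        days_from_end = calendar.monthrange(origin.year, origin.month)[1] - origin.day
        day = calendar.monthrange(year, month)[1] - days_from_end
    return origin.replace(year=year, month=month, day=day)


# The two examples given in the text.
assert add_months(datetime(1970, 2, 15), 1) == datetime(1970, 3, 15)
assert add_months(datetime(1970, 2, 26), 1) == datetime(1970, 3, 29)
```

Under this reading, an origin on the last day of a month always maps to the last day of the target month.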

NB#2: When data are not forecasts, the first value should indicate over which period the variables are aggregated - i.e., do use values of zero (see Description of time types).

station_id

Description: Station identification number (int32)

Dimensions:

  1. station

Attributes

Description | Name | Type | Example
--- | --- | --- | ---
The long name for the variable | long_name | String | station or node identification code

station_name

Description: Station name (string)

Dimensions:

  1. strLen
  2. station

Attributes

Description | Name | Type | Example
--- | --- | --- | ---
The long name for the variable | long_name | String | station or node name

ens_member

Description: Vector of ensemble member indices, running from 1 to the number of ensemble members. The vector has a minimum length of 1. (int32)

Dimensions:

  1. ens_member

Attributes

Description | Name | Type | Example
--- | --- | --- | ---
The short name for the variable | standard_name | String | ens_member
The long name for the variable | long_name | String | ensemble member
Units | units | String | member id
Axis label | axis | String | u

lead_time

Description: Vector giving the time elapsed since a forecast was issued. If the variable is not a forecast, this vector can have a length of zero (int32)

Dimensions:

  1. lead_time

Attributes

Description | Name | Type | Example
--- | --- | --- | ---
The short name for the variable | standard_name | String | lead time
The long name for the variable | long_name | String | forecast lead time
Units | units | String | hours since time
Axis label | axis | String | u

The units can also be days or months since time of forecast.
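
Because lead times are offsets from the forecast issue time, the time stamp each forecast value applies to can be recovered by adding the lead time to the issue time. The sketch below assumes hourly lead time units. The function name `valid_times` and the issue time used are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def valid_times(issue_time, lead_times, step=timedelta(hours=1)):
    """Time stamps that each lead time refers to, for one forecast issue time."""
    return [issue_time + lead * step for lead in lead_times]


issued = datetime(2010, 8, 1, 0, tzinfo=timezone.utc)  # illustrative issue time
stamps = valid_times(issued, [1, 2, 3])
assert stamps[-1] == datetime(2010, 8, 1, 3, tzinfo=timezone.utc)
```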

lat

Description: Vector of latitudes of stations in decimal degrees (single)

Dimensions:

  1. station

Attributes

Description | Name | Type | Example
--- | --- | --- | ---
The long name for the variable | long_name | String | latitude
Units | units | String | degrees_north
Axis label | axis | String | y

lon

Description: Vector of longitudes of stations in decimal degrees (single)

Dimensions:

  1. station

Attributes

Description | Name | Type | Example
--- | --- | --- | ---
The long name for the variable | long_name | String | longitude
Units | units | String | degrees_east
Axis label | axis | String | x

y

Description: Position vector in projected coordinates (single)

Dimensions:

  1. station

Attributes

Description | Name | Type | Example
--- | --- | --- | ---
The short name for the variable | standard_name | String | northing_GDA94_zone55
The long name for the variable | long_name | String | northing from the GDA94 datum in MGA Zone 55
Axis label | axis | String | y

x

Description: Position vector in projected coordinates (single)

Dimensions:

  1. station

Attributes

Description | Name | Type | Example
--- | --- | --- | ---
The short name for the variable | standard_name | String | easting_GDA94_zone55
The long name for the variable | long_name | String | easting from the GDA94 datum in MGA Zone 55
Axis label | axis | String | x

area

Description: Area over which non-point data apply (e.g. subcatchment area) (single)

Dimensions:

  1. station

Attributes

Description | Name | Type | Example
--- | --- | --- | ---
The short name for the variable | standard_name | String | area
The long name for the variable | long_name | String | station area
Units | units | String | sqm

elevation

Description: Elevation of station (single)

Dimensions:

  1. station

Attributes

Description | Name | Type | Example
--- | --- | --- | ---
The short name for the variable | standard_name | String | elevation
The long name for the variable | long_name | String | station elevation above sea level
Units | units | String | m

pet_obs/rain_obs/q_obs/swe_obs/tmin_obs/tmax_obs/tave_obs

Description: Observed data (double)

pet = potential evapotranspiration; rain = precipitation; q = streamflow; swe = snow water equivalent; tmin = minimum surface air temperature; tmax = maximum surface air temperature; tave = average surface air temperature

Example: q_obs

Dimensions:

  1. lead_time
  2. station
  3. ens_member
  4. time

Attributes

Description | Name | Type | Example
--- | --- | --- | ---
The long name for the variable | long_name | String | observed rainfall
Units | units | String | mm
Missing data value | _FillValue | float | -9999f
Type of aggregation | type | int | 2
Description of type of aggregation. | type_description | String | accumulated over the preceding interval
Type of data. Code as follows: "obs" - observed directly; "der" - derived from observations | dat_type | string | der
Description of type of data | dat_type_description | string | AWAP data interpolated from observations
Location type of data. Takes value of "Point" (e.g. for a rain gauge) or "Area" (e.g. for a subarea). Default value is "Point". | location_type | String | Point
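
When reading such a variable, values equal to `_FillValue` should be treated as missing rather than as data. The sketch below shows one way to do this for a flat list of values, assuming the example fill value of -9999. The function name `mask_missing` is hypothetical, and replacing missing values with NaN is a choice made for this illustration.

```python
import math

FILL_VALUE = -9999.0  # the example _FillValue from the attribute table


def mask_missing(values, fill_value=FILL_VALUE):
    """Replace fill values with NaN so they cannot be mistaken for observations."""
    return [math.nan if value == fill_value else value for value in values]


rain = mask_missing([1.5, -9999.0, 0.0])
assert rain[0] == 1.5 and math.isnan(rain[1]) and rain[2] == 0.0
```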

pet_sim/rain_sim/q_sim/swe_sim/tmin_sim/tmax_sim/tave_sim

Description: Simulated data (double)

pet = potential evapotranspiration; rain = precipitation; q = streamflow; swe = snow water equivalent; tmin = minimum surface air temperature; tmax = maximum surface air temperature; tave = average surface air temperature

Dimensions:

  1. lead_time
  2. station
  3. ens_member
  4. time

Attributes

Description | Name | Type | Example
--- | --- | --- | ---
The long name for the variable | long_name | String | simulated rainfall
Units | units | String | m3/s
Missing data value | _FillValue | float | -9999f
Type of aggregation | type | int | 3
Description of type of aggregation. | type_description | String | averaged over the preceding interval
Type of data. Code as follows: "sim" - simulated from historical forcings; "fct" - forecast | dat_type | string | fct
Description of type of data | dat_type_description | string | forecast data
Location type of data. Takes value of "Point" (e.g. for a rain gauge) or "Area" (e.g. for a subarea). Default value is "Point". | location_type | String | Point

[variable]_obs_qul/[variable]_sim_qul

Description: Data quality

Dimensions:

  1. lead_time
  2. station
  3. ens_member
  4. time

Attributes

Description | Name | Type | Example
--- | --- | --- | ---
The long name for the variable | long_name | String | Quality of observed rainfall
Quality code standard | units | String | ABC Quality coding
Missing data value | _FillValue | int | -1

sv1/sv2/sv[#]

Description: State variables (double)

Dimensions:

  1. lead_time
  2. station
  3. ens_member
  4. time

Attributes

Description | Name | Type | Example
--- | --- | --- | ---
The long name for the variable | long_name | String | state var 1
Name of model | model_name | String | GR4H_RR
Name of state variable in model | sv_name | String | UH_Inflow
Description of state variable | sv_description | String | Total inflow to Unit Hydrographs in GR4H
Missing data value | _FillValue | float | -9999f

Description of time types

Type ID | Description | Example variable
--- | --- | ---
1 | instantaneous data | stage height
2 | accumulated over the preceding interval | rainfall
3 | averaged over the preceding interval | flow, average temp
4 | accumulated since start of forecast | flow
5 | point value recorded in the preceding interval | max/min temperature
11 | climatology data - instantaneous data | climatology stage height
12 | climatology data - accumulated over the preceding interval | climatology rainfall
13 | climatology data - averaged over the preceding interval | climatology flow
14 | climatology data - accumulated since start of forecast | climatology flow
15* | climatology data - point value recorded in the preceding interval | climatology max temp

*NB - please specify the period over which climatology data is calculated and how it is calculated in the global "comment" attribute, as well as any applicable references in the "source" global attribute.
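
In the table, each climatology type ID is the corresponding base type ID plus 10. A reader could use that pattern to derive the `type_description` text from the `type` attribute, as in the sketch below. The names `BASE_TIME_TYPES` and `describe_time_type` are hypothetical.

```python
# Base time types, copied from the table above.
BASE_TIME_TYPES = {
    1: "instantaneous data",
    2: "accumulated over the preceding interval",
    3: "averaged over the preceding interval",
    4: "accumulated since start of forecast",
    5: "point value recorded in the preceding interval",
}


def describe_time_type(type_id):
    """Return the description for a time type ID, including climatology IDs 11 to 15."""
    is_climatology = type_id > 10
    base_id = type_id - 10 if is_climatology else type_id
    if base_id not in BASE_TIME_TYPES:
        raise ValueError(f"unknown time type ID: {type_id}")
    description = BASE_TIME_TYPES[base_id]
    return f"climatology data - {description}" if is_climatology else description


assert describe_time_type(2) == "accumulated over the preceding interval"
assert describe_time_type(13) == "climatology data - averaged over the preceding interval"
```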

Description of data types

Type ID | Description | Example variable
--- | --- | ---
obs | observed directly | gauged rainfall
der | derived from observations | AWAP rainfall
sim | simulated from observations | flow simulated by GR4H forced by observations
fct | simulated from forecasts | flow forecast by GR4H forced by NWP forecasts
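
The attribute tables for the `_obs` and `_sim` variables list "obs" and "der" as the codes for observed variables, and "sim" and "fct" as the codes for simulated variables. A simple consistency check on the `dat_type` attribute could look like the sketch below. The names `ALLOWED_DAT_TYPES` and `check_dat_type` are hypothetical, and the check assumes the variable name ends in `_obs` or `_sim`.

```python
# Data type codes allowed for each variable suffix.
ALLOWED_DAT_TYPES = {
    "obs": {"obs", "der"},
    "sim": {"sim", "fct"},
}


def check_dat_type(variable_name, dat_type):
    """Return True if `dat_type` is a valid code for the given variable name."""
    suffix = variable_name.rsplit("_", 1)[-1]
    if suffix not in ALLOWED_DAT_TYPES:
        raise ValueError(f"expected a name ending in _obs or _sim, got {variable_name!r}")
    return dat_type in ALLOWED_DAT_TYPES[suffix]


assert check_dat_type("rain_obs", "der")
assert not check_dat_type("q_sim", "obs")
```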



jmp75/efts documentation built on Feb. 3, 2023, 2:44 p.m.