var.get.nc: Read Data from a NetCDF Variable


var.get.nc    R Documentation

Read Data from a NetCDF Variable

Description

Read the contents of a NetCDF variable.

Usage

var.get.nc(ncfile, variable, start=NA, count=NA,
  na.mode=4, collapse=TRUE, unpack=FALSE, rawchar=FALSE, fitnum=FALSE,
  cache_bytes=NA, cache_slots=NA, cache_preemption=NA)

Arguments

Arguments marked "netcdf4" are optional for datasets in that format and ignored for other formats.

ncfile

Object of class NetCDF which points to the NetCDF dataset (as returned from open.nc).

variable

ID or name of the NetCDF variable.

start

A vector of indices specifying the element where reading starts along each dimension of variable. Indices are numbered from 1 onwards, and the order of dimensions is shown by print.nc (array elements are stored sequentially with leftmost indices varying fastest). By default (start=NA), all dimensions of variable are read from the first element onwards. Otherwise, start must be a vector whose length is not less than the number of dimensions in variable (excess elements are ignored). Any NA values in vector start are set to 1.

count

A vector of integers specifying the number of values to read along each dimension of variable. The order of dimensions is the same as for start. By default (count=NA), all dimensions of variable are read from start to end. Otherwise, count must be a vector whose length is not less than the number of dimensions in variable (excess elements are ignored). Any NA value in vector count indicates that the corresponding dimension should be read from the start index to the end of the dimension.

na.mode

Missing values in the NetCDF dataset are converted to NA values in the result returned to R. The missing values are defined by attributes of the NetCDF variable, which are selected by the following modes:

mode  data type  attribute(s)
0     numeric    _FillValue, then missing_value
1     numeric    _FillValue only
2     numeric    missing_value only
3     any        no conversion
4     numeric    valid_range, valid_min, valid_max, _FillValue
5     any        same as mode 4 for numeric types; _FillValue for other types

For explanation of attribute conventions used by mode 4, please see: https://docs.unidata.ucar.edu/nug/current/attribute_conventions.html

collapse

TRUE if degenerate dimensions (length=1) should be omitted from the result. Default is TRUE.

unpack

Packed variables are unpacked if unpack=TRUE and the attributes add_offset and/or scale_factor are defined. Default is FALSE.

rawchar

This option applies only to NetCDF variables of type NC_CHAR. When rawchar is FALSE (default), a NetCDF variable of type NC_CHAR is converted to a character array in R. Each character string is formed from the bytes along the fastest-varying dimension of the NetCDF variable, so the R character array has one fewer dimension than the NC_CHAR array. If rawchar is TRUE, the bytes of NC_CHAR data are read into an R raw array of the same shape.

fitnum

By default, all numeric variables are read into R as double precision values. When fitnum is TRUE, the smallest R numeric type that can exactly represent each external type is used, as follows:

NetCDF type  R type
NC_BYTE      integer
NC_UBYTE     integer
NC_SHORT     integer
NC_USHORT    integer
NC_INT       integer
NC_UINT      double
NC_FLOAT     double
NC_DOUBLE    double
NC_INT64     integer64
NC_UINT64    integer64

cache_bytes

("netcdf4") Size of chunk cache in bytes. Value of NA (default) implies no change.

cache_slots

("netcdf4") Number of slots in chunk cache. Value of NA (default) implies no change.

cache_preemption

("netcdf4") Value between 0 and 1 (inclusive) that biases the cache scheme towards eviction of chunks that have been fully read. Value of NA (default) implies no change.

Details

NetCDF numeric variables cannot portably represent NA values from R. NetCDF does allow attributes to be defined for variables, and several conventions exist for attributes that define missing values and valid ranges. The convention in use can be specified by argument na.mode. Values of a NetCDF variable that are deemed to be missing are automatically converted to NA in the results returned to R. Unusual cases can be handled directly in user code by setting na.mode=3.
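The effect of na.mode can be seen with a small round trip. The following is a minimal sketch, assuming the RNetCDF package is installed; the file, the dimension "n", the variable "t" and the fill value -999 are made up for illustration.

```r
library(RNetCDF)

# Hypothetical scratch file with one variable that declares a _FillValue
f <- tempfile(fileext = ".nc")
nc <- create.nc(f)
dim.def.nc(nc, "n", 3)
var.def.nc(nc, "t", "NC_DOUBLE", "n")
att.put.nc(nc, "t", "_FillValue", "NC_DOUBLE", -999)

# The NA is written to the file as the fill value
var.put.nc(nc, "t", c(1.5, NA, 3.5))
sync.nc(nc)

with_na <- var.get.nc(nc, "t")               # default na.mode=4: fill value becomes NA
as_stored <- var.get.nc(nc, "t", na.mode = 3)  # no conversion: fill value is returned

close.nc(nc)
unlink(f)
```

With na.mode=3, the caller sees the stored fill value and can apply a custom missing-value rule in R code.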

To reduce the storage space required by a NetCDF file, numeric variables are sometimes packed into types of lower precision. The original data can be recovered (approximately) by multiplication of the stored values by attribute scale_factor followed by addition of attribute add_offset. This unpacking operation is performed automatically for variables with attributes scale_factor and/or add_offset if argument unpack is set to TRUE. If unpack is FALSE, values are read from each variable without alteration.
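As an illustration of the unpacking rule (stored value times scale_factor, plus add_offset), here is a minimal sketch assuming the RNetCDF package is installed. The variable "p" and the attribute values 0.5 and 10 are arbitrary choices for the example.

```r
library(RNetCDF)

# Hypothetical packed variable: 16-bit integers with scale and offset attributes
f <- tempfile(fileext = ".nc")
nc <- create.nc(f)
dim.def.nc(nc, "n", 3)
var.def.nc(nc, "p", "NC_SHORT", "n")
att.put.nc(nc, "p", "scale_factor", "NC_DOUBLE", 0.5)
att.put.nc(nc, "p", "add_offset", "NC_DOUBLE", 10)

# Write the stored (already packed) integer values directly
var.put.nc(nc, "p", c(0L, 2L, 4L))
sync.nc(nc)

stored   <- var.get.nc(nc, "p")                 # values exactly as stored
unpacked <- var.get.nc(nc, "p", unpack = TRUE)  # stored * scale_factor + add_offset

close.nc(nc)
unlink(f)
```

Here unpacked should equal stored * 0.5 + 10, which is the same arithmetic a user would apply by hand after reading with unpack=FALSE.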

Data in a NetCDF variable are represented as a multi-dimensional array. The number and lengths of the dimensions are determined when the variable is created. The start and count arguments of this routine indicate where reading starts and how many values are read along each dimension.
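The sketch below shows how start and count select a block of a two-dimensional variable, including the use of NA inside either vector. It assumes the RNetCDF package is installed; the dimensions "x" and "y" and the variable "v" are invented for the example.

```r
library(RNetCDF)

# Hypothetical 4 x 3 integer variable filled with 1..12
f <- tempfile(fileext = ".nc")
nc <- create.nc(f)
dim.def.nc(nc, "x", 4)
dim.def.nc(nc, "y", 3)
var.def.nc(nc, "v", "NC_INT", c("x", "y"))
m <- matrix(1:12, nrow = 4, ncol = 3)
var.put.nc(nc, "v", m)
sync.nc(nc)

whole <- var.get.nc(nc, "v")                                      # everything
block <- var.get.nc(nc, "v", start = c(2, 1), count = c(2, 3))    # x = 2..3, all y
tail2 <- var.get.nc(nc, "v", start = c(NA, 2), count = c(NA, 2))  # all x, y = 2..3

close.nc(nc)
unlink(f)
```

In this sketch, block corresponds to m[2:3, ] and tail2 to m[, 2:3], because an NA in start means index 1 and an NA in count means "to the end of the dimension".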

Setting the argument collapse to FALSE keeps degenerate dimensions in the result. By default, array dimensions with length=1 are omitted (e.g., an array with dimensions [2,1,3,4] in the NetCDF dataset is returned as [2,3,4]).
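A short sketch of the collapse behaviour, assuming the RNetCDF package is installed; the dimension names "a", "b", "c" and the variable "v" are hypothetical.

```r
library(RNetCDF)

# Hypothetical variable with a length-1 middle dimension
f <- tempfile(fileext = ".nc")
nc <- create.nc(f)
dim.def.nc(nc, "a", 2)
dim.def.nc(nc, "b", 1)
dim.def.nc(nc, "c", 3)
var.def.nc(nc, "v", "NC_INT", c("a", "b", "c"))
var.put.nc(nc, "v", array(1:6, dim = c(2, 1, 3)))
sync.nc(nc)

dropped <- var.get.nc(nc, "v")                    # default collapse=TRUE
kept    <- var.get.nc(nc, "v", collapse = FALSE)  # keep the length-1 dimension

close.nc(nc)
unlink(f)
```

Keeping degenerate dimensions can be convenient when downstream code indexes the array with a fixed number of subscripts.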

A common source of confusion is the order of dimensions: NetCDF data are written with the last dimension varying fastest, whereas R arrays are stored with the first dimension varying fastest. Thus, the order of the dimensions according to the CDL conventions (e.g., time, latitude, longitude) is reversed in the R array (e.g., longitude, latitude, time).

Value

An array with dimensions determined by count and a data type that depends on the type of variable. For NetCDF variables of type NC_CHAR, the R type is either character or raw, as specified by argument rawchar. For NC_STRING, the R type is character. Numeric variables are read as double precision by default, but the smallest R type that exactly represents each external type is used if fitnum is TRUE.
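The effect of fitnum on the returned type can be checked with typeof. This is a minimal sketch, assuming the RNetCDF package is installed; the variable "i" is made up for the example, and NC_SHORT is used because its fitted R type is plain integer.

```r
library(RNetCDF)

# Hypothetical 16-bit integer variable
f <- tempfile(fileext = ".nc")
nc <- create.nc(f)
dim.def.nc(nc, "n", 3)
var.def.nc(nc, "i", "NC_SHORT", "n")
var.put.nc(nc, "i", 1:3)
sync.nc(nc)

as_double <- var.get.nc(nc, "i")                 # default: double precision
as_fitted <- var.get.nc(nc, "i", fitnum = TRUE)  # smallest exact R type

type_default <- typeof(as_double)
type_fitted  <- typeof(as_fitted)

close.nc(nc)
unlink(f)
```

Reading with fitnum=TRUE may reduce memory use for large integer variables, at the cost of type handling that differs between variables.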

Variables of user-defined types are supported. "compound" arrays are read into R as lists, with items named for the compound fields; items of base NetCDF data types are converted to R arrays, with leading dimensions from the field dimensions (if any) and trailing dimensions from the NetCDF variable. "enum" arrays are read into R as factor arrays. "opaque" arrays are read into R as raw (byte) arrays, with a leading dimension for bytes of the opaque type and trailing dimensions from the NetCDF variable. "vlen" arrays are read into R as a list with dimensions of the NetCDF variable; items in the list may have different lengths; base NetCDF data types are converted to R vectors.

The dimension order in the R array is reversed relative to the order reported by NetCDF commands such as ncdump, because NetCDF arrays are stored in row-major (C) order whereas R arrays are stored in column-major (Fortran) order.
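The reversal can be verified with base R alone. The sketch below uses a hypothetical CDL declaration v(time = 2, lat = 3, lon = 4) and checks that a zero-based C index [t][la][lo] and the R index [lo + 1, la + 1, t + 1] address the same stored element.

```r
# Dimension order as a tool such as ncdump would report it (hypothetical variable)
cdl_dims <- c(time = 2, lat = 3, lon = 4)
r_dims   <- rev(cdl_dims)   # order seen in R: lon, lat, time

# Values numbered by their position in storage order
stored <- seq_len(prod(cdl_dims))
a <- array(stored, dim = unname(r_dims))  # R fills the leftmost index fastest

# Row-major (C) offset of the zero-based index [t][la][lo]
c_offset <- function(t, la, lo) (t * 3 + la) * 4 + lo

t <- 1; la <- 2; lo <- 3
same <- a[lo + 1, la + 1, t + 1] == stored[c_offset(t, la, lo) + 1]
```

Because the two layouts differ only in how indices are written, no data are rearranged when a variable is read; only the order of the subscripts changes.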

Arrays of type character drop the fastest-varying dimension of the corresponding NC_CHAR array, because this dimension corresponds to the length of the individual character elements. For example, an NC_CHAR array with dimensions (5,10) would be returned as a character vector containing 5 elements, each with a maximum length of 10 characters.
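The two ways of reading NC_CHAR data can be compared as follows. This is a minimal sketch, assuming the RNetCDF package is installed; the dimensions "len" and "n" and the variable "name" are chosen only for illustration.

```r
library(RNetCDF)

# Hypothetical NC_CHAR variable holding two strings of up to 8 characters
f <- tempfile(fileext = ".nc")
nc <- create.nc(f)
dim.def.nc(nc, "len", 8)
dim.def.nc(nc, "n", 2)
var.def.nc(nc, "name", "NC_CHAR", c("len", "n"))
var.put.nc(nc, "name", c("alfa", "bravo"), c(1, 1), c(8, 2))
sync.nc(nc)

txt   <- var.get.nc(nc, "name")                  # character strings, one per "n"
bytes <- var.get.nc(nc, "name", rawchar = TRUE)  # raw bytes, shape 8 x 2

# The first four bytes of the first column spell the first string
first <- rawToChar(bytes[1:4, 1])

close.nc(nc)
unlink(f)
```

Reading raw bytes is mainly useful when the stored text is not in an encoding that R should interpret as character data.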

The arguments marked for "netcdf4" format refer to the chunk cache used for reading and writing variables. Default cache settings are defined by the NetCDF library, and they can be adjusted for each variable to improve performance in some applications.
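A sketch of passing chunk cache settings when reading, assuming the RNetCDF package is installed and the underlying NetCDF library supports the "netcdf4" format. The chunk sizes and cache values below are arbitrary illustrations, not tuning recommendations.

```r
library(RNetCDF)

# Hypothetical chunked variable in a netcdf4-format file
f <- tempfile(fileext = ".nc")
nc <- create.nc(f, format = "netcdf4")
dim.def.nc(nc, "x", 20)
dim.def.nc(nc, "y", 20)
var.def.nc(nc, "v", "NC_DOUBLE", c("x", "y"),
           chunking = TRUE, chunksizes = c(10, 10))
m <- matrix(as.numeric(1:400), nrow = 20, ncol = 20)
var.put.nc(nc, "v", m)
sync.nc(nc)

# Same read, once with library defaults and once with explicit cache settings
plain <- var.get.nc(nc, "v")
tuned <- var.get.nc(nc, "v", cache_bytes = 1024^2,
                    cache_slots = 101, cache_preemption = 0.5)

close.nc(nc)
unlink(f)
```

The cache settings affect performance only; the values returned should be identical in both calls.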

Note

NC_BYTE is always interpreted as signed.

Author(s)

Pavel Michna, Milton Woods

References

https://www.unidata.ucar.edu/software/netcdf/

Examples

library(RNetCDF)

##  Create a new NetCDF dataset and define three dimensions
file1 <- tempfile("var.get_", fileext=".nc")
nc <- create.nc(file1)

dim.def.nc(nc, "station", 5)
dim.def.nc(nc, "time", unlim=TRUE)
dim.def.nc(nc, "max_string_length", 32)

##  Create three variables, one as coordinate variable
var.def.nc(nc, "time", "NC_INT", "time")
var.def.nc(nc, "temperature", "NC_DOUBLE", c(0,1))
var.def.nc(nc, "name", "NC_CHAR", c("max_string_length", "station"))

##  Define a _FillValue attribute for temperature
att.put.nc(nc, "temperature", "_FillValue", "NC_DOUBLE", -99999.9)

##  Define variable values
mytime        <- c(1:2)
mytemperature <- c(1.1, 2.2, 3.3, 4.4, 5.5, 6.6, 7.7, NA, NA, 9.9)
myname        <- c("alfa", "bravo", "charlie", "delta", "echo")

##  Put the data
var.put.nc(nc, "time", mytime, 1, length(mytime))
var.put.nc(nc, "temperature", mytemperature, c(1,1), c(5,2))
var.put.nc(nc, "name", myname, c(1,1), c(32,5))

sync.nc(nc)

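##  Get the data (or a subset)
var.get.nc(nc, 0)                                # variable ID 0 ("time")
var.get.nc(nc, "temperature")                    # whole 5 x 2 array
var.get.nc(nc, "temperature", c(3,1), c(1,1))    # station 3 at time 1
var.get.nc(nc, "temperature", c(3,2))            # from station 3, time 2 to the end
var.get.nc(nc, "temperature", c(NA,2), c(NA,1))  # all stations at time 2
var.get.nc(nc, "name")                           # all 5 names
var.get.nc(nc, "name", c(1,2), c(4,2))           # first 4 characters of names 2-3
var.get.nc(nc, "name", c(1,2), c(NA,2))          # full names 2-3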

close.nc(nc)
unlink(file1)

RNetCDF documentation built on May 29, 2024, 2:41 a.m.