read_io: Read LPJmL input and output files
In lpjmlkit: Toolkit for Basic LPJmL Handling

read_io

R Documentation

Read LPJmL input and output files

Description

Generic function to read LPJmL input and output files in different formats. Depending on the format, file attributes are either detected automatically or have to be supplied as individual arguments.

Usage

read_io(
  filename,
  subset = list(),
  band_names = NULL,
  dim_order = c("cell", "time", "band"),
  file_type = NULL,
  version = NULL,
  order = NULL,
  firstyear = NULL,
  nyear = NULL,
  firstcell = NULL,
  ncell = NULL,
  nbands = NULL,
  cellsize_lon = NULL,
  scalar = NULL,
  cellsize_lat = NULL,
  datatype = NULL,
  nstep = NULL,
  timestep = NULL,
  endian = NULL,
  variable = NULL,
  descr = NULL,
  unit = NULL,
  name = NULL,
  silent = FALSE
)

Arguments

`filename`	Mandatory character string giving the file name to read, including its path and extension.
`subset`	Optional list used to subset the data read from the file along one or several of its dimensions. See Details for more information.
`band_names`	Optional vector of character strings providing the band names or `NULL`. Normally determined automatically from the meta file in case of output files using `file_type = "meta"`.
`dim_order`	Order of dimensions in returned LPJmLData object. Must be a character vector containing all of the following in any order: `c("cell", "time", "band")`. Users may select the order most useful to their further data processing.
`file_type`	Optional character string giving the file type. This is normally detected automatically but can be prescribed if automatic detection is incorrect. Valid options: `"raw"`, a binary file without header. `"clm"`, a binary file with header. `"meta"`, a meta information JSON file complementing a raw or clm file.
`version`	Integer indicating the clm file header version, currently supports one of `c(1, 2, 3, 4)`.
`order`	Integer value or character string describing the order of data items in the file (default in input file: 1; in output file: 4). Valid values for LPJmL input/output files are `"cellyear"` / `1`, `"yearcell"` / `2`, `"cellindex"` / `3`, and `"cellseq"` / `4`, although only options `1` and `4` are supported by this function.
`firstyear`	Integer providing the first year of data in the file.
`nyear`	Integer providing the number of years of data included in the file. These are not consecutive in case of `timestep > 1`.
`firstcell`	Integer providing the cell index of the first data item. `0` by default.
`ncell`	Integer providing the number of data items per band.
`nbands`	Integer providing the number of bands per time step of data.
`cellsize_lon`	Numeric value providing the longitude cell size in degrees.
`scalar`	Numeric value providing a conversion factor that needs to be applied to raw data when reading it from file to derive final values.
`cellsize_lat`	Numeric value providing the latitude cell size in degrees.
`datatype`	Integer value or character string describing the LPJmL data type stored in the file. Supported options: `"byte"` / `0`, `"short"` / `1`, `"int"` / `2`, `"float"` / `3`, or `"double"` / `4`.
`nstep`	Integer value defining the number of within-year time steps of the file. Valid values are `1` (yearly), `12` (monthly), `365` (daily). Defaults to `1` if not read from file ("clm" or "meta" file) or provided by the user.
`timestep`	Integer value providing the interval in years between years represented in the file data. Normally `1`, but LPJmL also allows averaging annual outputs over several years. Defaults to `1` if not read from file ("clm" or "meta" file) or provided by user.
`endian`	Endianness to use for the file (either `"big"` or `"little"`). By default, uses the endianness determined from the file header or set in the meta information, falling back to the platform-specific endianness `.Platform$endian` if neither is available.
`variable`	Optional character string providing the name of the variable contained in the file. Included in some JSON meta files. Important: If `file_type == "raw"`, prescribe `variable = "grid"` to ensure that data are recognized as a grid.
`descr`	Optional character string providing a more detailed description of the variable contained in the file. Included in some JSON meta files.
`unit`	Optional character string providing the unit of the data in the file. Included in some JSON meta files.
`name`	Optional character string specifying the header name. This is usually read from clm headers for `file_type = "clm"` but can be specified for the other `file_type` options.
`silent`	If set to `TRUE`, suppresses most warnings or messages. Use only after testing that `read_io()` works as expected with the files it is being used on. Default: `FALSE`.
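
To illustrate what `dim_order` changes, the following is a minimal base R sketch. It uses a plain array as a stand-in for the data and is not lpjmlkit's own code; the dimension labels and values are hypothetical. Only the arrangement of the dimensions changes, not the values.

```r
# Hypothetical 3-dimensional array in the default order: cell, time, band.
data_array <- array(
  seq_len(12),
  dim = c(2, 3, 2),
  dimnames = list(
    cell = c("0", "1"),
    time = c("1901", "1902", "1903"),
    band = c("wheat", "rice")
  )
)

# Rearrange to time, cell, band, analogous to
# dim_order = c("time", "cell", "band").
target_order <- c("time", "cell", "band")
perm <- match(target_order, names(dimnames(data_array)))
reordered <- aperm(data_array, perm = perm)

# The same value is now addressed with the indices in the new order.
data_array["1", "1902", "rice"]
reordered["1902", "1", "rice"]
```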

Details

The `file_type` determines which arguments are mandatory and which are optional. `filename` must always be provided. `file_type` is usually detected automatically; supply it only if the detected `file_type` is incorrect.

In case of `file_type = "meta"`, if any function arguments not listed as "mandatory" are provided and are already set in the JSON file, a warning is given, but the values from the JSON file are still overwritten by the user-supplied values. Normally, you would only set meta attributes that are not set in the JSON file.

In case of `file_type = "clm"`, function arguments not listed as "optional" are usually determined automatically from the header included in the clm file. Users may still provide any of these arguments to overwrite values read from the file header, e.g. when they know that the values in the file header are wrong. Note also that clm headers with versions < 4 do not contain all header attributes; missing attributes are filled with default values that may not be correct for all files.

In case of `file_type = "raw"`, files do not contain any information about their structure. Users should provide all arguments not listed as "optional". Otherwise, default values valid for LPJmL standard outputs are used for arguments not supplied by the user. For example, the default `firstyear` is 1901, and the default for `nyear`, `nbands`, `nstep`, and `timestep` is 1.
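
The following base R sketch illustrates why a headerless file needs its structure supplied by the user. It does not use lpjmlkit. It writes a small hypothetical binary file of 2-byte integers and reads it back, which only works because the number of values, the data type, and the endianness are known. The dimension sizes, stored values, and `scalar` are assumptions chosen for illustration, and the sketch makes no statement about how LPJmL orders items within a file.

```r
# Hypothetical structure of a small headerless file.
ncell <- 3L
nbands <- 2L
nyear <- 4L
nstep <- 1L
scalar <- 0.1

# Bytes per value for the data types listed under `datatype`.
datatype_size <- c(byte = 1L, short = 2L, int = 4L, float = 4L, double = 8L)

# Write integer values as "short" (2 bytes each) without any header.
raw_file <- tempfile(fileext = ".bin")
stored <- seq_len(ncell * nbands * nyear * nstep) * 10L
writeBin(stored, raw_file, size = datatype_size[["short"]], endian = "little")

# Without a header, the file size equals the number of values times the
# bytes per value. This is a quick plausibility check for the attributes.
expected_bytes <- ncell * nbands * nyear * nstep * datatype_size[["short"]]
stopifnot(file.size(raw_file) == expected_bytes)

# Reading requires the same information again; the conversion factor is
# applied afterwards to derive the final values.
values <- readBin(
  raw_file,
  what = "integer",
  n = length(stored),
  size = datatype_size[["short"]],
  signed = TRUE,
  endian = "little"
) * scalar
```

If the file size does not match the product of the supplied attributes, at least one of them is likely wrong.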

`subset` can be a list containing one or several named elements. Allowed names are `"band"`, `"cell"`, and `"year"`.

`"year"` can be used to return data for a subset of one or several years included in the file. Integer indices can be between 1 and `nyear`. To subset by actual calendar years (starting at `firstyear`), a character vector has to be supplied.
`"band"` can be used to return data for a subset of one or several bands included in the file. These can be specified either as integer indices or as a character vector if bands are named.
`"cell"` can be used to return data for a subset of cells. Note that integer indices start counting at 1, whereas character indices start counting at the value of `firstcell` (usually 0).
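
The difference between integer and character indices can be sketched in base R as follows. This is an illustration of the indexing rules described above, not lpjmlkit's internal implementation. The attribute values are assumptions, and the represented years are assumed to follow from `firstyear`, `timestep`, and `nyear` as described in Arguments.

```r
# Hypothetical file attributes.
firstyear <- 1901L
nyear <- 100L
timestep <- 1L
firstcell <- 0L
ncell <- 5L

# Calendar years and cell indices represented in the file.
years <- seq(firstyear, by = timestep, length.out = nyear)
cells <- seq(firstcell, length.out = ncell)

# Integer indices are positions counted from 1.
years_by_position <- years[c(10, 20)]
cells_by_position <- cells[c(1, 2)]

# Character indices are matched against the labels themselves.
year_positions <- match(c("1910", "1920"), as.character(years))
cell_positions <- match(c("0", "1"), as.character(cells))
```

With these assumptions, positions 10 and 20 and the labels `"1910"` and `"1920"` select the same two years, and positions 1 and 2 and the labels `"0"` and `"1"` select the same two cells.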

Value

An LPJmLData object.

Examples

## Not run: 
# First case: meta file. Reads meta information from "my_file.json" and
# data from binary file linked in "my_file.json". Normally does not require
# any additional arguments.
my_data <- read_io("my_file.json")

# Suppose that file data has two bands named "wheat" and "rice". `band_names`
# are included in the JSON meta file. Select only the "wheat" band during
# reading and discard the "rice" band. Also, read only data for years
# 1910-1920.
my_data_wheat <- read_io(
  "my_file.json",
  subset = list(band = "wheat", year = as.character(seq(1910, 1920)))
)

# Read data from clm file. This includes a header describing the file
# structure.
my_data_clm <- read_io("my_file.clm")

# Suppose that "my_file.clm" has two bands containing data for "wheat" and
# "rice". Assign names to them manually since the header does not include a
# `band_names` attribute.
my_data_clm <- read_io("my_file.clm", band_names = c("wheat", "rice"))

# Once `band_names` are set, subsetting by name is also possible for
# `file_type = "clm"`.
my_data_wheat <- read_io(
  "my_file.clm",
  band_names = c("wheat", "rice"),
  subset = list(band = "wheat", year = as.character(seq(1910, 1920)))
)

# Read data from raw binary file. All information about file structure needs
# to be supplied. Use default values except for nyear (1 by default), and
# nbands (also 1 by default).
my_data <- read_io("my_file.bin", nyear = 100, nbands = 2)

# Supply band_names to be able to subset by name
my_data_wheat <- read_io(
  "my_file.bin",
  band_names = c("wheat", "rice"), # length needs to correspond to `nbands`
  subset = list(band = "wheat", year = as.character(seq(1910, 1920))),
  nyear = 100,
  nbands = 2
)

## End(Not run)

lpjmlkit documentation built on March 31, 2023, 9:35 p.m.