read_io: Read LPJmL input and output files

Description

Generic function to read LPJmL input and output files in different formats. Depending on the file format, file attributes are either detected automatically or have to be passed as individual arguments.

Usage

read_io(
  filename,
  subset = list(),
  band_names = NULL,
  dim_order = c("cell", "time", "band"),
  file_type = NULL,
  version = NULL,
  order = NULL,
  firstyear = NULL,
  nyear = NULL,
  firstcell = NULL,
  ncell = NULL,
  nbands = NULL,
  cellsize_lon = NULL,
  scalar = NULL,
  cellsize_lat = NULL,
  datatype = NULL,
  nstep = NULL,
  timestep = NULL,
  endian = NULL,
  variable = NULL,
  descr = NULL,
  unit = NULL,
  name = NULL,
  silent = FALSE
)

Arguments

filename

Mandatory character string giving the file name to read, including its path and extension.

subset

Optional list for subsetting the data read from the file along one or several of its dimensions. See Details for more information.

band_names

Optional vector of character strings providing the band names or NULL. Normally determined automatically from the meta file in case of output files using file_type = "meta".

dim_order

Order of dimensions in returned LPJmLData object. Must be a character vector containing all of the following in any order: c("cell", "time", "band"). Users may select the order most useful to their further data processing.
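
As a rough illustration of what a different dim_order means for downstream code, the following base R sketch permutes a small array from the default order to c("band", "cell", "time"). The array and its dimension names are made up for illustration, and aperm() is used here only as a stand-in; this is not necessarily how read_io() reorders dimensions internally.

```r
# Hypothetical 3-D array standing in for data in the default dimension order.
dims <- c(cell = 4, time = 3, band = 2)
x <- array(
  seq_len(prod(dims)),
  dim = unname(dims),
  dimnames = list(
    cell = as.character(0:3),
    time = c("1901", "1902", "1903"),
    band = c("wheat", "rice")
  )
)

# Reorder dimensions by matching the requested names against the existing ones.
new_order <- c("band", "cell", "time")
y <- aperm(x, perm = match(new_order, names(dimnames(x))))

dim(y)                       # extents now follow band, cell, time
x["2", "1902", "rice"]       # a value addressed in the original order ...
y["rice", "2", "1902"]       # ... is the same value in the permuted array
```

Only the layout changes; every value stays attached to the same cell, time, and band labels.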

file_type

Optional character string giving the file type. This is normally detected automatically but can be prescribed if automatic detection is incorrect. Valid options:

  • "raw", a binary file without header.

  • "clm", a binary file with header.

  • "meta", a meta information JSON file complementing a raw or clm file.

version

Integer indicating the clm file header version. Currently supported: one of c(1, 2, 3, 4).

order

Integer value or character string describing the order of data items in the file (default in input file: 1; in output file: 4). Valid values for LPJmL input/output files are "cellyear" / 1, "yearcell" / 2, "cellindex" / 3, and "cellseq" / 4, although only options 1 and 4 are supported by this function.

firstyear

Integer providing the first year of data in the file.

nyear

Integer providing the number of years of data included in the file. If timestep > 1, these years are not consecutive.

firstcell

Integer providing the cell index of the first data item. 0 by default.

ncell

Integer providing the number of data items per band.

nbands

Integer providing the number of bands per time step of data.

cellsize_lon

Numeric value providing the longitude cell size in degrees.

scalar

Numeric value providing a conversion factor that needs to be applied to raw data when reading it from file to derive final values.

cellsize_lat

Numeric value providing the latitude cell size in degrees.

datatype

Integer value or character string describing the LPJmL data type stored in the file. Supported options: "byte" / 0, "short" / 1, "int" / 2, "float" / 3, or "double" / 4.

nstep

Integer value defining the number of within-year time steps of the file. Valid values are 1 (yearly), 12 (monthly), 365 (daily). Defaults to 1 if not read from file ("clm" or "meta" file) or provided by the user.

timestep

Integer value providing the interval in years between years represented in the file data. Normally 1, but LPJmL also allows averaging annual outputs over several years. Defaults to 1 if not read from file ("clm" or "meta" file) or provided by the user.
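
One way to picture how firstyear, nyear, and timestep interact is to enumerate the years that would be represented in a file. The sketch below is a plain base R reading of the argument descriptions above, with hypothetical values; it is not taken from the read_io() source, and the actual year labels used by the package may differ.

```r
firstyear <- 1901L
nyear <- 5L
timestep <- 10L  # hypothetical: one record every 10 years

# nyear entries, spaced timestep years apart, starting at firstyear.
years <- seq(from = firstyear, by = timestep, length.out = nyear)
years
```

With timestep = 1 the same expression yields nyear consecutive years.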

endian

Endianness of the file (either "big" or "little"). By default, the endianness is taken from the file header or from the meta information. If neither provides it, the platform-specific endianness .Platform$endian is used.
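
To make the roles of datatype, scalar, and endian more concrete, the following base R sketch writes three 2-byte integers to a temporary file and reads them back with readBin(). The file, the values, and the conversion factor are all hypothetical, and the sketch assumes that "short" corresponds to a signed 2-byte integer; it illustrates the general idea only and does not reproduce read_io() internals.

```r
# Write a tiny hypothetical file of 2-byte integers, then read it back.
tmp <- tempfile(fileext = ".bin")
raw_values <- c(125L, 250L, -375L)
writeBin(raw_values, tmp, size = 2, endian = "little")

scalar <- 0.01  # hypothetical conversion factor

# Read with the matching size and endianness, then apply the scalar.
stored <- readBin(tmp, what = "integer", n = length(raw_values),
                  size = 2, signed = TRUE, endian = "little")
values <- stored * scalar

# Reading the same bytes with the wrong endianness gives different numbers
# without raising an error.
swapped <- readBin(tmp, what = "integer", n = length(raw_values),
                   size = 2, signed = TRUE, endian = "big")
unlink(tmp)
```

Because a wrong endian setting does not raise an error, it is worth checking a few values against expectations when prescribing endian manually.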

variable

Optional character string providing the name of the variable contained in the file. Included in some JSON meta files. Important: When reading a grid file with file_type = "raw", prescribe variable = "grid" to ensure that the data are recognized as a grid.

descr

Optional character string providing a more detailed description of the variable contained in the file. Included in some JSON meta files.

unit

Optional character string providing the unit of the data in the file. Included in some JSON meta files.

name

Optional character string specifying the header name. This is usually read from clm headers for file_type = "clm" but can be specified for the other file_type options.

silent

If set to TRUE, suppresses most warnings or messages. Use only after testing that read_io() works as expected with the files it is being used on. Default: FALSE.

Details

The file_type determines which arguments are mandatory or optional. filename must always be provided. file_type is usually detected automatically; supply it only if the detected file_type is incorrect.

In case of file_type = "meta", if any function arguments not listed as "mandatory" are provided although they are already set in the JSON file, a warning is given and the values from the JSON file are overwritten by the supplied arguments. Normally, you would only set meta attributes that are not set in the JSON file.

In case of file_type = "clm", function arguments not listed as "optional" are usually determined automatically from the file header included in the clm file. Users may still provide any of these arguments to overwrite values read from the file header, e.g. when they know that the values in the file header are wrong. Also, clm headers with versions < 4 do not contain all header attributes, with missing attributes filled with default values that may not be correct for all files.

In case of file_type = "raw", files do not contain any information about their structure. Users should provide all arguments not listed as "optional". Otherwise, default values valid for LPJmL standard outputs are used for arguments not supplied by the user. For example, the default firstyear is 1901, and the default for nyear, nbands, nstep, and timestep is 1.

subset can be a list containing one or several named elements. Allowed names are "band", "cell", and "year".

  • "year" can be used to return data for a subset of one or several years included in the file. Integer indices can be between 1 and nyear. To subset by actual calendar years (starting at firstyear), supply a character vector.

  • "band" can be used to return data for a subset of one or several bands included in the file. These can be specified either as integer indices or as a character vector if bands are named.

  • "cell" can be used to return data for a subset of cells. Note that integer indices start counting at 1, whereas character indices start counting at the value of firstcell (usually 0).
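
The difference between integer and character indices described above can be sketched in base R. The example assumes hypothetical values (firstyear = 1901, nyear = 100, firstcell = 0, ncell = 10, timestep = 1) and builds label vectors by hand; it mimics the indexing convention only and does not call read_io().

```r
firstyear <- 1901L
nyear <- 100L
firstcell <- 0L
ncell <- 10L

# Hypothetical labels along the year and cell dimensions.
year_names <- as.character(seq(firstyear, length.out = nyear))
cell_names <- as.character(seq(firstcell, length.out = ncell))

# Integer indices are positions; character indices are labels.
year_names[10]             # label of the 10th year in the file: "1910"
match("1910", year_names)  # position of calendar year 1910: 10
cell_names[1]              # label of the first cell: "0"
match("1", cell_names)     # cell labelled "1" sits at position 2
```

Under these assumptions, subset = list(cell = 1) and subset = list(cell = "1") would therefore select different cells.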

Value

An LPJmLData object.

Examples

## Not run: 
# First case: meta file. Reads meta information from "my_file.json" and
# data from binary file linked in "my_file.json". Normally does not require
# any additional arguments.
my_data <- read_io("my_file.json")

# Suppose that file data has two bands named "wheat" and "rice". `band_names`
# are included in the JSON meta file. Select only the "wheat" band during
# reading and discard the "rice" band. Also, read only data for years
# 1910-1920.
my_data_wheat <- read_io(
  "my_file.json",
  subset = list(band = "wheat", year = as.character(seq(1910, 1920)))
)

# Read data from clm file. This includes a header describing the file
# structure.
my_data_clm <- read_io("my_file.clm")

# Suppose that "my_file.clm" has two bands containing data for "wheat" and
# "rice". Assign names to them manually since the header does not include a
# `band_names` attribute.
my_data_clm <- read_io("my_file.clm", band_names = c("wheat", "rice"))

# Once `band_names` are set, subsetting by name is possible also for
# file_type = "clm"
my_data_wheat <- read_io(
  "my_file.clm",
  band_names = c("wheat", "rice"),
  subset = list(band = "wheat", year = as.character(seq(1910, 1920)))
)

# Read data from raw binary file. The file itself contains no information
# about its structure, so it has to be supplied as arguments. Use default
# values except for nyear and nbands (both 1 by default).
my_data <- read_io("my_file.bin", nyear = 100, nbands = 2)

# Supply band_names to be able to subset by name
my_data_wheat <- read_io(
  "my_file.bin",
  band_names = c("wheat", "rice"), # length needs to correspond to `nbands`
  subset = list(band = "wheat", year = as.character(seq(1910, 1920))),
  nyear = 100,
  nbands = 2
)

## End(Not run)

lpjmlkit documentation built on March 31, 2023, 9:35 p.m.