write_config: Write LPJmL config files (JSON)

View source: R/write_config.R

write_config R Documentation

Description

Writes the model configuration files "config_*.json" from a tibble (modern data.frame class) in a specific format (see Details and Examples). Each row in the tibble corresponds to a model run. Each generated "config_*.json" is based on a js file (e.g. "lpjml_*.js").

Usage

write_config(
  x,
  model_path,
  sim_path = NULL,
  output_list = c(),
  output_list_timestep = "annual",
  output_format = "raw",
  js_filename = "lpjml.js",
  parallel_cores = 4,
  debug = FALSE,
  params = NULL,
  output_path = NULL
)

Arguments

x

A tibble in a defined format (see details).

model_path

Character string providing the path to LPJmL (equal to LPJROOT environment variable).

sim_path

Character string defining the path where all simulation data are written. An output, a restart and a configuration folder are also created in sim_path to store the respective data. If NULL, model_path is used.

output_list

Character vector containing the "id" of outputvars. If defined, only these outputs are written. Otherwise, all outputs set in js_filename are written. Defaults to c(), which is equivalent to NULL.

output_list_timestep

Single character string or character vector defining what temporal resolution the defined outputs from output_list should have. Either provide a single character string for all outputs or a vector with the length of output_list defining each timestep individually. Choose between "annual", "monthly" or "daily".

output_format

Character string defining the format of the output. Defaults to "raw". Further options: "cdf" (NetCDF) or "clm" (file with header).

js_filename

Character string providing the name of the main js file to be parsed. Defaults to "lpjml.js".

parallel_cores

Integer defining the number of available CPU cores for parallelization. Defaults to 4.

debug

Logical. If TRUE, the inner parallelization is switched off to enable tracebacks and all types of error messages. Defaults to FALSE.

params

Argument is deprecated as of version 1.0; use x instead.

output_path

Argument is deprecated as of version 1.0; use sim_path instead.
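
The pairing of output_list and output_list_timestep described above can be checked before calling write_config(). The following is a minimal sketch in base R, not part of lpjmlkit. The output ids ("vegc", "soilc", "mnpp") are assumptions chosen for illustration and must be replaced by ids defined in the outputvars of your LPJmL version.

```r
# Assumed output ids, for illustration only.
output_list <- c("vegc", "soilc", "mnpp")
# One timestep per output; a single string would apply to all outputs.
output_list_timestep <- c("annual", "annual", "monthly")

valid_timesteps <- c("annual", "monthly", "daily")
stopifnot(
  all(output_list_timestep %in% valid_timesteps),
  length(output_list_timestep) %in% c(1L, length(output_list))
)

# Expand to one timestep per output id to see the resulting pairing.
timesteps <- setNames(
  rep_len(output_list_timestep, length(output_list)),
  output_list
)
print(timesteps)

# The checked vectors would then be passed on, e.g.:
# write_config(x = my_params, model_path = model_path,
#              output_list = output_list,
#              output_list_timestep = output_list_timestep)
```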

Details

Supply a tibble for x, in which each row represents a configuration (config) for an LPJmL simulation.
Here, "config" refers to the precompiled "lpjml.js" file (or the file named by the js_filename argument), which links to all other mandatory ".js" files. write_config() performs the precompilation internally.
write_config() uses the column names of x as keys for the config json, using the same syntax as for lists, e.g. "k_temp" from "param.js" can be accessed with "param$k_temp" or "param[["k_temp"]]" as the column name. (The former point-style syntax, "param.k_temp", is still valid but deprecated.)
For each run, and thus each row, a value has to be specified for every column of the tibble. To keep the original value for a run, insert NA.
Each run can be identified via the "sim_name", which is mandatory to specify.

my_params1 <- tibble(
  sim_name = c("scenario1", "scenario2"),
  random_seed = c(42, 404),
  `pftpar[[1]]$name` = c("first_tree", NA),
  `param$k_temp` = c(NA, 0.03),
  new_phenology = c(TRUE, FALSE)
)

my_params1
# A tibble: 2 x 5
#   sim_name  random_seed `pftpar[[1]]$name` `param$k_temp` new_phenology
#   <chr>           <dbl> <chr>                       <dbl> <lgl>
# 1 scenario1          42 first_tree                  NA    TRUE
# 2 scenario2         404 NA                           0.03 FALSE
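
To illustrate how list-style column names act as keys and how NA leaves a value untouched, here is a minimal sketch in base R. It is not the internal implementation of write_config(): the nested list `config` is a made-up stand-in for a parsed config, and `apply_row()` is a hypothetical helper.

```r
# Made-up stand-in for a parsed config; values are illustrative only.
config <- list(
  random_seed = 1,
  new_phenology = FALSE,
  param = list(k_temp = 0.02),
  pftpar = list(list(name = "tree_a"), list(name = "tree_b"))
)

# Hypothetical helper: apply one row of overrides to the config,
# skipping cells that are NA (the original value is kept).
apply_row <- function(config, row) {
  for (key in setdiff(names(row), "sim_name")) {
    value <- row[[key]]
    if (is.na(value)) next
    # Builds and evaluates an assignment such as
    # config$param$k_temp <- value
    eval(parse(text = paste0("config$", key, " <- value")))
  }
  config
}

# Second row of my_params1, written as a plain list.
row <- list(
  sim_name = "scenario2",
  random_seed = 404,
  `pftpar[[1]]$name` = NA,
  `param$k_temp` = 0.03,
  new_phenology = FALSE
)

updated <- apply_row(config, row)
str(updated)
```

Here `param$k_temp` and `random_seed` are replaced, while `pftpar[[1]]$name` keeps its original value because the cell is NA.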

Simulation sequences

To set up spin-up and transient runs, where transient runs are dependent on the spin-up(s), a parameter "dependency" has to be defined as a column in the tibble that links simulations with each other using the "sim_name".
Do not manually set "-DFROM_RESTART" when using "dependency". The same applies for LPJmL config settings "restart", "write_restart", "write_restart_filename", "restart_filename", which are set automatically by this function. This way multiple runs can be performed in succession and build a conceivably endless chain or tree.

# With dependent runs.
my_params3 <- tibble(
  sim_name = c("scen1_spinup", "scen1_transient"),
  random_seed = c(42, 404),
  dependency = c(NA, "scen1_spinup")
)
my_params3
# A tibble: 2 x 3
#   sim_name        random_seed dependency
#   <chr>                 <dbl> <chr>
# 1 scen1_spinup             42 NA
# 2 scen1_transient         404 scen1_spinup
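
The chain or tree formed by "dependency" can be sanity-checked before writing configs. The sketch below, in base R, derives a position in the chain for each run and rejects unknown or circular dependencies. `dependency_order()` is a hypothetical helper, and the assumption that the returned "order" column corresponds to this chain depth is inferred from the examples, not confirmed by this page. A plain data.frame is used; a tibble behaves the same way here.

```r
runs <- data.frame(
  sim_name = c("scen1_spinup", "scen1_transient", "scen1_future"),
  dependency = c(NA, "scen1_spinup", "scen1_transient"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: 1 for runs without dependency, 2 for runs that
# depend on such a run, and so on.
dependency_order <- function(runs) {
  deps <- runs$dependency
  unknown <- setdiff(deps[!is.na(deps)], runs$sim_name)
  if (length(unknown) > 0) {
    stop("Unknown dependency: ", paste(unknown, collapse = ", "))
  }
  vapply(seq_len(nrow(runs)), function(i) {
    depth <- 1L
    current <- deps[i]
    while (!is.na(current)) {
      depth <- depth + 1L
      # A chain longer than the number of runs can only be a cycle.
      if (depth > nrow(runs)) {
        stop("Circular dependency involving ", runs$sim_name[i])
      }
      current <- deps[match(current, runs$sim_name)]
    }
    depth
  }, integer(1))
}

runs$order <- dependency_order(runs)
print(runs)
```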

SLURM options

SLURM options can also be defined for each simulation (row) separately. For example, users may want to set a lower wall clock limit (wtime) for the transient run than for the spin-up run to get a higher priority in the SLURM queue. This can be achieved by supplying the option as a column of x.
Four options are available: sclass, ntask, wtime and blocking.
If specified in x, they override the corresponding function arguments in submit_lpjml().

my_params4 <- tibble(
  sim_name = c("scen1_spinup", "scen1_transient"),
  random_seed = c(42, 404),
  dependency = c(NA, "scen1_spinup"),
  wtime = c("8:00:00", "2:00:00")
)

my_params4
# A tibble: 2 x 4
#   sim_name        random_seed dependency   wtime
#   <chr>                 <dbl> <chr>        <chr>
# 1 scen1_spinup             42 NA           8:00:00
# 2 scen1_transient         404 scen1_spinup 2:00:00
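
The precedence rule (a per-run column overrides the function argument) can be pictured with the following sketch in base R. `resolve_option()` is a hypothetical helper, not lpjmlkit code. Falling back to the default for NA cells is an assumption made here by analogy with the NA convention for config values, and the default values used are placeholders.

```r
runs <- data.frame(
  sim_name = c("scen1_spinup", "scen1_transient"),
  wtime = c("8:00:00", NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: take the per-run value if the column exists and
# the cell is not NA; otherwise use the function-argument default.
resolve_option <- function(runs, option, default) {
  if (!option %in% names(runs)) {
    return(rep(default, nrow(runs)))
  }
  values <- runs[[option]]
  ifelse(is.na(values), default, values)
}

# Placeholder defaults, standing in for arguments of submit_lpjml().
wtime_per_run <- resolve_option(runs, "wtime", default = "1:00:00")
sclass_per_run <- resolve_option(runs, "sclass", default = "short")
print(wtime_per_run)
print(sclass_per_run)
```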

Use of macros

To set a macro (e.g. "MY_MACRO" or "CHECKPOINT"), provide it as a column of the tibble, named as you would write the flag in the shell: "-DMY_MACRO", "-DCHECKPOINT".
Wrap macro column names in backticks; otherwise R raises an error, because a name starting with "-" is not syntactically valid.

my_params2 <- tibble(
  sim_name = c("scen1_spinup", "scen1_transient"),
  random_seed = c(42, 404),
  `-DMY_MACRO` = c(TRUE, FALSE)
)

my_params2
# A tibble: 2 x 3
#   sim_name        random_seed `-DMY_MACRO`
#   <chr>                 <dbl> <lgl>
# 1 scen1_spinup             42 TRUE
# 2 scen1_transient         404 FALSE
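
As an illustration of how logical macro columns relate to shell-style flags, the sketch below collects, per run, the "-D" columns that are TRUE. It is written in base R and only shows the idea; how write_config() passes macros on internally is not described in this page. Note that data.frame() needs check.names = FALSE to keep the leading "-", whereas tibble() keeps such names as they are.

```r
runs <- data.frame(
  sim_name = c("scen1_spinup", "scen1_transient"),
  `-DMY_MACRO` = c(TRUE, FALSE),
  `-DCHECKPOINT` = c(TRUE, TRUE),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Columns whose names look like macro flags.
macro_cols <- grep("^-D", names(runs), value = TRUE)

# For each run, keep only the macros that are switched on.
flags <- vapply(seq_len(nrow(runs)), function(i) {
  is_on <- vapply(macro_cols, function(m) isTRUE(runs[[m]][i]), logical(1))
  paste(macro_cols[is_on], collapse = " ")
}, character(1))
names(flags) <- runs$sim_name
print(flags)
```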

In short

  • write_config() creates subdirectories within the sim_path directory

    • "./configurations" to store the config files.

    • "./output" to store the output within subdirectories for each sim_name.

    • "./restart" to store the restart files within subdirectories for each sim_name.

  • Column names serve as keys for accessing values in the config json, using list syntax (e.g. "pftpar[[1]]$name"). The older "." syntax (e.g. "pftpar.1.name") is still valid but deprecated.

  • The column "sim_name" is mandatory (used as an identifier).

  • The run parameter "dependency" is optional but enables interdependent consecutive runs using submit_lpjml().

  • SLURM options in x allow different values per run.

  • If NA is specified as a cell value, the original value is used.

  • R booleans/logical constants TRUE and FALSE are to be used for boolean parameters in the config json.

  • Value types need to be set correctly, e.g. no strings where numeric values are expected.
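
Some of the points above can be checked up front. The following base R sketch lists the subdirectories expected under sim_path and runs a few basic checks on the "sim_name" column. `check_runs()` is a hypothetical helper, not part of lpjmlkit, and the uniqueness requirement is an assumption that follows from "sim_name" being used as an identifier.

```r
sim_path <- "./my_runs"

# Subdirectories that write_config() creates within sim_path.
expected_dirs <- file.path(sim_path, c("configurations", "output", "restart"))

# Hypothetical helper: return a character vector of problems (empty if
# none were found).
check_runs <- function(runs) {
  if (!"sim_name" %in% names(runs)) {
    return("column 'sim_name' is missing")
  }
  problems <- character(0)
  if (!is.character(runs$sim_name)) {
    problems <- c(problems, "'sim_name' is not a character column")
  }
  if (anyNA(runs$sim_name)) {
    problems <- c(problems, "'sim_name' contains NA")
  }
  if (anyDuplicated(runs$sim_name) > 0) {
    problems <- c(problems, "'sim_name' values are not unique")
  }
  problems
}

good <- data.frame(sim_name = c("scen1", "scen2"), stringsAsFactors = FALSE)
bad <- data.frame(sim_name = c("scen1", "scen1"), stringsAsFactors = FALSE)
print(expected_dirs)
print(check_runs(good))
print(check_runs(bad))
```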

Value

A tibble with at least one column named "sim_name". The run parameters "order" and "dependency" are included if defined in x. A tibble in this format is required by submit_lpjml().

Examples

## Not run: 
library(tibble)

model_path <- "./LPJmL_internal"
sim_path <- "./my_runs"


# Basic usage
my_params <- tibble(
  sim_name = c("scen1", "scen2"),
  random_seed = c(12, 404),
  `pftpar[[1]]$name` = c("first_tree", NA),
  `param$k_temp` = c(NA, 0.03),
  new_phenology = c(TRUE, FALSE)
)

config_details <- write_config(
  x = my_params,
  model_path = model_path,
  sim_path = sim_path
)

config_details
# A tibble: 2 x 1
#   sim_name
#   <chr>
# 1 scen1
# 2 scen2

# Usage with dependency
my_params <- tibble(
  sim_name = c("scen1_spinup", "scen1_transient"),
  random_seed = c(42, 404),
  dependency = c(NA, "scen1_spinup")
)

config_details <- write_config(
  x = my_params,
  model_path = model_path,
  sim_path = sim_path
)

config_details
# A tibble: 2 x 3
#   sim_name        order dependency
#   <chr>           <dbl> <chr>
# 1 scen1_spinup        1 NA
# 2 scen1_transient     2 scen1_spinup


# Usage with dependency and SLURM option
my_params <- tibble(
  sim_name = c("scen1_spinup", "scen1_transient"),
  random_seed = c(42, 404),
  dependency = c(NA, "scen1_spinup"),
  wtime = c("8:00:00", "2:00:00")
)

config_details <- write_config(
  x = my_params,
  model_path = model_path,
  sim_path = sim_path
)

config_details
# A tibble: 2 x 4
#   sim_name        order dependency   wtime
#   <chr>           <dbl> <chr>        <chr>
# 1 scen1_spinup        1 NA           8:00:00
# 2 scen1_transient     2 scen1_spinup 2:00:00


## End(Not run)

lpjmlkit documentation built on March 31, 2023, 9:35 p.m.