estimate.quantiles: Estimate quantiles of a climate time series


View source: R/wrapper_get_quantiles.R

Description

This is a wrapper for get.quantiles that adds file management, batch processing, bootstrapping, and saving of output (provided the directory and filename/structure defaults are followed) for the estimation of quantiles across model time series, potentially with multiple runs of data over the same time frame.

Usage

estimate.quantiles(defaults, log = T, max.runtime = 4 * 60 * 60,
  assumed.avg.processing.time = 160, process.inputs = list())
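
A minimal call might look like the following sketch. The directory paths are hypothetical placeholders, and the `set.defaults` arguments shown are illustrative; see set.defaults for its actual interface.

```r
# Minimal sketch (hypothetical paths); set.defaults and estimate.quantiles
# are from this package. All arguments beyond 'defaults' are optional.
library(quantproj)

defaults <- set.defaults(mod.data.dir = "data/mod/",   # hypothetical path
                         aux.data.dir = "data/aux/")   # hypothetical path

# Run with logging on and a 4-hour batch time limit (the defaults)
estimate.quantiles(defaults, log = TRUE, max.runtime = 4 * 60 * 60)
```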

Arguments

defaults

the output from set.defaults, used to set input parameters (see Section "On get.quantiles parameters," below)

log

whether to log the output using sink() in the directory [defaults$aux.data.dir]/run_logs/. By default, true. Log filenames are (using run start time): estimate_quantiles_run_YYYY-MM-DD_HH-MM-SS.txt.

max.runtime

if batch processing on a server with a maximum allocated runtime, you can set it here, and the run will stop when the next block of time might take longer to run than the remaining allocated time. This prevents empty, un-processed temporary output files from blocking the complete processing of all data 'blocks'. Explicitly, the next block is not processed and the function run is interrupted if start.time + max.runtime - Sys.time() < (2 * 160)*[num pixels in block] seconds, assuming it takes roughly 160 seconds per pixel to run this code (for 40 runs, 121 years of data, on the server this code was written on, etc.); this time assumption can be changed with the assumed.avg.processing.time option, detailed below. Set to 0 if you don't want this feature interfering.

assumed.avg.processing.time

by default 160 (seconds), the estimated processing time per pixel. Used in interrupting batch run if time is running out.
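
The stopping rule described above can be sketched as follows; the function and variable names here are illustrative, not the package's internal ones.

```r
# Illustrative sketch of the early-stopping check described above;
# names are not the package's internal variable names.
should_stop_before_block <- function(start.time, max.runtime,
                                     n.pixels.in.block,
                                     assumed.avg.processing.time = 160) {
  # Seconds remaining in the allocated batch window
  time.left <- as.numeric(start.time + max.runtime - Sys.time(),
                          units = "secs")
  # Stop if the next block might need more than the remaining time,
  # using a 2x safety margin on the per-pixel estimate
  time.left < 2 * assumed.avg.processing.time * n.pixels.in.block
}
```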

process.inputs

add a custom process.inputs list. By default, the code attempts to load [defaults$mod.data.dir]/process_inputs.RData, or tries to generate its own using get.process.chunks. If this is not desired (i.e. you want to process just a subset of process chunks), put a subsetted output of get.process.chunks here, or make your own - just make sure that it's a list of process chunks, with every element containing at least the fields [global_loc] (used to load the correct [params] file), [lat] (just one lat, by lat band), [lon] (all the lon values desired), and [fn] (the raw data filename from which the params were calculated).
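
If only a subset of chunks should be processed, the sketch below shows one way to filter the output of get.process.chunks; the latitude threshold is purely illustrative, and get.process.chunks is assumed here to take the defaults object and return a list of chunks with the fields listed above.

```r
# Hypothetical subsetting of process chunks (illustrative filter);
# each chunk is assumed to carry at least global_loc, lat, lon, and fn.
chunks <- get.process.chunks(defaults)

# Keep only chunks in the tropics, e.g. |lat| <= 23.5 (illustrative)
tropical.chunks <- Filter(function(ch) abs(ch$lat) <= 23.5, chunks)

estimate.quantiles(defaults, process.inputs = tropical.chunks)
```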

Details

estimate.quantiles runs get.quantiles on a pixel-by-pixel basis. These pixels are loaded and processed on a latitude-by-region/subset basis - in other words, a 'block' of pixels is every pixel in one subset/region (determined by a separate .nc file in the mod.data.dir set by the set.defaults function) with the same latitude. This 'block' of pixels is loaded, and each pixel is sent through get.quantiles one at a time; the resultant coefficients are saved in files 'block' by 'block'. The 'blocks' are identified through the saved output of get.process.chunks.

Value

Nothing. Output is saved instead.

Loading basis functions

This function assumes that the needed basis functions have already been created using get.predictors and attempts to load them (this is the most computationally efficient way of dealing with them). Otherwise, it regenerates them.

On get.quantiles parameters

Inputs to get.quantiles are governed through the defaults object, generated by the set.defaults function. These include which quantiles to estimate (q_norm, q_bulk, q_tail), how many degrees of freedom to use in the predictor variables (df.x, df.t, df.xt), what years to process (year.range), and whether to add the volcanic forcing data (get.volc). See get.quantiles and set.defaults for details. Additionally, bootstrap/uncertainty quantification parameters (bootstrapping, nboots, and block.size) are set; these are used in this wrapper to govern bootstrapping.
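
As a hedged illustration, these fields might be set as follows when building the defaults object; the argument names follow the parameters listed above, but all values are illustrative, and set.defaults should be consulted for the authoritative interface.

```r
# Illustrative set.defaults call covering the get.quantiles parameters
# listed above; all values here are placeholders, not recommendations.
defaults <- set.defaults(q_norm = c(0.25, 0.5, 0.75),       # illustrative
                         q_bulk = seq(0.05, 0.95, 0.05),    # illustrative
                         q_tail = c(0.01, 0.99),            # illustrative
                         df.x = 8, df.t = 4, df.xt = 3,     # illustrative dfs
                         year.range = c(1980, 2100),
                         get.volc = TRUE,
                         bootstrapping = TRUE,
                         nboots = 100, block.size = 5)
```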

On processing efficiency

estimate.quantiles does not explicitly support parallel processing within the function (various mclapply instances were tried without promising results). However, the creation of temporary output files at the start of processing and built-in testing for the existence of the output files means that this function can be called from multiple instances of R (for example, when running in batch) without overwriting work done by other instances.

This is the most computationally-intensive step in the package.

WARNING WHEN BOOTSTRAPPING

The loaded model data is copied nboots times within this code if defaults$bootstrapping=T. Make sure to reduce the size of processing chunks when running with bootstrapping turned on to reduce the risk of memory issues (some future version of this code may have a more sophisticated treatment of this issue).
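
A back-of-envelope check can help gauge whether a chunk size is safe before launching a bootstrapped run; all sizes below are illustrative placeholders, not values from the package.

```r
# Rough memory estimate for bootstrapping (illustrative sizes only):
# the loaded model data is copied nboots times when bootstrapping.
n.runs <- 40       # model runs over the same time frame (illustrative)
n.years <- 121     # years of data (illustrative)
n.days <- 365      # time steps per year (illustrative)
n.lon <- 100       # pixels in one lat-band block (illustrative)
nboots <- 100      # number of bootstrap copies (illustrative)
bytes.per.double <- 8

block.bytes <- n.runs * n.years * n.days * n.lon * bytes.per.double
cat("approx. GB for one block with bootstrapping:",
    nboots * block.bytes / 1e9, "\n")
```

If the resulting figure approaches the machine's available memory, shrink the processing chunks (e.g. via a subsetted process.inputs) before enabling bootstrapping.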


ks905383/quantproj documentation built on Nov. 1, 2020, 9:12 p.m.