tdm_uncertain: Uncertainty and sensitivity analysis
In TREXr: Tree Sap Flow Extractor


Quantifies the uncertainty induced in SFD and K time series by the variability in input parameters applied during TDM data processing. In addition, it applies a global sensitivity analysis to quantify the impact of each individual parameter on three relevant outputs derived from SFD and K: i) the mean daily sum of water use, ii) the variability of maximum daily SFD or K values, and iii) the duration of daily sap flow. The function returns the uncertainty and sensitivity indices, as well as time series of SFD and K with the mean, standard deviation (sd) and confidence interval (CI) due to parameter uncertainty. Users should ensure that no gaps are present within the input data and environmental time series.

tdm_uncertain(
  input,
  vpd.input,
  sr.input,
  method = "pd",
  n = 2000,
  zero.end = 8 * 60,
  range.end = 16,
  zero.start = 1 * 60,
  range.start = 16,
  probe.length = 20,
  sw.cor = 32.28,
  sw.sd = 16,
  log.a_mu = 4.085,
  log.a_sd = 0.628,
  b_mu = 1.275,
  b_sd = 0.262,
  max.days_min = 1,
  max.days_max = 7,
  ed.window_min = 8,
  ed.window_max = 16,
  criteria.vpd_min = 0.05,
  criteria.vpd_max = 0.5,
  criteria.sr_mean = 30,
  criteria.sr_range = 30,
  criteria.cv_min = 0.5,
  criteria.cv_max = 1,
  min.sfd = 0.5,
  min.k = 0,
  make.plot = TRUE,
  df = FALSE,
  ncores
)

`input`	An `is.trex`-compliant object (`zoo` object
`vpd.input`	An `is.trex`-compliant object (`zoo` object, `data.frame`) containing a timestamp and a vapour pressure deficit (VPD; in kPa) column with the same temporal extent and time steps as the `input` object. This input is required when using the environmental dependent (`"ed"`) method.
`sr.input`	An `is.trex`-compliant object (`zoo` object, `data.frame`) containing a timestamp and a solar radiation (sr; e.g., global radiation or PAR) column with the same temporal extent and time steps as the `input` object. This input is required when using the environmental dependent (`"ed"`) method.
`method`	Character, specifies the Δ Tmax method on which the sensitivity and uncertainty analyses are to be performed (see `tdm_dt.max`). Only one method can be selected: the pre-dawn (`"pd"`), moving window (`"mw"`), double regression (`"dr"`) or environmental dependent (`"ed"`) method (default = `"pd"`).
`n`	Numeric, specifies the number of times the bootstrap resampling procedure is repeated (default = 2000). Keep in mind that high values increase processing time.
`zero.end`	Numeric, defines the end of the predawn period. Values should be in minutes (e.g., predawn conditions until 8:00 = 8 * 60; default = 8 * 60).
`range.end`	Numeric, defines the number of time steps for `zero.end` (the minimum time step of the input) for which an integer sampling range will be defined (default = 16, assuming a 15-min resolution or a 2 hour range around `zero.end`).
`zero.start`	Numeric, defines the start of the predawn period. Values should be in minutes (e.g., predawn conditions from 1:00 = 1 * 60; default = 1 * 60).
`range.start`	Numeric, defines the number of time steps for `zero.start` (the minimum time step of the input) for which an integer sampling range will be defined (default = 16, assuming a 15-min resolution or a 2 hour range around `zero.start`).
`probe.length`	Numeric, the length of the TDM probes in mm (see `tdm_hw.cor`; default = 20 mm).
`sw.cor`	Numeric, the sapwood thickness in mm (default = 32.28, as given in the usage above).
`sw.sd`	Numeric, the standard deviation used when sampling sapwood thickness from a normal distribution (default = 16 mm; defined with a European database on sapwood thickness measurements).
`log.a_mu`	Numeric, value providing the natural logarithm of the calibration parameter a (see `tdm_cal.sfd`; SFD = aK^b). This value can be obtained from `tdm_cal.sfd` (see `out.param`). Default conditions are determined by using all calibration data as described in `cal.data` (default = 4.085).
`log.a_sd`	Numeric, the standard deviation of the natural logarithm of the calibration parameter a (see `log.a_mu`) used within the calibration curve for calculating SFD (default = 0.628).
`b_mu`	Numeric, the value of the calibration parameter b (see `tdm_cal.sfd`; SFD = aK^b). This value can be obtained from `tdm_cal.sfd` (see `out.param`). Default conditions are determined by using all calibration data as described in `cal.data` (default = 1.275).
`b_sd`	Numeric, the standard deviation of the b parameter (see `b_mu`) used within the calibration curve for calculating SFD (default = 0.262).
`max.days_min`	Numeric, the minimum value for an integer sampling range of `max.days` (see `tdm_dt.max` for the `"mw"` and `"dr"` Δ Tmax methods). As the `"mw"` and `"dr"` methods apply a rolling maximum or mean, the provided value should be an odd number (see `tdm_dt.max`; default = 1; required for the `"mw"` and `"dr"` Δ Tmax methods).
`max.days_max`	Numeric, the maximum value for an integer sampling range of `max.days` (see `tdm_dt.max` for the `"mw"` and `"dr"` Δ Tmax methods). As the `"mw"` and `"dr"` methods apply a rolling maximum or mean, the provided value should be an odd number (see `tdm_dt.max`; default = 7; required for the `"mw"` and `"dr"` Δ Tmax methods).
`ed.window_min`	Numeric, the minimum number of time steps for the `ed.window` parameter (see `tdm_dt.max`; the minimum time step of the input) for which an integer sampling range will be defined (default = 8, assuming a 15-min resolution or a 2 hour range; required for the `"ed"` Δ Tmax method).
`ed.window_max`	Numeric, the maximum number of time steps for the `ed.window` sampling range (default = 16, assuming a 15-min resolution or a 4 hour range; required for the `"ed"` Δ Tmax method).
`criteria.vpd_min`	Numeric, value in kPa defining the minimum for the fixed sampling range to define the vapour pressure deficit (VPD) threshold to establish zero-flow conditions (default = 0.05 kPa; see `tdm_dt.max`; required for the `"ed"` Δ Tmax method).
`criteria.vpd_max`	Numeric, value in kPa defining the maximum for the fixed sampling range to define the VPD threshold to establish zero-flow conditions (default = 0.5 kPa; required for the `"ed"` Δ Tmax method).
`criteria.sr_mean`	Numeric value defining the mean `sr.input` value around which the fixed sampling range for the solar irradiance threshold should be established for defining zero-flow conditions (see `tdm_dt.max`; default = 30 W m-2; required for the `"ed"` Δ Tmax method).
`criteria.sr_range`	Numeric, the range (in %) around `criteria.sr_mean` for establishing the solar irradiance threshold (see `tdm_dt.max`; default = 30%; required for the `"ed"` Δ Tmax method).
`criteria.cv_min`	Numeric, value (in %) defining the minimum value for the fixed sampling range to determine the coefficient of variation (CV) threshold for establishing zero-flow conditions (default = 0.5%; see `tdm_dt.max`; required for the `"ed"` Δ Tmax method).
`criteria.cv_max`	Numeric, value (in %) defining the maximum value for the fixed sampling range to determine the coefficient of variation (CV) threshold for establishing zero-flow conditions (default = 1%; see `tdm_dt.max`; required for the `"ed"` Δ Tmax method).
`min.sfd`	Numeric, defines at which SFD (cm3 cm-2 h-1) zero-flow conditions are expected. This parameter is used to define the duration of daily sap flow based on SFD (default = 0.5 cm3 cm-2 h-1).
`min.k`	Numeric, defines at which K (dimensionless, -) zero-flow conditions are expected. This parameter is used to define the duration of daily sap flow based on K (default = 0).
`make.plot`	Logical; If `TRUE`, a plot is generated presenting the sensitivity and uncertainty analyses output (default = `TRUE`).
`df`	Logical; If `TRUE`, output is provided in a `data.frame` format with a timestamp and a value column. If `FALSE`, output is provided as a zoo vector object (default = `FALSE`).
`ncores`	Numeric, number of cores to use for parallel processing. If missing, defaults to the number of available cores minus 1, or to 1 on a single-core machine.
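The default described for `ncores` could be resolved along the following lines. This is a minimal sketch and not the package's internal code: `resolve_ncores` is a hypothetical helper, and the fallback for an undetectable core count is an assumption.

```r
# Hypothetical helper: pick a core count when `ncores` is not supplied.
resolve_ncores <- function(ncores) {
  if (missing(ncores) || is.null(ncores)) {
    available <- parallel::detectCores()      # may return NA on some systems
    if (is.na(available)) available <- 1L     # assumption: fall back to one core
    ncores <- max(1L, available - 1L)         # leave one core free, never below 1
  }
  as.integer(ncores)
}

resolve_ncores()    # uses the detected core count
resolve_ncores(2)   # an explicit value is passed through
```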

Uncertainty and sensitivity analysis can be performed on TDM Δ T (or Δ V) measurements. The function applies a Monte Carlo simulation approach (with the number of repetitions defined by n) to determine the variability in relevant output variables (defined as uncertainty) and quantifies the contribution of each parameter to this uncertainty (defined as sensitivity). To generate variability in the selected input parameters, Latin Hypercube Sampling is performed with a default or user-defined range of parameter values per Δ Tmax method (see tdm_dt.max()). The sampling algorithm draws from several types of sampling distribution: an integer sampling range (for zero.start, zero.end, max.days, and ed.window), a continuous sampling range (criteria for sr, vpd and cv), and a normal distribution (for sw.cor and calibration parameters a and b). Within this algorithm no within-day interpolations are made between the Δ Tmax points (see tdm_dt.max, interpolate = FALSE). This approach ensures near-random sampling across different types of sampling distributions, while avoiding the need to increase the number of replicates (which increases computation time).

Applying this approach requires one to i) select the output of interest, ii) identify the relevant input parameters, and iii) determine the parameter range and distribution. For a given time series, three output variables are considered relevant, each calculated as the mean over the entire time series: i) the mean daily sum of water use (or Sum, expressed in cm3 cm-2 d-1 for SFD and unitless for K), ii) the variability of maximum SFD or K values (or CV, expressed as the coefficient of variation in %, as this alters climate response correlations), and iii) the duration of daily sap flow based on SFD or K (or Duration, expressed in hours per day and dependent on a threshold, see min.sfd and min.k).
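The Latin Hypercube Sampling step can be illustrated with a small base-R sketch. It is not the package's implementation: `lhs_unit` is a hypothetical helper, only four parameters are drawn, and the mapping from uniform strata onto integer, continuous and normal distributions is an assumption that reuses the documented default ranges and moments.

```r
set.seed(42)  # reproducible illustration

# Latin Hypercube Sampling on the unit cube: each column is split into n
# equal-probability strata and every stratum is sampled exactly once.
lhs_unit <- function(n, k) {
  sapply(seq_len(k), function(j) (sample(n) - runif(n)) / n)
}

n <- 1000
u <- lhs_unit(n, 4)

# Map the stratified uniform draws onto the three distribution types.
draws <- data.frame(
  # integer range: ed.window between ed.window_min = 8 and ed.window_max = 16
  ed.window    = floor(qunif(u[, 1], min = 8, max = 16 + 1)),
  # continuous range: VPD threshold between 0.05 and 0.5 kPa
  criteria.vpd = qunif(u[, 2], min = 0.05, max = 0.5),
  # normal distributions: calibration parameters log(a) and b
  log.a        = qnorm(u[, 3], mean = 4.085, sd = 0.628),
  b            = qnorm(u[, 4], mean = 1.275, sd = 0.262)
)

# Propagate the sampled calibration parameters through SFD = a * K^b
# for one illustrative K value.
K   <- 0.3
sfd <- exp(draws$log.a) * K^draws$b
summary(sfd)
```

Because every stratum is visited once per parameter, the tails of each distribution are covered even with a moderate number of replicates, which is the motivation given above for preferring this scheme over plain random sampling.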
A minimum threshold to define zero-flow SFD or K is required for the duration calculation as small variations in night-time SFD or K are present. All data-processing steps (starting with "tdm_") are incorporated within the function, excluding tdm_damp() due to the need for detailed visual inspection and significantly longer computation time.
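The three output variables can be made concrete with a synthetic example. The sketch below uses base R only and an artificial three-day SFD series at 15-min resolution; the variable names and the sinusoidal flow pattern are illustrative assumptions, not the function's internal calculation.

```r
# Synthetic SFD series (cm3 cm-2 h-1) at 15-min resolution for three days.
step_min <- 15
per_day  <- 24 * 60 / step_min                       # 96 time steps per day
hour     <- rep(seq(0, 24 - step_min / 60, by = step_min / 60), times = 3)
day      <- rep(1:3, each = per_day)
peak     <- c(10, 12, 8)[day]                        # daily maximum differs per day
sfd      <- pmax(0, peak * sin(pi * (hour - 6) / 12))  # flow between 06:00 and 18:00

min.sfd <- 0.5                                       # zero-flow threshold (default)

# i) daily sum of water use: hourly rates multiplied by the step length in hours
daily_sum <- tapply(sfd * step_min / 60, day, sum)
# ii) variability of daily maxima, as coefficient of variation in %
daily_max <- tapply(sfd, day, max)
cv_max    <- sd(daily_max) / mean(daily_max) * 100
# iii) duration of daily sap flow: hours per day above the threshold
duration  <- tapply(sfd > min.sfd, day, sum) * step_min / 60

# Each output is summarised as the mean over the whole series.
c(Sum = mean(daily_sum), CV = cv_max, Duration = mean(duration))
```

Setting the threshold to zero in this example would count the numerically tiny positive values around sunset as flow, which illustrates why a minimum threshold is needed for the duration.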

For the sensitivity analysis, the total sensitivity indices are determined according to the strategy originally proposed by Sobol' (1993), considering the improvements applied within the sensitivity R package. The method proposed by Sobol' (1993) is a variance-based sensitivity analysis, where sensitivity indices (dimensionless, from 0 to 1) indicate the partial variance contributed by a given parameter relative to the total output variance (e.g., Pappas et al. 2013). This global sensitivity analysis facilitates the identification of key parameters for data-processing improvement and highlights methodological limitations. Users should keep in mind that parameter ranges are a critical component of any sensitivity analysis and should be carefully assessed and clearly reported for each case and analytical purpose. Moreover, it is advised to run this function on one growing season of input data to reduce processing time.
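To show what a variance-based total sensitivity index measures, the following base-R sketch applies the Jansen estimator to a toy model. Both the toy model and the choice of the Jansen estimator are assumptions made for illustration; the text above only states that the sensitivity package is used, not which estimator it applies.

```r
set.seed(1)

# Toy model standing in for the processing chain: Y = X1 + 2 * X2,
# with X1 and X2 independent and uniform on [0, 1].
model <- function(x) x[, 1] + 2 * x[, 2]

N <- 20000
k <- 2
A <- matrix(runif(N * k), ncol = k)   # first independent sample matrix
B <- matrix(runif(N * k), ncol = k)   # second independent sample matrix
yA <- model(A)

# Jansen estimator of the total-effect index: replace column i of A by
# column i of B and relate the mean squared output change to the variance.
total_index <- sapply(seq_len(k), function(i) {
  ABi <- A
  ABi[, i] <- B[, i]
  mean((yA - model(ABi))^2) / (2 * var(yA))
})
names(total_index) <- c("X1", "X2")
round(total_index, 2)
```

For this additive model the analytical values are 1/5 for X1 and 4/5 for X2, since Var(X1) = 1/12 and Var(2 X2) = 4/12, so the estimates should land close to 0.2 and 0.8.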

A named list of zoo or data.frame objects in the appropriate format for other functionalities. Items include:

output.data: data.frame containing uncertainty and sensitivity indices for SFD and K and the included parameters. This includes the mean uncertainty/sensitivity [,"mean"], standard deviation [,"sd"], upper [,"ci.max"] and lower [,"ci.min"] 95% confidence interval.
output.sfd: zoo object or data.frame with the SFD time series obtained from the bootstrap resampling. This includes the mean uncertainty/sensitivity [,"mean"], standard deviation [,"sd"], upper [,"CIup"] and lower [,"CIlo"] 95% confidence interval.
output.k: zoo object or data.frame with the K time series obtained from the bootstrap resampling. This includes the mean uncertainty/sensitivity [,"mean"], standard deviation [,"sd"], upper [,"ci.max"] and lower [,"ci.min"] 95% confidence interval.
param: a data.frame with an overview of the selected parameters used within the tdm_uncertain() function.
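The per-time-step summaries in the time-series outputs can be pictured as row-wise statistics over the simulated replicates. The sketch below is an assumption about the general idea rather than the package's code: the replicate matrix is synthetic, and the 95% interval is taken from empirical quantiles, which may differ from how the function derives its interval.

```r
set.seed(7)

n_rep  <- 200                                   # number of simulated replicates
n_time <- 96                                    # one day at 15-min resolution

# Synthetic replicates: one diurnal curve scaled by a random factor per run.
base  <- pmax(0, 10 * sin(pi * (seq_len(n_time) / 4 - 6) / 12))
scale <- exp(rnorm(n_rep, mean = 0, sd = 0.2))
sims  <- outer(base, scale)                     # n_time rows, n_rep columns

# Row-wise summary across replicates for every time step.
summary_ts <- data.frame(
  mean   = rowMeans(sims),
  sd     = apply(sims, 1, sd),
  ci.min = apply(sims, 1, quantile, probs = 0.025, names = FALSE),
  ci.max = apply(sims, 1, quantile, probs = 0.975, names = FALSE)
)
head(summary_ts[summary_ts$mean > 0, ])
```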

Sobol' I. 1993. Sensitivity analysis for nonlinear mathematical models. Math. Model Comput. Exp. 1:407-414.

Pappas C, Fatichi S, Leuzinger S, Wolf A, Burlando P. 2013. Sensitivity analysis of a process-based ecosystem model: Pinpointing parameterization and structural issues. Journal of Geophysical Research 118:505-528. doi: 10.1002/jgrg.20035

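# Perform an uncertainty and sensitivity analysis on "pd" data processing
library(TREXr)

# Load the example data and convert it to an is.trex-compliant object
raw   <- example.data(type = "doy")
input <- is.trex(raw, tz = "GMT", time.format = "%H:%M",
                 solar.time = TRUE, long.deg = 7.7459, ref.add = FALSE, df = FALSE)

# Set a regular 15-min time step and restrict the series to one growing season
input <- dt.steps(input, time.int = 15, start = "2013-04-01 00:00",
                  end = "2013-11-01 00:00", max.gap = 180, decimals = 15)

# Run the analysis with the pre-dawn method (high n increases processing time)
output <- tdm_uncertain(input, probe.length = 20, method = "pd",
                        n = 2000, sw.cor = 32.28, sw.sd = 16, log.a_mu = 3.792436,
                        log.a_sd = 0.4448937, b_mu = 1.177099, b_sd = 0.3083603,
                        make.plot = TRUE, ncores = 2)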