return_curve_diag: Evaluates the goodness of fit of the return curve estimates


Description

The procedure calculates the empirical probability of observing data within the survival regions defined by a subset of points on the return curve. If the curve is a good fit, the empirical probabilities should closely match the probability associated with the return level curve. The procedure, which was introduced in Murphy-Barltrop et al. (2023), uses bootstrap resampling of the original data set to obtain confidence intervals for the empirical estimates.
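
As a rough illustration of the idea (not the package's internal code), the sketch below computes the empirical joint survival probability at a single point and a percentile bootstrap interval from simple resampling with replacement. The simulated data, the point (x0, y0) and the helper name emp_surv_prob are hypothetical and exist only for this example.

```r
# Hypothetical helper: proportion of observations falling in the survival
# region {X > x0, Y > y0}; rows with missing values are dropped first.
emp_surv_prob <- function(x, y, x0, y0) {
  ok <- !is.na(x) & !is.na(y)
  mean(x[ok] > x0 & y[ok] > y0)
}

set.seed(1)
n <- 5000
x <- rnorm(n)              # simulated stand-in for the first variable
y <- 0.5 * x + rnorm(n)    # simulated stand-in for the second variable
x0 <- 1.5                  # hypothetical point on a curve
y0 <- 1.0

p_hat <- emp_surv_prob(x, y, x0, y0)

# Simple bootstrap: resample rows with replacement and recompute the estimate
n_boot <- 200
alpha <- 0.1
boot_p <- replicate(n_boot, {
  idx <- sample.int(n, n, replace = TRUE)
  emp_surv_prob(x[idx], y[idx], x0, y0)
})

# Lower bound, median and upper bound of the 100(1-alpha)% interval
ci <- quantile(boot_p, c(alpha / 2, 0.5, 1 - alpha / 2), names = FALSE)
```

If the point lay on a well-fitted curve, the interval in ci would be expected to contain the probability associated with that curve.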

Usage

return_curve_diag(
  data,
  q,
  rp,
  mu,
  n_sim,
  n_grad,
  n_boot,
  boot_method,
  boot_replace,
  block_length,
  boot_prop,
  decl_method_x,
  decl_method_y,
  window_length_x,
  window_length_y,
  u_x = NA,
  u_y = NA,
  sep_crit_x = NA,
  sep_crit_y = NA,
  boot_method_all = "block",
  boot_replace_all = NA,
  block_length_all = 14,
  boot_prop_all = 0.8,
  alpha = 0.1,
  x_lab = NA,
  y_lab = NA,
  x_lim_min = min(data_df[, 2], na.rm = T),
  x_lim_max = max(data_df[, 2], na.rm = T) + 0.3 * diff(range(data[, 2], na.rm = T)),
  y_lim_min = min(data[, 3], na.rm = T),
  y_lim_max = max(data[, 3], na.rm = T) + 0.3 * diff(range(data[, 2], na.rm = T))
)

Arguments

data

Data frame of raw data, detrended if necessary. The first column should be of class Date.

q

Numeric vector of length one specifying quantile level for fitting GPDs and the HT04 and WT13 models.

rp

Numeric vector of length one specifying return period of interest.

mu

Numeric vector of length one specifying the (average) occurrence frequency of events in data. Default is 365.25 (daily data).

n_sim

Numeric vector of length one specifying the number of simulations for the HT04 model. Default is 50.

n_grad

Numeric vector of length one specifying the number of rays along which to compute points on the curve. Default is 50.

n_boot

Numeric vector of length one specifying number of bootstrap samples. Default is 100.

boot_method

Character vector of length one specifying the bootstrap method. Options are "basic" (default), "block" or "monthly".

boot_replace

Character vector of length one specifying whether simple bootstrapping is carried out with "T" or without "F" replacement. Only required if boot_method = "basic". Default is NA.

block_length

Numeric vector of length one specifying block length. Only required if boot_method = "block". Default is NA.

boot_prop

Numeric vector of length one specifying the minimum proportion of non-missing values of at least one of the variables for a month to be included in the bootstrap. Only required if boot_method = "monthly". Default is 0.8.

decl_method_x

Character vector of length one specifying the declustering method to apply to the first variable. Options are the storm window approach "window" (default) and the runs method "runs".

decl_method_y

Character vector of length one specifying the declustering method to apply to the second variable. Options are the storm window approach "window" (default) and the runs method "runs".

window_length_x

Numeric vector of length one specifying the storm window length to apply during the declustering of the first variable if decl_method_x = "window".

window_length_y

Numeric vector of length one specifying the storm window length to apply during the declustering of the second variable if decl_method_y = "window".

u_x

Numeric vector of length one specifying the threshold to adopt in the declustering of the first variable if decl_method_x = "runs". Default is NA.

u_y

Numeric vector of length one specifying the threshold to adopt in the declustering of the second variable if decl_method_y = "runs". Default is NA.

sep_crit_x

Numeric vector of length one specifying the separation criterion to apply during the declustering of the first variable if decl_method_x = "runs". Default is NA.

sep_crit_y

Numeric vector of length one specifying the separation criterion to apply during the declustering of the second variable if decl_method_y = "runs". Default is NA.

boot_method_all

Character vector of length one specifying the bootstrapping procedure to use when estimating the distribution of empirical (survival) probabilities from the original dataset (without any declustering). Options are "basic" and "block" (default).

boot_replace_all

Character vector of length one specifying whether the bootstrapping of the original dataset (without any declustering), used to estimate the distribution of empirical (survival) probabilities, is carried out with "T" or without "F" replacement. Only required if boot_method_all = "basic". Default is NA.

block_length_all

Numeric vector of length one specifying block length. Only required if boot_method_all = "block". Default is 14.

alpha

Numeric vector of length one specifying the level alpha of the 100(1-alpha)% confidence interval. Default is 0.1.

x_lab

Character vector specifying the x-axis label.

y_lab

Character vector specifying the y-axis label.

x_lim_min

Numeric vector of length one specifying x-axis minimum. Default is computed from data; see Usage.

x_lim_max

Numeric vector of length one specifying x-axis maximum. Default is computed from data; see Usage.

y_lim_min

Numeric vector of length one specifying y-axis minimum. Default is computed from data; see Usage.

y_lim_max

Numeric vector of length one specifying y-axis maximum. Default is computed from data; see Usage.
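
For readers unfamiliar with the "block" option of boot_method and boot_method_all: a block bootstrap generally resamples runs of consecutive observations so that short-range temporal dependence is retained within each block. The following is a generic moving-block sketch under that assumption; the helper name block_boot_index is hypothetical and the package's own resampling scheme may differ.

```r
# Hypothetical helper: moving-block bootstrap of row indices.
# Assumes block_length <= n.
block_boot_index <- function(n, block_length) {
  n_blocks <- ceiling(n / block_length)
  # Random block start positions; each block is a run of consecutive rows
  starts <- sample.int(n - block_length + 1, n_blocks, replace = TRUE)
  idx <- unlist(lapply(starts, function(s) s:(s + block_length - 1)))
  idx[seq_len(n)]  # trim the concatenated blocks to the original length
}

set.seed(1)
idx <- block_boot_index(n = 100, block_length = 14)
# A bootstrap sample would then be obtained as data[idx, ]
```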

Value

List comprising the angles "ang_ind" associated with the points on the curve for which the empirical probability estimates were calculated.

For the HT04 model: Median "med_x_ht04", lower "lb_x_ht04" and upper "ub_x_ht04" bounds associated with the probabilities calculated using the sample conditioned on the first variable. Median "med_y_ht04", lower "lb_y_ht04" and upper "ub_y_ht04" bounds associated with the probabilities calculated using the sample conditioned on the second variable. Median "med_ht04", lower "lb_ht04" and upper "ub_ht04" bounds associated with the original dataset (without any declustering).

For the WT13 model: Median "med_x_wt13", lower "lb_x_wt13" and upper "ub_x_wt13" bounds associated with the probabilities calculated using the sample conditioned on the first variable. Median "med_y_wt13", lower "lb_y_wt13" and upper "ub_y_wt13" bounds associated with the probabilities calculated using the sample conditioned on the second variable. Median "med_wt13", lower "lb_wt13" and upper "ub_wt13" bounds associated with the original dataset (without any declustering).
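
One way the returned bounds might be used is to check, for each angle, whether the interval contains the probability associated with the curve. The sketch below works on a mock list that reuses the element names listed above; all values are made up, and the target probability 1 / (mu * rp) is an assumption about how the return period maps to a probability, not something stated on this page.

```r
# Mock output reusing the documented element names; the values are invented
res <- list(
  ang_ind  = c(10, 25, 40),
  med_ht04 = c(0.0030, 0.0026, 0.0041),
  lb_ht04  = c(0.0019, 0.0015, 0.0030),
  ub_ht04  = c(0.0042, 0.0038, 0.0055)
)

# Assumed target probability for a return period of rp years with
# mu events per year (assumption, see the text above)
mu <- 365.25
rp <- 1
p_target <- 1 / (mu * rp)

# TRUE where the bootstrap interval contains the target probability
covered <- p_target >= res$lb_ht04 & p_target <= res$ub_ht04
summary_df <- data.frame(ang_ind = res$ang_ind, covered = covered)
```

Angles where covered is FALSE would indicate regions of the curve where the fit deserves a closer look.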

Details

The HT04 model is fit to two conditional samples. One sample comprises the declustered time series of the first variable paired with concurrent values of the other variable. The second sample is obtained in the same way but with the variables reversed. The empirical probabilities are calculated using these two conditional samples and the original dataset (without any declustering). The return period should be chosen so that there is sufficient data for estimating the empirical probabilities while the curve remains sufficiently 'extreme'. For example, consider assessing the fit using the 1-year return period curve rather than the 100-year return period curve.
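
The construction of a conditional sample can be sketched as follows. This is a simplified runs-style declustering written for illustration only: the helper name runs_decluster_index is hypothetical, and treating the separation criterion as a gap in time steps between exceedances is an assumption. The package's declustering functions may behave differently.

```r
# Hypothetical helper: indices of cluster maxima under a simple runs rule.
# Exceedances of u separated by more than sep_crit time steps are assigned
# to different clusters (an assumption made for this sketch).
runs_decluster_index <- function(x, u, sep_crit) {
  exc <- which(x > u)  # positions of exceedances; which() drops NAs
  if (length(exc) == 0) return(integer(0))
  cluster <- cumsum(c(1, diff(exc) > sep_crit))
  # Position of the largest value within each cluster
  as.vector(tapply(exc, cluster, function(i) i[which.max(x[i])]))
}

x <- c(1, 5, 6, 2, 1, 1, 7, 1, 1, 1, 1, 8, 9, 1)  # toy first variable
y <- seq_along(x) * 10                            # toy second variable

idx <- runs_decluster_index(x, u = 4, sep_crit = 3)

# Sample conditioned on x: declustered x paired with concurrent y
cond_x <- data.frame(x = x[idx], y = y[idx])
```

Swapping the roles of x and y in the last two steps would give the sample conditioned on the second variable.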

Examples

#Data starts on first day of 1948
head(S22.Detrend.df)

#Dataframe ends on 2019-02-03
tail(S22.Detrend.df)

#Adding dates to complete final month of combined records
final.month = data.frame(seq(as.Date("2019-02-04"),as.Date("2019-02-28"),by="day"),NA,NA,NA)
colnames(final.month) = c("Date","Rainfall","OsWL","Groundwater")
S22.Detrend.df.extended = rbind(S22.Detrend.df,final.month)
#Evaluate the goodness of fit of the return curve estimates
return_curve_diag(data=S22.Detrend.df.extended[,1:3],
                  q=0.985,rp=1,mu=365.25,n_sim=100,
                  n_grad=50,n_boot=100,boot_method="monthly",
                  boot_replace=NA, block_length=NA, boot_prop=0.8,
                  decl_method_x="runs", decl_method_y="runs",
                  window_length_x=NA,window_length_y=NA,
                  u_x=0.95, u_y=0.95,
                  sep_crit_x=36, sep_crit_y=36,
                  alpha=0.1, x_lab=NA, y_lab=NA,
                  boot_method_all="block", boot_replace_all=NA,
                  block_length_all=14)

rjaneUCF/MultiHazard documentation built on March 29, 2025, 3:22 p.m.