View source: R/return_curve_diag.R
return_curve_diag | R Documentation |
The procedure calculates the empirical probability of observing data within the survival regions defined by a subset of points on the return curve. If the curve is a good fit, the empirical probabilities should closely match the probability associated with the return curve. The procedure, introduced in Murphy-Barltrop et al. (2023), uses bootstrap resampling of the original data set to obtain confidence intervals for the empirical estimates.
return_curve_diag(
data,
q,
rp,
mu,
n_sim,
n_grad,
n_boot,
boot_method,
boot_replace,
block_length,
boot_prop,
decl_method_x,
decl_method_y,
window_length_x,
window_length_y,
u_x = NA,
u_y = NA,
sep_crit_x = NA,
sep_crit_y = NA,
boot_method_all = "block",
boot_replace_all = NA,
block_length_all = 14,
boot_prop_all = 0.8,
alpha = 0.1,
x_lab = NA,
y_lab = NA,
x_lim_min = min(data_df[, 2], na.rm = T),
x_lim_max = max(data_df[, 2], na.rm = T) + 0.3 * diff(range(data[, 2], na.rm = T)),
y_lim_min = min(data[, 3], na.rm = T),
y_lim_max = max(data[, 3], na.rm = T) + 0.3 * diff(range(data[, 2], na.rm = T))
)
data |
Data frame of raw data, detrended if necessary. First column should be of class |
q |
Numeric vector of length one specifying quantile level for fitting GPDs and the HT04 and WT13 models. |
rp |
Numeric vector of length one specifying return period of interest. |
mu |
Numeric vector of length one specifying the (average) occurrence frequency of events in data. Default is 365.25 (daily data). |
n_sim |
Numeric vector of length one specifying the number of simulations for the HT04 model. Default is |
n_grad |
Numeric vector of length one specifying the number of rays along which to compute points on the curve. Default is |
n_boot |
Numeric vector of length one specifying number of bootstrap samples. Default is |
boot_method |
Character vector of length one specifying the bootstrap method. Options are |
boot_replace |
Character vector of length one specifying whether simple bootstrapping is carried out with |
block_length |
Numeric vector of length one specifying block length. Only required if |
boot_prop |
Numeric vector of length one specifying the minimum proportion of non-missing values of at least one of the variables for a month to be included in the bootstrap. Only required if |
decl_method_x |
Character vector of length one specifying the declustering method to apply to the first variable. Options are the storm window approach |
decl_method_y |
Character vector of length one specifying the declustering method to apply to the second variable. Options are the storm window approach |
window_length_x |
Numeric vector of length one specifying the storm window length to apply during the declustering of the first variable if |
window_length_y |
Numeric vector of length one specifying the storm window length to apply during the declustering of the second variable if |
u_x |
Numeric vector of length one specifying the threshold to adopt in the declustering of the first variable if |
u_y |
Numeric vector of length one specifying the threshold to adopt in the declustering of the second variable if |
sep_crit_x |
Numeric vector of length one specifying the separation criterion to apply during the declustering of the first variable if |
sep_crit_y |
Numeric vector of length one specifying the separation criterion to apply during the declustering of the second variable if |
boot_method_all |
Character vector of length one specifying the bootstrapping procedure to use when estimating the distribution of empirical (survival) probabilities from the original dataset (without any declustering). Options are |
boot_replace_all |
Character vector of length one specifying whether bootstrapping of original dataset (without any declustering) when estimating the distribution of empirical (survival) probabilities is carried out with |
block_length_all |
Numeric vector of length one specifying block length. Only required if |
alpha |
Numeric vector of length one specifying the |
x_lab |
Character vector specifying the x-axis label. |
y_lab |
Character vector specifying the y-axis label. |
x_lim_min |
Numeric vector of length one specifying x-axis minimum. Default is |
x_lim_max |
Numeric vector of length one specifying x-axis maximum. Default is |
y_lim_min |
Numeric vector of length one specifying y-axis minimum. Default is |
y_lim_max |
Numeric vector of length one specifying y-axis maximum. Default is |
List comprising the angles "ang_ind" associated with the points on the curve for which the empirical probability estimates were calculated.
For the HT04 model: Median "med_x_ht04", lower "lb_x_ht04" and upper "ub_x_ht04" bounds associated with the probabilities calculated using the sample conditioned on the first variable.
Median "med_y_ht04", lower "lb_y_ht04" and upper "ub_y_ht04" bounds associated with the probabilities calculated using the sample conditioned on the second variable.
Median "med_ht04", lower "lb_ht04" and upper "ub_ht04" bounds associated with the original dataset (without any declustering).
For the WT13 model: Median "med_x_wt13", lower "lb_x_wt13" and upper "ub_x_wt13" bounds associated with the probabilities calculated using the sample conditioned on the first variable.
Median "med_y_wt13", lower "lb_y_wt13" and upper "ub_y_wt13" bounds associated with the probabilities calculated using the sample conditioned on the second variable.
Median "med_wt13", lower "lb_wt13" and upper "ub_wt13" bounds associated with the original dataset (without any declustering).
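As an illustration of how a median with lower and upper bounds of this kind could be produced, the following minimal R sketch bootstraps the empirical probability of a single survival region. It is not the package's implementation: the data are synthetic, the point (x0, y0) and the helper emp_surv_prob() are hypothetical, a simple bootstrap with replacement stands in for the block and monthly schemes, and alpha is assumed here to give a 100(1 - alpha)% interval.

```r
# Illustrative sketch only: synthetic data and hypothetical names,
# not code taken from the package.
set.seed(1)
n <- 2000
x <- rnorm(n)
y <- 0.6 * x + sqrt(1 - 0.6^2) * rnorm(n)  # positively dependent pair

# Empirical probability of the survival region {X > x0, Y > y0}
emp_surv_prob <- function(x, y, x0, y0) {
  mean(x > x0 & y > y0)
}

# Hypothetical point standing in for one point on a return curve
x0 <- 1.5
y0 <- 1.5

# Simple bootstrap with replacement (assumption: the real procedure
# may use block or monthly resampling instead)
n_boot <- 200
alpha <- 0.1
boot_probs <- replicate(n_boot, {
  idx <- sample.int(n, size = n, replace = TRUE)
  emp_surv_prob(x[idx], y[idx], x0, y0)
})

# Median and lower/upper bounds of the bootstrap distribution
summary_probs <- quantile(boot_probs, probs = c(alpha / 2, 0.5, 1 - alpha / 2))
names(summary_probs) <- c("lb", "med", "ub")
print(summary_probs)
```

If the curve point were well estimated, the probability attached to the curve would be expected to fall between the "lb" and "ub" values for most points.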
The HT04 model is fit to two conditional samples. One sample comprises the declustered time series of the first variable paired with concurrent values of the other variable. The second sample is obtained in the same way but with the variables reversed. The empirical probabilities are calculated using these two conditional samples and the original dataset (without any declustering). The return period should be chosen so that there are sufficient data for estimating empirical probabilities while the curve remains sufficiently 'extreme'. For example, the fit could be assessed using the 1-year return period curve rather than the 100-year return period curve.
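The trade-off in choosing the return period can be made concrete with a rough count of how many observations are expected to fall in a survival region. The sketch below assumes that the probability attached to the curve is p = 1 / (rp * mu); that relationship, the helper expected_points() and the 71-year record length are assumptions made for illustration rather than details taken from the package.

```r
# Illustrative sketch only: expected_points() is a hypothetical helper.
# Assumption: the joint exceedance probability of the curve is 1 / (rp * mu).
expected_points <- function(rp, mu, n_obs) {
  p <- 1 / (rp * mu)  # probability associated with the curve
  n_obs * p           # expected number of observations per survival region
}

mu <- 365.25          # daily data
n_years <- 71         # hypothetical record length in years
n_obs <- n_years * mu # number of observations, ignoring missing values

n_1yr <- expected_points(rp = 1, mu = mu, n_obs = n_obs)
n_100yr <- expected_points(rp = 100, mu = mu, n_obs = n_obs)
print(c(rp_1 = n_1yr, rp_100 = n_100yr))
```

Under these assumptions the 1-year curve leaves roughly 71 observations per survival region, whereas the 100-year curve leaves fewer than one, which is too few for a meaningful empirical estimate.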
#Data starts on first day of 1948
head(S22.Detrend.df)
#Dataframe ends on 2019-02-03
tail(S22.Detrend.df)
#Adding dates to complete final month of combined records
final.month = data.frame(seq(as.Date("2019-02-04"),as.Date("2019-02-28"),by="day"),NA,NA,NA)
colnames(final.month) = c("Date","Rainfall","OsWL","Groundwater")
S22.Detrend.df.extended = rbind(S22.Detrend.df,final.month)
#Run the return curve diagnostic
return_curve_diag(data=S22.Detrend.df.extended[,1:3],
q=0.985,rp=1,mu=365.25,n_sim=100,
n_grad=50,n_boot=100,boot_method="monthly",
boot_replace=NA, block_length=NA, boot_prop=0.8,
decl_method_x="runs", decl_method_y="runs",
window_length_x=NA,window_length_y=NA,
u_x=0.95, u_y=0.95,
sep_crit_x=36, sep_crit_y=36,
alpha=0.1, x_lab=NA, y_lab=NA,
boot_method_all="block", boot_replace_all=NA,
block_length_all=14)