In martindurocher/floodStat: FloodNet Regional Frequency Analysis

Introduction

In this document I will show how to use floodnetRfa to perform at-site flood frequency analysis using annual maximums discharge. First we will extract the annual maximums using the function ExtractAmax from the daily discharge of the Saint-John River at Fort Kent (NB). Note that incomplete years, i.e. with fewer than 365 days of observation are removed. A total 88 annual maximums are extracted.

library(floodnetRfa)

## Extract the annual maximums
anData <- ExtractAmax(flow ~ date, flowStJohn, tol = 365)

The annual maximums are assumed to be independent and identically distributed. Risk is measured in terms of return period $T$ that characterizes the average waiting time separating two events of the same magnitude. In practice this is equivalent to evaluate the flood quantile associated with the probability $p = 1-1/T$. The test of Mann-Kendall is frequently performed to verify if the data do not contain significant trends. The present data have a p-value superior to 0.05, which does not suggest the presence of a trend.

## Graph of the trend in annula maximums
plot(flow~date, anData)
fit <- with(anData, smooth.spline(date,flow))
lines(predict(fit), col = 'red', lwd = 2)
abline(h = mean(anData$flow), lty = 3, col = 'blue')

## Outcome of Mann-Kendall test
MKendall(anData$flow)

Estimation of the flood quantiles

According to extreme value theory, as the number of annual maximums increase, their distribution converge to a Generalized Extreme Value (GEV) distribution $$ F(x) = \exp\left{ - \left[ 1 - \kappa \left(\frac{x-\xi}{\alpha} \right) \right]^{1/\kappa} \right} $$

The GEV distribution can be fitted using the FitAmax function. The example below shows how the parameters are estimated using the maximum likelihood method. A brief summary of the fitted model is reported by the print function. It including the estimated parameter, their standard deviation and the sample L-moments.

fitMle <- FitAmax(anData$flow, 'gev', method = 'mle')
print(fitMle)

It is also possible to use the L-moment method to estimate the parameters. Note that by default FitAmax compute the covariance matrix of the parameters. For the maximum likelihood method the covariance matrix is derived from the evaluation of hessian matrix at the optimum. For the L-moment estimator, a parametric bootstrap is used and a correction is made to ensure that the covariance matrix is positive definite. The evaluation of the covariance matrix can be turned down by the arguments varcov = FALSE to speed up computation. In this case, the standard error won't be available.

## Fit GEV distribution using L-moments
fitLmm <-FitAmax(anData$flow, 'gev', method = 'lmom', varcov = FALSE)

The flood quantile of the GEV distribution can be obtained from the formula

$$ x_T = \mu + \frac{\alpha}{\xi}\left[ 1+\log(1/T)^\kappa \right] . $$ These predictions are computed using the predict function. By default return period of 2, 5, 10, 20, 50 and 100 years are provided. The example below shows how to obtain the flood quantile of return periods 10 and 100 years using the argument q = 1-1/T. The standard deviation of the flood quantiles is estimated using the Delta method when requested by se = TRUE. The latter is based on a normal approximation using the estimated covariance matrix. Confident intervals can be evaluated using the standard deviation derived from the delta method ci = 'delta'. By default the confident interval has a significant level alpha = 0.05.

## Flood quantiles with uncertainty
predict(fitMle, q = c(.9,.99), se = TRUE, ci = 'delta', alpha = .1)

Confident intervals can also be obtained using parametric bootstraps with the option ci = 'boot'. The argument out.matrix is useful in this case as it requests that the bootstrap samples of the parameters and the flood quantiles be returned.

out <- predict(fitLmm, q = c(.9,.99), ci = 'boot', nsim = 100, out.matrix = TRUE)

## Estimated Flood quantiles
print(out$pred)

## Bootstraps sample of model parameters
out$para[1:3,]

## Bootstraps sample of flood quantiles
print(out$qua[1:3,])

Verification of the model

The return level plot provides a visual assessment of the fitted distribution by comparing the sampled and estimated flood quantiles. The graphic below shows a good agreement between the two. The confident intervals are obtained using normal approximation based on the delta method.

## Return level plot
plot(fitMle, ci = TRUE)

Another diagnostic to ensure that the GEV distribution is appropriate is the Anderson-Darling test that verifies the hypothesis that the observations were generated by a given distribution. A p-value superior to 0.05 indicates that here the hypothesis of a GEV cannot be rejected at this significant level.

## Anderson-Darling goodness of fit test
GofTest(fitLmm, nsim = 500)

Selection of the best distribution

The GEV distribution is not the only distribution used in flood frequency analysis. Frequent alternative candidates are the shifted log-normal distribution ('ln3'), the Generalized logistic distribution ('glo') and the Pearson type III distribution ('pe3'). The latter is generally used on the logarithm of the observation (log-Pearson). These distribution can be fitted by the function FitAmax identically to the GEV.

## 
FitAmax(anData$flow, 'ln3', method = 'lmom', varcov =  FALSE)

A best distribution can be selected using the Akaike Information Criterion (AIC) as returned by the function AIC. However, the process of selecting the best ditribution can be simplify by the function FitAmax.auto, that will try all candidates and return only the best fit. In this exemple below, all candidates have similar AIC, which indicates that all distributions represent reasonable choices. The argument tol.gev can be used to ensure that the selected candidate has a AIC greater than the GEV distribution by a minimum margin, otherwise GEV will be prefered

## Function that compute the AIC for a given distribution
FAIC <- function(d)
   AIC(FitAmax(anData$flow, d, method = 'mle', varcov =  FALSE))

## AIC of all distribution
sapply(c('gev','glo','ln3','pe3'), FAIC)

## Automatic selection of the distribution
FitAmax.auto(anData$flow, method = 'mle', tol.gev = 2)

Conclusion

In this document we showed how to evaluate flood quantiles using annual maximum discharge. Initially, the extraction of the annual maximums (ExtractAmax) and the assessment of a constant mean (MKendall) were discussed. Instructions were provided to estimate (FitAmax) the model for different distributions and to select a best distribution using the AIC (FitAmax.auto). The function to obtain the flood quantiles and their uncertainties (predict) was presented. Finally, the return level plot (plot) was indicated as a common way to visualize the quality of the fitting. For a general introduction to the technical aspect of modeling annual maximums, the reader is referred to Coles (2001) or Hosking and Wallis (1997).