In martindurocher/floodStat: FloodNet Regional Frequency Analysis

Introduction

In this document I will show how to use floodnetRfa to perform at-site flood frequency analysis using annual maximums discharge. First we will extract the annual maximums using the function ExtractAmax from the daily discharge of the Saint-John River at Fort Kent (NB). Note that incomplete years, i.e. with fewer than 365 days of observation are removed. A total 88 annual maximums are extracted.

library(floodnetRfa)

## Extract the annual maximums
anData <- ExtractAmax(flow ~ date, flowStJohn, tol = 365)

The annual maximums are assumed to be independent and identically distributed. Risk is measured in terms of return period $T$ that characterizes the average waiting time separating two events of the same magnitude. In practice this is equivalent to evaluate the flood quantile associated with the probability $p = 1-1/T$. The test of Mann-Kendall is frequently performed to verify if the data do not contain significant trends. This test can be performed using the function MKendall, the present data have a p-value superior to 0.05, which does not suggest the presence of a trend. In presence of autocorrelation, the significance of the test is evaluated using block bootstrap (Önöz and Bayazit, 2012) where the lenght of the blocks are one more than the last significant lag.

## Graph of the trend in annula maximums
plot(flow~date, anData)
fit <- with(anData, smooth.spline(date,flow))
lines(predict(fit), col = 'red', lwd = 2)
abline(h = mean(anData$flow), lty = 3, col = 'blue')

## Outcome of Mann-Kendall test
MKendall(anData$flow)

Estimation of the flood quantiles

According to extreme value theory, as the number of annual maximums increase, their distribution converge to a Generalized Extreme Value (GEV) distribution $$ F(x) = \exp\left{ - \left[ 1 - \kappa \left(\frac{x-\xi}{\alpha} \right) \right]^{1/\kappa} \right} $$

The GEV distribution can be fitted using the FitAmax function. The example below shows how the parameters are estimated using the maximum likelihood method. A summary of the estimated parameter, their standard deviation and the sample L-moments is returned.

fitMle <- FitAmax(anData$flow, 'gev', method = 'mle')
print(fitMle)

It is also possible to use the L-moment method to estimate the parameters. Note that by default FitAmax compute the covariance matrix of the parameters. For the maximum likelihood method the covariance matrix is derived from the evaluation of hessian matrix at the optimum, while for the L-moment estimator a parametric bootstrap is used. Note that a correction is made to ensure that the covariance matrix is positive definite when bootstrap is used. See function nearPD. The evaluation of the covariance matrix can be avoid by specifying the arguments varcov = FALSE and In this case, the standard error won't be available.

## Fit GEV distribution using L-moments
fitLmm <-FitAmax(anData$flow, 'gev', method = 'lmom', varcov = FALSE)

The flood quantile of the GEV distribution can be obtained from the formula

$$ x_T = \mu + \frac{\alpha}{\xi}\left[ 1+\log(1/T)^\kappa \right] . $$ These predictions are computed using the predict function. By default return period of 2, 5, 10, 20, 50 and 100 years are provided. The example below shows how to obtain the flood quantile of return periods 10 and 100 years using the argument q = 1-1/T. The standard deviation of the flood quantiles is estimated using the Delta method when requested by se = TRUE. The latter is based on a normal approximation using the estimated covariance matrix. Similarly, confident intervals can also be evaluated using the standard deviation derived from the delta method ci = 'delta'.

## Flood quantiles with uncertainty
predict(fitMle, q = c(.9,.99), se = TRUE, ci = 'delta', alpha = .1)

Confident intervals can also be obtained using parametric bootstraps with the option ci = 'boot'. The argument out.matrix is useful in this case as it requests that the bootstrap samples of the parameters and the flood quantiles be returned.

out <- predict(fitLmm, q = c(.9,.99), ci = 'boot', nsim = 100, out.matrix = TRUE)

## Estimated Flood quantiles
print(out$pred)

## Bootstraps sample of model parameters
out$para[1:3,]

## Bootstraps sample of flood quantiles
print(out$qua[1:3,])

Verification of the model

The return level plot provides a visual assessment of the fitted distribution by comparing the sampled and estimated flood quantiles. The graphic below shows a good agreement between the two. The confident intervals are obtained using normal approximation based on the delta method.

## Return level plot
plot(fitMle, ci = TRUE)

Another diagnostic to ensure that the GEV distribution is appropriate is the Anderson-Darling test that verifies if the hypothesis that the observations were generated by a GEV distribution. A p-value superior to 0.05 indicates that here the hypothesis of a GEV cannot be rejected at this significant level.

## Anderson-Darling goodness of fit test
GofTest(fitLmm, nsim = 500)

Selection of the best distribution

The GEV distribution is not the only distribution used in flood frequency analysis. Frequent alternative candidates are the shifted log-normal distribution ('ln3'), the Generalized logistic distribution ('glo') and the Pearson type III distribution ('pe3'). The latter is generally used at the logarithm scale (log-Pearson). These distribution can be fitted by the function FitAmax identically to the GEV.

## 
FitAmax(anData$flow, 'ln3', method = 'lmom', varcov =  FALSE)

A best distribution can be selected using the Akaike Information Criterion (AIC) as returned by the function AIC. However, the process of selecting the best ditribution can be simplify by the function FitAmax.auto, that will try all candidates and return only the best fit. In this exemple below, all candidates have similar AIC, which indicates that all distributions represent reasonable choices. The argument tol.gev can be used to ensure that the selected candidate has a AIC greater than the GEV distribution by a minimal margin, otherwise GEV will be prefered.

## Function that compute the AIC for a given distribution
FAIC <- function(d)
   AIC(FitAmax(anData$flow, d, method = 'mle', varcov =  FALSE))

## AIC of all distribution
sapply(c('gev','glo','ln3','pe3'), FAIC)

## Automatic selection of the distribution
FitAmax.auto(anData$flow, method = 'mle', tol.gev = 2)

Conclusion

In this document we showed how to evaluate flood quantiles using annual maximum river discharge. Initially, the extraction of the annual maximums (ExtractAmax) and the assessment of a constant mean (MKendall) were discussed. Instructions were provided to estimate (FitAmax) the model for different distributions and to select a best distribution using the AIC (FitAmax.auto). The function to obtain the flood quantiles and their uncertainties (predict) was presented. Finally, the return level plot (plot) was presented as a common way to visualize the quality of the fitting. For a general introduction to the technical aspect of modeling annual maximums, the reader is referred to Coles (2001) or Hosking and Wallis (1997).

References

Coles, S., 2001. An introduction to statistical modeling of extreme values. Springer Verlag.

Hosking, J.R.M., Wallis, J.R., 1997. Regional frequency analysis: an approach based on L-moments. Cambridge Univ Pr.

Önöz, B., Bayazit, M., 2012. Block bootstrap for Mann–Kendall trend test of serially dependent data. Hydrol. Process. 26, 3552–3560. https://doi.org/10.1002/hyp.8438

martindurocher/floodStat documentation built on May 31, 2019, 12:42 a.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

Tweet to @rdrrHQ

GitHub issue tracker

ian@mutexlabs.com