In martindurocher/floodStat: FloodNet Regional Frequency Analysis

UNFINISHED

Introduction

In this document I will show how to use floodnetRfa to perform regional flood frequency analysis using the index-flood model and pooling groups. The methodology is presented using 45 hydrometric sites in the Atlantic provinces of Canada. The dataset flowAtlantic contains the annual maximum river discharges ams and basic catchment descriptors covariate, i.e. drainage area and mean annual precipitation. First, the Euclidean distance between the catchment descriptors is computed; it will serve as a measure of similarity between sites.

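## Make the elements of flowAtlantic (e.g. ams, covariate) available by name
attach(flowAtlantic)
## Standardize the log-transformed drainage area and mean annual precipitation
covar <- scale(log(covariate[,c('area','map')]))
## Euclidean distance between sites in the descriptor space
distance.covar <- as.matrix(dist(covar))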
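To make the similarity measure concrete, here is a minimal base-R sketch of the same transformation applied to a toy table. The four catchments and their values are invented for illustration and are not part of flowAtlantic.

```r
## Hypothetical descriptors for four catchments (values are made up)
toy <- data.frame(area = c(120, 850, 2300, 40),
                  map  = c(1100, 1250, 980, 1400))
rownames(toy) <- paste0("site", 1:4)

## Log-transform, then standardize each column to mean 0 and sd 1
toy.covar <- scale(log(toy[, c("area", "map")]))

## Symmetric matrix of pairwise Euclidean distances
toy.dist <- as.matrix(dist(toy.covar))

## Sites ordered from most to least similar to site1 (site1 itself comes first)
nearest <- names(sort(toy.dist["site1", ]))
print(nearest)
```

Standardizing after the log transform keeps a descriptor with a wide range, such as drainage area, from dominating the distance.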

Intersite correlation

Important meteorological events can generate floods that are recorded at multiple sites. Such spatial dependence creates intersite correlation that affects the variability of the index-flood model. A first step is therefore to evaluate the correlation between pairs of sites, which can be done with the function Intersite. A Normal copula is employed to describe the pairwise correlation between the sites. The property $$ \rho_{i,j} = \frac{6}{\pi} \arcsin\frac{\theta_{i,j}}{2} $$ allows the correlation coefficient $\theta_{i,j}$ of the Normal copula between the ith and jth sites to be estimated from the Spearman rank correlation $\rho_{i,j}$. An additional correction is applied to ensure that the final estimate is a positive definite matrix (Higham, 2002).
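Inverting the property above gives $\theta_{i,j} = 2\sin(\pi\rho_{i,j}/6)$. The sketch below illustrates that conversion in base R on simulated data. The helper name SpearmanToTheta is hypothetical and is not a floodnetRfa function, and the positive definite correction is not reproduced here (Matrix::nearPD is one available implementation of Higham's algorithm, though whether Intersite relies on it is not stated in this document).

```r
## Hypothetical helper: Normal copula coefficient from a Spearman correlation,
## obtained by inverting rho = (6 / pi) * asin(theta / 2)
SpearmanToTheta <- function(rho) 2 * sin(pi * rho / 6)

## Simulated paired observations for two sites (made-up data)
set.seed(1)
x <- rnorm(50)
y <- 0.6 * x + rnorm(50, sd = 0.8)

rho <- cor(x, y, method = "spearman")
theta <- SpearmanToTheta(rho)

## Round trip: applying the original property to theta recovers rho
rho.back <- (6 / pi) * asin(theta / 2)
print(c(rho = rho, theta = theta, rho.back = rho.back))
```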

The example below shows how to evaluate the intersite correlation for the Atlantic sites. The correlation coefficients are estimated from a limited number of paired observations, which leads to large uncertainties. Alternatively, the example below shows that the intersite correlation can be smoothed by a power exponential model $$ \widehat{\theta}_{i,j} = \begin{cases} (1-\tau) \exp\left(-3 \left|\frac{h}{\gamma}\right|^p \right) & h > 0 \\ 1 & h = 0 \end{cases} $$ that characterizes the strength of the intersite correlation with respect to the distance $h$. The parameter $\tau \in [0,1]$ describes a nugget effect that corresponds to a jump at the origin, and $\gamma$ is the range parameter that represents how far the distance has an effect on the correlation. The last parameter $p$ is fixed and provides additional control on the rate at which the correlation decays with distance. Note that the order of the columns in the distance matrix passed as argument to Intersite must be in the same order as

## Empirical intersite correlation
inter.emp <- Intersite(ams ~ id + year, ams)

## Smoothed intersite correlation
inter.exp <- Intersite(ams ~ id + year, ams, 
                       distance = distance, 
                       smooth = TRUE)

## Plot the empirical and smoothed intersite correlation against distance
pairs.id <- lower.tri(inter.emp$corr)
pairs.emp <- inter.emp$corr[pairs.id]
pairs.exp <- inter.exp$corr[pairs.id]
pairs.distance <- distance[pairs.id]

plot(pairs.distance, pairs.emp, 
     xlab = 'Distance (km)', 
     ylab = 'Intersite correlation')

points(pairs.distance, pairs.exp, col = 'red', pch = 16)
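The power exponential model used for smoothing can be written as a small standalone function, which helps to see the role of each parameter. This is only a sketch of the formula given earlier: the name PowerExpCorr, the default p = 1 and the parameter values are assumptions made for illustration, not the values fitted by Intersite.

```r
## Hypothetical helper implementing the power exponential correlation model
## tau: nugget, gamma: range, p: fixed shape parameter
PowerExpCorr <- function(h, tau, gamma, p = 1) {
  ifelse(h > 0, (1 - tau) * exp(-3 * abs(h / gamma)^p), 1)
}

## Correlation at a few made-up distances
h <- c(0, 50, 100, 200, 400)
theta.hat <- PowerExpCorr(h, tau = 0.2, gamma = 200, p = 1)
print(round(theta.hat, 3))
```

With this parametrization the correlation at $h = \gamma$ equals $(1-\tau)e^{-3}$, roughly 5% of its value just above the origin, which is why $\gamma$ can be read as a practical range.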

Pooling groups using annual maximums

A pooling group is a set of sites formed around a target site that includes its nearby sites. An index-flood model is applied inside a pooling group, where it is assumed that all sites have the same distribution up to a scale factor. The regional distribution can be estimated from the regional L-moments, which are evaluated as the regional average of the at-site sample L-moments. The latter were also estimated by the function Intersite.

The example below computes the regional L-moments and the parameters of the regional distribution for 20 sites forming a pooling group around the site "01AQ001", located on the Lepreau River in southern New Brunswick. In this example the Generalized Extreme Value (GEV) distribution is used as a model for the regional distribution.

PoolGroup(inter.exp, distr = 'gev', nk = 20,
          distance = distance.covar[12,], diagnostic = TRUE)
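To illustrate what a regional average of at-site sample L-moments involves, the following base-R sketch computes the L-coefficient of variation (LCV) and L-skewness of three simulated sites and averages them. The helper SampleLmom, the simulated data and the weighting by record length (the convention of Hosking and Wallis, 1997) are assumptions; the exact weighting used by PoolGroup is not stated in this document.

```r
## Hypothetical helper: mean, LCV and L-skewness of a sample, computed from
## unbiased probability weighted moments
SampleLmom <- function(x) {
  x <- sort(x)
  n <- length(x)
  i <- seq_len(n)
  b0 <- mean(x)
  b1 <- sum((i - 1) / (n - 1) * x) / n
  b2 <- sum((i - 1) * (i - 2) / ((n - 1) * (n - 2)) * x) / n
  l2 <- 2 * b1 - b0
  l3 <- 6 * b2 - 6 * b1 + b0
  c(l1 = b0, lcv = l2 / b0, lsk = l3 / l2)
}

## Three made-up sites sharing a distribution up to a scale factor
set.seed(2)
sites <- list(a = rexp(30, 1 / 100), b = rexp(45, 1 / 250), c = rexp(25, 1 / 60))

## One row per site
atsite <- t(sapply(sites, SampleLmom))
n <- sapply(sites, length)

## Regional L-moment ratios: average of the at-site values weighted by n
regional <- colSums(atsite[, c("lcv", "lsk")] * n) / sum(n)
print(atsite)
print(regional)
```

Because the LCV and L-skewness are ratios, they do not depend on the scale of each site, which is what allows them to be averaged under the index-flood assumption.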

The hypothesis of a common distribution is verified using a heterogeneity measure. The statistic $$ H = \frac{V-\mu_V}{\sigma_V} $$ was proposed by Hosking and Wallis (1997). Here, $V$ represents the dispersion of the at-site L-coefficients of variation (LCV) within the pooling group. The average $\mu_V$ and standard deviation $\sigma_V$ of $V$ are obtained by simulating an index-flood model using a four-parameter kappa distribution. A pooling group with $H < 1$ is described as acceptably homogeneous and one with $H > 2$ as definitely heterogeneous. In between, for $1\leq H \leq2$, the pooling group is said to be possibly heterogeneous. The statistic $H$ does not account for intersite correlation, which could lead to underestimating the variability of the LCV and affect the interpretation of $H$. The statistics cou

PoolGroup(inter.exp, distr = 'gev', nk = 20,
          distance = distance.covar[12,], diagnostic = TRUE,
          pvalue = TRUE)
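The arithmetic behind the heterogeneity measure can be sketched in a few lines of base R. Following Hosking and Wallis (1997), $V$ is taken here as the record-length-weighted standard deviation of the at-site LCVs. The helper names HetV, HetH and HetClass are hypothetical, and the values of $\mu_V$ and $\sigma_V$ are placeholders standing in for the output of the kappa simulation, which is not reproduced.

```r
## Hypothetical helper: weighted standard deviation of the at-site LCVs
HetV <- function(lcv, n) {
  lcv.reg <- sum(n * lcv) / sum(n)
  sqrt(sum(n * (lcv - lcv.reg)^2) / sum(n))
}

## Hypothetical helper: standardized heterogeneity measure
HetH <- function(v, mu.v, sigma.v) (v - mu.v) / sigma.v

## Hypothetical helper: interpretation thresholds quoted in the text
HetClass <- function(h) {
  ifelse(h < 1, "acceptably homogeneous",
         ifelse(h <= 2, "possibly heterogeneous", "definitely heterogeneous"))
}

## Made-up at-site LCVs and record lengths for four sites
lcv <- c(0.20, 0.25, 0.22, 0.30)
n <- c(30, 40, 25, 50)
v <- HetV(lcv, n)

## Placeholder simulation results (assumed values, not from a kappa simulation)
h <- HetH(v, mu.v = 0.03, sigma.v = 0.01)
print(c(V = v, H = h))
print(HetClass(h))
```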

fit <- PoolGroup(inter.exp, nk = 20,
                 distance = distance.covar[12,], diagnostic = TRUE)

fit <- PoolRemove.auto(fit, tol = 1.5)
print(fit)