Extreme value statistics on a linear scale


Fit (via linear moments), plot (on a linear scale) and compare (by goodness of fit) several (extreme value) distributions to estimate discharge at given return periods.
This package relies heavily on, and gratefully acknowledges, the package lmomco by WH Asquith.
Open the vignette for an introduction to the package: vignette("extremeStat")


The common object shared between functions is a list (dlf) with the following elements:

dat: numeric vector with (extreme) values
datname: character string for main, xlab etc.
gofProp: number between 0 and 1; upper proportion of dat from which goodness of fit is computed
parameter: list (usually of length 17 if speed=TRUE) with the parameters of each distribution
gof: dataframe with 'Goodness of Fit' measures, sorted by the RMSE between theoretical and empirical cumulative distribution
returnlev: dataframe with values of the distributions for given return periods (RPs). This element is only added in distLextreme
RP___: return periods according to plotting positions, see below
coldist: colors for plotting, added in distLplot
truncate: truncation percentage, only relevant for distLquantile
quant: quantile estimates from distLquantile

It can be printed with distLprint, which may later be turned into a class with a print method.
Plotting positions are not used for fitting distributions, only for plotting.
The ranks of the ascendingly sorted extreme values are used to compute the probability of non-exceedance Pn:
Pn_w <- Rank/(n+1) # Weibull
Pn_g <- (Rank-0.44)/(n+0.12) # Gringorten (taken from lmom:::evplot.default)
Finally: RP = return period = recurrence interval = 1/P_exceedance = 1/(1-P_nonexceedance), thus:
RPweibull = 1/(1-Pn_w), and analogously for Gringorten.
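
The plotting position formulas above can be tried out in base R without the package. The following is a minimal sketch: the sample values are made up for illustration, and the variable names x and RPgringorten are assumptions, not objects provided by extremeStat.

```r
# Hypothetical annual maxima (illustrative values only)
x <- c(61.5, 77.0, 37.0, 69.3, 75.6, 74.9, 43.7, 50.8, 55.6, 84.1)
n <- length(x)

# Ranks of the ascendingly sorted values: 1..n when there are no ties
Rank <- rank(sort(x))

# Probability of non-exceedance for both plotting position methods
Pn_w <- Rank / (n + 1)              # Weibull
Pn_g <- (Rank - 0.44) / (n + 0.12)  # Gringorten

# Return period = 1 / probability of exceedance
RPweibull    <- 1 / (1 - Pn_w)
RPgringorten <- 1 / (1 - Pn_g)

# Side-by-side view: the largest value gets the longest return period
print(data.frame(value = sort(x), RPweibull, RPgringorten))
```

With the Weibull formula the largest of n values is always assigned a return period of n+1, while the Gringorten formula assigns it a longer one.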

The main functions in the extremeStat package are:

distLextreme: analyse extreme value statistics; calls distLfit and distLextremePlot.
distLextremePlot: plot distribution lines and plotting positions.
distLfit: fit the parameters; calls distLgof and distLplot.
distLplot: plot density or cumulative distribution of data and distributions.
distLgof: calculate goodness of fit; calls distLgofPlot. Can also be run on an existing dlf to save computing time by not fitting the parameters again.
distLgofPlot: compare distribution ranks of different distLgof methods.
distLquantile: compute parametric quantile estimates; calls distLfit.
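
The idea behind this workflow (fit by L-moments, rank by RMSE between theoretical and empirical cumulative distribution, then read off return levels) can be sketched in base R for two distributions. This is only an illustration under stated assumptions, not the package's implementation: the data are made up, the helpers p_gumbel, q_gumbel and rmse are hypothetical, and the empirical distribution is taken from ecdf. The package itself delegates the fitting to lmomco and handles many more distributions.

```r
# Hypothetical annual maxima (illustrative values, not the annMax data set)
x <- sort(c(61.5, 77.0, 37.0, 69.3, 75.6, 74.9, 43.7, 50.8, 55.6, 84.1,
            48.2, 66.4, 58.9, 92.3, 53.1))
n <- length(x)

# First two sample L-moments via probability-weighted moments
b0 <- mean(x)
b1 <- sum((seq_len(n) - 1) / (n - 1) * x) / n
l1 <- b0           # L-location
l2 <- 2 * b1 - b0  # L-scale

# Gumbel parameters from L-moments
euler     <- -digamma(1)  # Euler-Mascheroni constant, about 0.5772
gum_scale <- l2 / log(2)
gum_loc   <- l1 - euler * gum_scale
p_gumbel  <- function(q) exp(-exp(-(q - gum_loc) / gum_scale))  # CDF
q_gumbel  <- function(p) gum_loc - gum_scale * log(-log(p))     # quantiles

# Normal parameters from L-moments (for the normal, l2 = sd / sqrt(pi))
nor_mean <- l1
nor_sd   <- sqrt(pi) * l2

# Goodness of fit: RMSE between theoretical and empirical CDF
rmse <- function(a, b) sqrt(mean((a - b)^2))
emp  <- ecdf(x)(x)
gof  <- data.frame(
  distribution = c("gumbel", "normal"),
  RMSE = c(rmse(p_gumbel(x), emp), rmse(pnorm(x, nor_mean, nor_sd), emp))
)
gof <- gof[order(gof$RMSE), ]  # best fit first

# Return levels: quantiles at non-exceedance probability 1 - 1/RP
RP <- c(2, 5, 10, 20, 50)
returnlev <- data.frame(
  RP     = RP,
  gumbel = q_gumbel(1 - 1 / RP),
  normal = qnorm(1 - 1 / RP, nor_mean, nor_sd)
)

print(gof)
print(returnlev)
```

The two distributions usually agree closely for short return periods and diverge for long ones, which is why comparing fits matters before extrapolating.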

Depends on 'berryFunctions' for rmse, rsquare, logAxis, logVals.
Suggests 'pbapply' to see progress bars if you have large (n > 1e3) datasets.
In some places you will find ## not run in the examples. These code blocks are excluded from checks during the build, mainly because they are computationally intensive and should not use up CRAN's resources. You should normally be able to run them in an interactive session.
If you do find code that cannot be executed, please tell me!
This package was motivated by my need to compare the fits of several distributions to data. It was originally triggered by a flood estimation assignment we had in class in 2012, where it bothered me that we simply assumed the Gumbel distribution would fit the data well.
With the updated form of the original function, I think this is a useful package to compare fits.
I am no expert on distributions, so I welcome all suggestions you might have for me.


Berry Boessenkool, berry-b@gmx.de, 2014-2016

See Also

If you are looking for a more detailed (uncertainty) analysis, e.g. confidence intervals, check out the package extRemes, especially the function fevd. http://cran.r-project.org/package=extRemes
Intro slides: http://sites.lsa.umich.edu/eva2015/wp-content/uploads/sites/44/2015/06/Intro2EVT.pdf
Parameter fitting and distribution functions: http://cran.r-project.org/package=lmomco
Distributions: https://www.rmetrics.org/files/Meielisalp2009/Presentations/Scott.pdf and: http://cran.r-project.org/web/views/Distributions.html
R in Hydrology: http://abouthydrology.blogspot.de/2012/08/r-resources-for-hydrologists.html


data(annMax) # annual discharge maxima from a stream in Austria
plot(annMax, type="l")
dle <- distLextreme(annMax)