fit.gev: Improved maximum-likelihood fit of the GEV distribution

View source: R/opti.R

fit.gevR Documentation

Improved maximum-likelihood fit of the GEV distribution

Description

This function fits the Generalized Extreme Value (GEV) distribution to the supplied data, which has to be composed of block maxima (preferably without trend and correlations). The determination of the starting point of the optimization and the calculation of the return level and the all the corresponding estimates of the fitting errors will be done internally.

Usage

fit.gev(x, initial = NULL, likelihood.function = likelihood,
  gradient.function = likelihood.gradient, error.estimation = c("MLE",
  "MC", "bootstrap", "none"), monte.carlo.sample.size = 100,
  bootstrap.sample.size = 100, return.period = 100,
  extreme.type = c("max", "min"), blocking = FALSE,
  block.number = NULL, block.length = NULL, silent = TRUE,
  debug = FALSE, mc.cores = NULL, ...)

Arguments

x

Either an object of class xts or a list of those. The time series can be provided in two different formats: as a series of block maxima (possibly calculated using the block function) and the blocking argument set to FALSE or as the raw time series with blocking set to TRUE. In the later case the blocking will be performed inside the fit.gev function.

initial

Initial values for the GEV parameters. Has to be provided as 3x1 vector. If NULL the parameters are estimated using likelihood.initials. If the shape parameter is set to 0 the exponential distribution instead of the GEV one is fitted. But this its strongly discouraged to do so! Default = NULL.

likelihood.function

Function, which is going to be optimized. Default: likelihood

gradient.function

If NULL a finite difference method is invoked. Default: likelihood.gradient

error.estimation

Method for calculating the standard errors of the fitted results. The errors of the GEV parameters will be calculated as the square roots of the diagonal elements of the inverse of the hessian matrix. The latter will be evaluated at the maximum likelihood estimates (MLE) of the GEV parameters.

MLE: The standard error of the return level is calculated using the Delta method and the maximum likelihood estimates of the GEV parameters. Note: For positive shape parameters bigger than 0.3 this approach tends to highly overestimates the errors of the return levels.

MC: Alternative one can use a Monte Carlo method for which monte.carlo.sample.size samples of the same size as x will be drawn from a GEV distribution constituted by the obtained MLE of the GEV parameters of x. The standard error is then calculated via the square of the variance of all fitted GEV parameters and calculated return levels. Note: In its essence this approach is not an estimation of the error involved in fitting the time series to a GEV distribution. It is rather the mean error of fitting a GEV distribution with the same length and parameters as estimated ones.

bootstrap: Using this option the provided time series x will be sampled with replacement bootstrap.sample.size times and with the same length as the original time series. The standard errors of the GEV parameters and return levels of all those sampled series is calculated and returned as an estimate of the fitting error. Note: Since the data is (hopefully) GEV-distributed, such a sampling has to be treated with a lot of care.

Sometimes the inversion of the hessian fails (since the are some NaN in the hessian) when calculating the error estimates using the maximum likelihood approach (MLE) (which is also the reason why the ismev package occasionally does not work). In such cases the Monte Carlo (MC) method is used as a fallback.

none skips the calculation of the error. Default = "MLE".

monte.carlo.sample.size

Number of samples used to obtain the Monte Carlo estimate of the standard error of the fitting. Default = 100.

bootstrap.sample.size

Number of samples with replacements to drawn from the original series x in order to determine the standard errors for the GEV parameters and return levels. Default = 100.

return.period

Quantiles at which the return level is going to be evaluated. Class "numeric". Default = 100.

extreme.type

String specifying whether the maxima ("max") or minima ("min") of each block should be fitted. If the minima are chosen, the input x has to be either a time series of block minima with the blocking argument set to zero or the raw series and fit.gev function will handle the extraction of the minima internally. For the minima the resulting extremes are multiplied by -1 and fitted using a default GEV distribution. The resulting scale and shape parameter are handed back without any additional changes and the location parameter and return levels are multiplied by -1. Default = "max".

blocking

Logical value indicating whether or not the input data x should be split into blocks of equal size using the block function and only the maxima or minima should be extracted. If any of the arguments block.number or block.length is provided with a non NULL value, the blocking will be performed regardless of the value of the blocking argument. If, on the other hand, blocking is TRUE but neither block.number nor block.length was provided, the time series will be split into annual blocks. Default = FALSE.

block.number

Specifies the number of blocks the input data is going to be separated in. Default = NULL.

block.length

Length of the blocks. For the sake of simplicity the last block is not forced to match the length of the other plots. Default = NULL.

silent

Determines whether or not warning messages shall be displayed and results shall be reported. Default = FALSE.

debug

If TRUE the routine will display debugging information featuring intermediate steps of the constrained optimization. Default = FALSE.

mc.cores

A numerical input specifying the number of cores to use for the multi core application of the function (see detectCores). This functionality is only available if the input is a list of different objects. If NULL, the function will be calculated classically with only one core. Default = NULL.

...

Additional arguments for the optim function.

Details

The optimization is performed by the augmented Lagrangian method using the auglag function of the alabama package. Within this framework the log-likelihood function of the GEV distribution gets augmented with N+2 constraints, where N is the number of points in the time series. N+1 of those constraints ensure the log-likelihood (containing two logarithms) to be always defined. The remaining constraints ensures for the shape parameter to be always bigger than -1 for the maximum likelihood to be defined in the first place. The penalty in the log-likelihood function is the sum of all squared constrain violations plus an additional term linear in the constraint violation to ensure well-conditioning. Using this penalty term the problem becomes unconstrained again and can be solved using optim. After each of those inner routines the weighting parameter of the penalty is being increased until some convergence conditions are fulfilled.

Since it usually takes just four to five outer iterations this functions needs only double the time a pure call to the optim function would need.

The negative log-likelihood of the Gumbel distribution is just fitted if the shape parameter is exactly equal to zero.

If the user instead wants to fit just the Gumbel distribution and not the entire GEV distribution, the shape parameter of the initial has to be set to 0. But in practice this is strongly discouraged since it will yield inferior results.

I found the Nelder-Mead method to be more robust to starting points more far away from the global optimum. This also holds for the inner routine of the augmented Lagrangian method. Since other routines, like CG and BFGS only cause problems in the extreme value analysis, there won't be an option to choose them in this package.

This function can also be applied to a list of xts class objects.

Value

Output of the optim function with class c( "list", "climex.fit.gev" )

  • par = MLE of the GEV parameters

  • value = Value of the negative log-likelihood evaluated at the MLE

  • counts = Number of evaluations of the likelihood function and its gradient during optimization (inner routine)

  • outer.iteration = Number of updates of the penalty and the Lagrangian parameter to fine-tune the impact of the constraints on the optimization (outer routine)

  • return.level = Estimate of the return levels at the provided return periods

  • se = Standard error of the GEV parameters and the return levels

  • x = Original time series

  • control = Parameter and options used during optimization

If, on the other hand, a list of xts class object was provided, a list of objects structured as describe above is returned.

Author(s)

Philipp Mueller

See Also

Other optimization: fit.gev.default, fit.gev.list, fit.gev.xts, fit.gpd.default, fit.gpd.list, fit.gpd.xts, fit.gpd, likelihood.augmented, likelihood.gradient.augmented, likelihood.gradient, likelihood.initials, likelihood

Examples

  potsdam.anomalies <- anomalies( temp.potsdam )
  potsdam.blocked <- block( potsdam.anomalies )
  fit.gev( potsdam.blocked )

theGreatWhiteShark/climex documentation built on July 13, 2022, 9:11 a.m.