Extreme value modelling
Description
Likelihood based modelling and inference for extreme value models, possibly with explanatory variables.
Usage
1 2 3 4 5 6 7 8 9  evm(y, data, family = gpd, ...)
## Default S3 method:
evm(y, data, family = gpd, th = Inf, qu, ...,
penalty = NULL, prior = "gaussian", method = "optimize",
cov = "observed", start = NULL, priorParameters = NULL, maxit = 10000,
trace = NULL, iter = 40500, burn = 500, thin = 4,
proposal.dist = c("gaussian", "cauchy"), jump.cov, jump.const = NULL,
R = 1000, cores = NULL, verbose = TRUE)

Arguments
y 
Either a numeric vector or the name of a variable in 
data 
A data frame containing 
family 
An object of class 'texmexFamily'. Defaults to

... 
In 
th 
For threshold excess models (such as when 
qu 
An alternative to 
penalty 
How to penalize the likelhood. Currently, either "none"",
"gaussian"" or "lasso"" are the only allowed values. If 
prior 
If 
method 
Should be either "optimize" (the default), "simulate" or
"bootstrap". The first letter or various abbreviations will do. If
'optimize' is used, the (penalized) likelihood is directly optimized using

cov 
How to compute the covariance matrix of the parameters. Defaults
to In some cases, particularly with small samples, the numerical approximation
can be quite different from the closed form ( 
start 
Starting values for the parameters, to be passed to

priorParameters 
A list with two components. The first should be a
vector of means, the second should be a covariance matrix if the
penalty/prior is "gaussian" or "quadratic" and a diagonal precision matrix
if the penalty/prior is "lasso", "L1" or "Laplace". If 
maxit 
The number of iterations allowed in 
trace 
Whether or not to print progress to screen. If 
iter 
Number of simulations to generate under 
burn 
The number of initial steps to be discarded. Defaults to 500. 
thin 
The degree of thinning of the resulting Markov chains. Defaults to 4 (one in every 4 steps is retained). 
proposal.dist 
The proposal distribution to use, either multivariate gaussian or a multivariate Cauchy. 
jump.cov 
Covariance matrix for proposal distribution of Metropolis
algorithm. This is scaled by 
jump.const 
Control parameter for the Metropolis algorithm. 
R 
The number of parametric bootstrap samples to run when 
cores 
The number of cores to use when bootstrapping. Defaults to

verbose 
Whether or not to print progress to screen. Defaults to

Details
The main modelling function is evm
(extreme value model) and the
distribution to be used is specified by passing an object of class
texmexFamily
to the family
argument.
The default texmexFamily
object used by evm
is gpd
.
Currently, the other texmexFamily
objects available are gev
which results in fitting a generalized extreme value (GEV) distribution to
the data, and egp3
which fits the extended generalized Pareto
distribution version 3 of Papastathopoulos and Tawn (2013).
See Coles (2001) for an introduction to extreme value modelling and the GPD and GEV models.
For the GPD model, we use the following parameterisation of evm:
P(Y ≤ y) = 1  ≤ft(1 + \frac{ξ y}{σ}\right)^{1/ξ}
for y ≥ 0 and 1 + ξ y / σ ≥ 0.
For the GEV model, we use:
P(Y ≤ y) = \exp ≤ft\{ ≤ft[ 1 + ξ ≤ft( \frac{y  μ}{σ}\right) \right] ^{1/ξ} \right\}
In each case, the scale parameter is sigma (σ) and the shape parameter is xi (ξ). The GEV distribution also has location parameter mu (μ). See Papastathopoulos and Tawn (2013) for specification of the EGP3 model.
Working with the log of the scale parameter improves the stability of computations, makes a quadratic penalty more appropriate and enables the inclusion of covariates in the model for the scale parameter, which must remain positive. We therefore work with φ=log(σ). All specification of priors or penalty functions refer to φ rather than σ. A quadratic penalty can be thought of as a Gaussian prior distribution, whence the terminology of the function.
Parameters of the evm are estimated by using maximum (penalized) likelihood
(method = "optimize"
), or by simulating from the posterior
distribution of the model parameters using a Metropolis algorithm
(method = "simulate"
). In the latter case, start
is used as a
starting value for the Metropolis algorithm; in its absence, the maximum
penalized likelhood point estimates are computed and used.
A boostrap approach is also available (method = "bootstrap"
). This
runs a parametric bootstrap, simulating from the model fit by optimization.
When method = "simulate"
the print
and summary
functions give posterior means and standard deviations. Posterior means are
also returned by the coef
method. Depending on what you want to do
and what the posterior distributions look like (use plot
method) you
might want to work with quantiles of the posterior distributions instead of
relying on standard errors.
When method = "bootstrap"
, summaries of the bootstrap distribution
and the bootstrap estimate of bias are displayed.
Value
If method = "optimize"
, an object of class evmOpt
:
call 
The call to 
data 
The original data (above and below the threshold for fitting if
a distribution for threshold excesses has been used). In detail, 
convergence 
Output from

message 
A message telling the user whether or not convergence was achieved. 
threshold 
The threshold of the data above which the evmSim model was fit. 
penalty 
The type of penalty function used, if any. 
coefficients 
The parameter estimates as computed under maximum likelihood or maximum penalized likelihood. 
rate 
The proportion of observations above the threshold. If the model is not a threshold exceedance model (e.g. the GEV model), the rate will be 1. 
priorParameters 
See above. 
residuals 
Residuals computed using the residual function in the

ploglik 
The value of the optimized penalized loglikelihood. 
loglik 
The value of the
optimized (unpenalized) loglikelihood. If 
cov 
The estimated covariance of the parameters in the model. 
se 
The estimated standard errors of the parameters in the model. 
xlevels 
A named list
containing a named list for each design matrix (main parameter) in the
model. Each list contians an element named after each factor in the linear
predictor for the respective design matrix. These are used by the

If method = "simulate"
, an object of class evmSim
:
call 
The call to 
threshold 
The threshold above which the model was fit. 
map 
The point estimates found by maximum penalized likelihood and
which were used as the starting point for the Markov chain. This is of
class 
burn 
The number of steps of the Markov chain that are to be treated as the burnin and not used in inferences. 
thin 
The degree of thinning used. 
chains 
The entire Markov chain generated by the Metropolis algorithm. 
y 
The response data above the threshold for fitting. 
seed 
The seed used by the random number generator. 
param 
The remainder of the chain after deleting the burnin and applying any thinning. 
If method = "bootstrap"
, an object of class evmBoot
:
call 
The call to 
replicates 
The parameter estimates from the bootstrap fits. 
map 
The fit by by maximum penalized likelihood to the orginal data.
This is of class 
There are summary, plot, print, residuals and coefficients methods available for these classes.
Note
For both GPD and GEV models, when there are estimated values of ξ ≤ 0.5, the regularity conditions of the likelihood break down and inference based on approximate standard errors cannot be performed. In this case, the most fruitful approach to inference appears to be by the bootstrap. It might be possible to simulate from the posterior, but finding a good proposal distribution might be difficult and you should take care to get an acceptance rate that is reasonably high (around 40% when there are no covariates, lower otherwise).
Author(s)
Janet E. Heffernan, Harry Southworth. Some of the internal code is
based on the gpd.fit
function in the ismev
package and is due
to Stuart Coles.
References
S. Coles. An Introduction to Statistical Modelling of Extreme Values. Springer, 2001.
I. Papastathopoulos and J. A. Tawn, Extended generalised Pareto models for tail estimation, Journal of Statistical Planning and Inference, 143, 131  143, 2013.
See Also
rl.evmOpt
, predict.evmOpt
,
evm.declustered
.
Examples
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 