The exdex
package is a tool for estimating the extremal index.
The extremal index has several interpretations and leading to different models/methods by which inferences about $\theta$ can be made. Here we consider a model based on the behaviour of occurrences of exceedances of a high threshold. The K-gaps model of @SD2010 extends the model of @FS2003 by incorporating a run length parameter $K$. Under this model threshold inter-exceedance times not larger than $K$ are part of the same cluster and other inter-exceedance times have an exponential distribution with rate parameter $\theta$. Thus, $\theta$ has dual role as the probability that a process leaves one cluster of threshold exceedances and as the reciprocal of the mean time until the process enters the next cluster. For details see @SD2010.
The exdex package provides functions for performing maximum likelihood and Bayesian inferences about $\theta$ under the K-gaps model. We illustrate the code using the newlyn
dataset, which is analysed in @FW2012. For the sake of illustration we use the default setting, $K = 1$, which may not be appropriate for these data. See @SD2010 for discussion of this issue and for methodology to inform the choice of $K$.
The function kgaps_mle
finds the maximum likelihood estimate and its estimated standard error and (optionally) a conf
% level likelihood-based confidence interval for $\theta$.
library(exdex) # Set a threshold at the 90% quantile thresh <- quantile(newlyn, probs = 0.90) # MLE, SE and 95% likelihood-based confidence interval mle <- kgaps_mle(newlyn, thresh, conf = 95) mle$theta_mle mle$theta_se mle$theta_ci
The function threshplot
calculates maximum likelihood estimates of the extremal index $\theta$ based on the K-gaps model for threshold inter-exceedances times of
@SD2010.
library(exdex) # Set a number for the minimum and the maximum quantile level threshplot( newlyn, tmin = 40, tmax = 80)
Plots generated by threshplot
are, by design, very basic. However, the user can easily make changes.
threshplot(newlyn, 50, 60, # Change the confidence interval conf = 50, # Change the line type lty = c(2, 1, 2), # Change the colours of the lines col = c(1, 5, 2), # Change the label names for X and Y axes xlab = "Any name", ylab = "Any name", # Change the type of plot type = "p", # Change the symbol used pch = 19 )
This method is using only one parameter, $\theta$ and is not making full parametric assumptions about distributions. In particular, we use only the approximate relationship $G=F^{b\theta}$ between $G$ and $F$. This relationship is a consequence of Theorem 5.2 in @Coles2001. $F$ is usually unknown, therefore we estimate the extremal index using the empirical c.d.f.. In order to do that we use the sliding block maxima. The reason is that by using the sliding block maxima we produce a sample $Y^s$ that contains more information about $\theta$ than the subsample $Y^d$ produced by the disjoint block maxima. Thus $Y^s$ produces a more efficient estimator of $\theta$. Since we do not make any specific assumptions about the distributions and we do not have to estimate the five other parameters then $\theta$ is estimated directly. A more detailed account of this theory can be found in @Northrop2015.
The function spm_mle
finds the maximum likelihood estimate and its estimated standard error and (optionally) a conf
% level likelihood-based confidence interval for $\theta$.
library(exdex) # Decide on the block size and indicate whether to use sliding blocks or disjoint blocks # MLE, SE and 95% likelihood-based confidence interval spm_mle(newlyn, 20, sliding = FALSE, conf = 95) mle$theta_mle mle$theta_se mle$theta_ci
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.