# kgaps_stats: Sufficient statistics for the K-gaps model

In revdbayes: Ratio-of-Uniforms Sampling for Bayesian Extreme Value Analysis

## Description

Calculates sufficient statistics for the K-gaps model for the extremal index θ.

## Usage

```
kgaps_stats(data, thresh, k = 1, inc_cens = FALSE)
```

## Arguments

• `data` : A numeric vector of raw data. No missing values are allowed.

• `thresh` : A numeric scalar. Extreme value threshold applied to `data`.

• `k` : A numeric scalar. Run parameter K, as defined in Suveges and Davison (2010). Threshold inter-exceedance times that are not larger than `k` units are assigned to the same cluster, resulting in a K-gap equal to zero. Specifically, the K-gap S corresponding to an inter-exceedance time of T is given by S = max(T - K, 0).

• `inc_cens` : A logical scalar indicating whether or not to include contributions from censored inter-exceedance times relating to the first and last observation. See Attalides (2015) for details.
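The definition S = max(T - K, 0) can be illustrated with a minimal base-R sketch. The helper name `k_gaps_sketch` is hypothetical and is not part of revdbayes. It assumes that an exceedance is a value strictly greater than `thresh`, and it returns only the uncensored K-gaps between successive exceedances.

```r
# Hypothetical helper, not part of revdbayes: uncensored K-gaps from raw data.
# Assumption: an exceedance is a value strictly greater than thresh.
k_gaps_sketch <- function(data, thresh, k = 1) {
  exc_times <- which(data > thresh)  # positions of threshold exceedances
  t_gaps <- diff(exc_times)          # inter-exceedance times T
  pmax(t_gaps - k, 0)                # K-gaps S = max(T - K, 0)
}

# Toy series with exceedances of 4 at positions 2, 3, 7 and 9
x <- c(1, 5, 6, 1, 1, 1, 7, 1, 8)
s <- k_gaps_sketch(x, thresh = 4, k = 1)
print(s)
```

In this toy series the inter-exceedance times are 1, 4 and 2, so with `k = 1` the first pair of exceedances falls in the same cluster and has a K-gap of zero.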

## Details

The sample K-gaps are S_0, S_1, ..., S_(N-1), S_N, where S_1, ..., S_(N-1) are uncensored and S_0 and S_N are censored. Under the assumption that the K-gaps are independent, the log-likelihood of the K-gaps model is given by

l(θ; S_0, ..., S_N) = N_0 log(1 - θ) + 2 N_1 log θ - θ q (S_0 + ... + S_N),

where q is the threshold exceedance probability, N_0 is the number of sample K-gaps that are equal to zero and (apart from an adjustment for the contributions of S_0 and S_N) N_1 is the number of positive sample K-gaps. Specifically, N_1 is equal to the number of S_1, ..., S_(N-1) that are positive plus (I_0 + I_N) / 2, where I_0 = 1 if S_0 is greater than zero and similarly for I_N. The differing treatment of uncensored and censored K-gaps reflects differing contributions to the likelihood. For full details see Suveges and Davison (2010) and Attalides (2015).
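As an illustration of how the sufficient statistics enter the log-likelihood above, the following sketch codes the expression directly and maximises it numerically over θ in (0, 1) with base R's `optimize`. The function name `kgaps_loglik_sketch` and the numbers in `ss` are hypothetical, chosen only to make the arithmetic easy to follow. This is not the estimation routine used by the package.

```r
# Hypothetical sketch: K-gaps log-likelihood written in terms of the
# sufficient statistics N0, N1 and sum_qs = q (S_0 + ... + S_N).
kgaps_loglik_sketch <- function(theta, N0, N1, sum_qs) {
  N0 * log(1 - theta) + 2 * N1 * log(theta) - theta * sum_qs
}

# Made-up sufficient statistics, for illustration only
ss <- list(N0 = 10, N1 = 20, sum_qs = 30)

# Numerical maximisation over theta in (0, 1)
fit <- optimize(kgaps_loglik_sketch, interval = c(1e-6, 1 - 1e-6),
                maximum = TRUE,
                N0 = ss$N0, N1 = ss$N1, sum_qs = ss$sum_qs)
theta_hat <- fit$maximum
print(theta_hat)
```

For these made-up values, setting the derivative of the log-likelihood to zero gives 3θ² - 8θ + 4 = 0, whose only root in (0, 1) is θ = 2/3, so the numerical maximiser should lie close to that value.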

## Value

A list containing the sufficient statistics, with components

• `N0` : the number of zero K-gaps

• `N1` : contribution from non-zero K-gaps (see Details)

• `sum_qs` : the sum of the (scaled) K-gaps, i.e. q (S_0 + ... + S_N), where q is estimated by the proportion of threshold exceedances.
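The three components can be sketched in base R for the uncensored case, that is, assuming `inc_cens = FALSE` so that S_0 and S_N make no contribution. The helper `stats_from_gaps_sketch` and its arguments are hypothetical. It takes a vector of K-gaps and an estimate of q as given, and it is an illustration of the definitions rather than the package's own code.

```r
# Hypothetical helper, not part of revdbayes: sufficient statistics from a
# vector of uncensored K-gaps S and an estimate q of the exceedance probability.
# Assumption: censored K-gaps S_0 and S_N are ignored (as with inc_cens = FALSE).
stats_from_gaps_sketch <- function(S, q) {
  list(
    N0 = sum(S == 0),     # number of zero K-gaps
    N1 = sum(S > 0),      # number of positive K-gaps
    sum_qs = q * sum(S)   # sum of the scaled K-gaps
  )
}

# Toy input: K-gaps 0, 3, 1 from 4 exceedances among 9 observations
res <- stats_from_gaps_sketch(S = c(0, 3, 1), q = 4 / 9)
str(res)
```

With the censored contributions included, `N1` would additionally receive (I_0 + I_N) / 2 and `sum_qs` would include S_0 and S_N, as described in Details.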

## References

Suveges, M. and Davison, A. C. (2010) Model misspecification in peaks over threshold analysis, The Annals of Applied Statistics, 4(1), 203-221. http://dx.doi.org/10.1214/09-AOAS292

Attalides, N. (2015) Threshold-based extreme value modelling, PhD thesis, University College London. http://discovery.ucl.ac.uk/1471121/1/Nicolas_Attalides_Thesis.pdf

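## See Also

`kgaps_mle` for maximum likelihood estimation of the extremal index θ using the K-gaps model.

## Examples

```
library(revdbayes)  # provides kgaps_stats() and the newlyn data
# Set the threshold at the 90% sample quantile of newlyn
u <- quantile(newlyn, probs = 0.90)
kgaps_stats(newlyn, u)
```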