hillplot: Hill Plot

View source: R/hillplot.r

Description

Plots the Hill plot and some of its variants.

Usage

hillplot(data, orderlim = NULL, tlim = NULL, hill.type = "Hill",
  r = 2, x.theta = FALSE, y.alpha = FALSE, alpha = 0.05,
  ylim = NULL, legend.loc = "topright",
  try.thresh = quantile(data[data > 0], 0.9, na.rm = TRUE),
  main = paste(ifelse(x.theta, "Alt", ""), hill.type, " Plot", sep = ""),
  xlab = ifelse(x.theta, "theta", "order"),
  ylab = paste(ifelse(x.theta, "Alt", ""), hill.type, ifelse(y.alpha,
  " alpha", " xi"), ">0", sep = ""), ...)

Arguments

data

vector of sample data

orderlim

vector of (lower, upper) limits of order statistics to plot estimator, or NULL to use default values

tlim

vector of (lower, upper) limits of range of threshold to plot estimator, or NULL to use default values

hill.type

"Hill" or "SmooHill"

r

smoothing factor for "SmooHill" (integer > 1)

x.theta

logical, should order (FALSE) or theta (TRUE) be given on x-axis

y.alpha

logical, should shape xi (FALSE) or tail index alpha (TRUE) be given on y-axis

alpha

significance level over range (0, 1), or NULL for no CI

ylim

y-axis limits or NULL

legend.loc

location of legend (see legend) or NULL for no legend

try.thresh

vector of thresholds to consider

main

title of plot

xlab

x-axis label

ylab

y-axis label

...

further arguments to be passed to the plotting functions

Details

Produces the Hill, AltHill, SmooHill and AltSmooHill plots, including confidence intervals.

For an ordered iid sequence X_{(1)} ≥ X_{(2)} ≥ \cdots ≥ X_{(n)} > 0 the Hill (1975) estimator using k order statistics is given by

H_{k,n}=\frac{1}{k}\sum_{i=1}^{k} \log\left(\frac{X_{(i)}}{X_{(k+1)}}\right)

which is the pseudo-likelihood estimator of the reciprocal of the tail index, ξ=1/α>0, for regularly varying tails (e.g. Pareto distribution). The Hill estimator is defined on orders k>2, as when k=1 the estimator is H_{1,n}=0. The function will calculate the Hill estimator for k≥1. The simple Hill plot is shown for hill.type="Hill".
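As a rough illustration of the definition, the following base R sketch evaluates the displayed formula literally for k = 1, ..., n-1. The helper name hill_est is hypothetical and is not part of evmix, and the indexing used inside hillplot itself may differ from this literal reading.

```r
# Hypothetical helper (not part of evmix): Hill estimator H_{k,n} for
# k = 1, ..., n-1, taken directly from the displayed formula.
hill_est <- function(x) {
  x <- x[is.finite(x) & x > 0]           # drop NA/NaN/Inf and non-positive values
  lx <- log(sort(x, decreasing = TRUE))  # logs of X_(1) >= ... >= X_(n)
  n <- length(lx)
  k <- seq_len(n - 1)
  # mean of log X_(i) over i = 1..k, minus log X_(k+1)
  H <- cumsum(lx)[k] / k - lx[k + 1]
  data.frame(k = k, H = H)
}

set.seed(1)
x <- runif(2000)^(-1 / 2)  # exact Pareto sample with alpha = 2, i.e. xi = 0.5
est <- hill_est(x)
est$H[200]                 # estimate of xi using k = 200 order statistics
```

For an exactly Pareto sample like this one, the estimate at moderate k should sit near the true shape of 0.5, up to sampling variability.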

Once a sufficiently low order statistic is reached the Hill estimator will be constant, up to sample uncertainty, for regularly varying tails. The Hill plot is a plot of H_{k,n} against k. Symmetric asymptotic normal confidence intervals assuming Pareto tails are provided.

These so-called Hill's horror plots can be difficult to interpret. A smooth form of the Hill estimator was suggested by Resnick and Starica (1997):

smooH_{k,n}=\frac{1}{(r-1)k}\sum_{j=k+1}^{rk} H_{j,n}

giving the smooHill plot which is shown for hill.type="SmooHill". The smoothing factor is r=2 by default.
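The smoothing formula can be sketched in base R as an average of the Hill estimates H_{k+1,n}, ..., H_{rk,n}. The helper name smoo_hill is hypothetical (not an evmix function), and it assumes the input vector H holds H_{1,n}, H_{2,n}, ... in order.

```r
# Hypothetical helper (not part of evmix): smoothed Hill estimates from a
# vector H of Hill estimates indexed by k = 1, 2, ...
smoo_hill <- function(H, r = 2) {
  stopifnot(r > 1, r == round(r))  # smoothing factor must be an integer > 1
  kmax <- floor(length(H) / r)     # largest k for which H_{rk} is available
  k <- seq_len(kmax)
  cs <- cumsum(H)
  # sum_{j=k+1}^{rk} H_j = cs[rk] - cs[k]; the sum has (r - 1) * k terms
  data.frame(k = k, smooH = (cs[r * k] - cs[k]) / ((r - 1) * k))
}

smoo_hill(c(1, 2, 3, 4, 5, 6), r = 2)$smooH  # averages H_2; H_3..H_4; H_4..H_6
```

Using cumulative sums avoids an explicit loop over k; a larger r averages over more orders and gives a smoother curve over a shorter range of k.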

It has also been suggested to plot the order on a log scale, by plotting the points (θ, H_{\lceil n^θ\rceil, n}) for 0≤ θ ≤ 1. This gives the so-called AltHill and AltSmooHill plots. The alternative x-axis scale is chosen by x.theta=TRUE.
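The mapping between θ and the order k can be sketched as follows. The sample size n = 1000 and the helper theta_of_k are illustrative assumptions only; the inverse map log(k)/log(n) is the same transformation applied to the x-axis in the Examples section.

```r
n <- 1000                            # illustrative sample size (assumption)
theta <- seq(0, 1, length.out = 11)  # grid of theta values on [0, 1]
k <- ceiling(n^theta)                # order plotted at each theta: ceil(n^theta)

# Hypothetical helper: position on the theta scale of a given order k
theta_of_k <- function(k, n) log(k) / log(n)

cbind(theta = theta, k = k)
theta_of_k(32, n)                    # order 32 sits near the middle of the axis
```

On this scale the small orders, where the estimator changes fastest, occupy much more of the x-axis than they do on the plain order scale.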

The Hill estimator is for the GPD shape ξ>0, or the reciprocal of the tail index α=1/ξ>0. The shape is plotted by default using y.alpha=FALSE and the tail index is plotted when y.alpha=TRUE.

A pre-chosen threshold (or more than one) can be given in try.thresh. The estimated parameter (ξ or α) at each threshold is plotted as a horizontal solid line for all higher thresholds. The threshold should be set as low as possible, so a dashed line is shown below the pre-chosen threshold. If the Hill estimator is similar to the dashed line then a lower threshold may be chosen.

If no order statistic (or threshold) limits are provided (orderlim = tlim = NULL) then the lowest order statistic is set to X_{(3)} and the highest possible value to X_{(n-1)}. However, the Hill estimator is always output for all k=1, …, n-1, and for k=1, …, floor(n/r) for the smooHill estimator.

The missing (NA and NaN) and non-finite values are ignored. Non-positive data are ignored.

The lower x-axis is the order k or θ, chosen by the option x.theta=FALSE and x.theta=TRUE respectively. The upper axis is for the corresponding threshold.

Value

hillplot gives the Hill plot. It also returns a data frame containing columns of the order statistics, order, Hill estimator, its standard deviation and 100(1 - α)% confidence interval (when requested). When the SmooHill plot is selected, the corresponding SmooHill estimates are appended.

Acknowledgments

Thanks to Younes Mouatasim, Risk Dynamics, Brussels for reporting various bugs in these functions.

Note

Warning: Hill plots are not location invariant.
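The warning can be checked numerically with a small base R sketch: rescaling the data leaves the Hill estimate unchanged, whereas adding a constant changes it. The helper hill_k is hypothetical (not part of evmix) and follows the formula given in Details for a single order k.

```r
# Hypothetical helper (not part of evmix): Hill estimate at a single order k
hill_k <- function(x, k) {
  lx <- log(sort(x[is.finite(x) & x > 0], decreasing = TRUE))
  mean(lx[seq_len(k)]) - lx[k + 1]
}

set.seed(42)
x <- runif(1000)^(-1 / 2)          # exact Pareto sample, xi = 0.5

h_orig    <- hill_k(x, 100)
h_scaled  <- hill_k(5 * x, 100)    # scale change: log ratios are unaffected
h_shifted <- hill_k(x + 10, 100)   # location shift: log ratios shrink

c(original = h_orig, scaled = h_scaled, shifted = h_shifted)
```

A positive shift pulls every ratio X_{(i)}/X_{(k+1)} towards 1, so the shifted estimate is smaller than the original; this is why the data should be on a sensible origin before a Hill plot is read.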

Asymptotic Wald type CIs are estimated for non-NULL significance level alpha for the shape parameter, assuming exactly Pareto tails. When plotting on the tail index scale, a simple reciprocal transform of the CI is applied, which may be sub-optimal.

Error checking of the inputs (e.g. invalid probabilities) is carried out and will either stop or give a warning message as appropriate.
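A minimal sketch of such an interval, assuming the standard asymptotic result for exactly Pareto tails that H_{k,n} is approximately normal with standard deviation ξ/sqrt(k). The helper hill_ci and the input values are hypothetical, and the exact computation inside hillplot has not been checked against this.

```r
# Hypothetical helper (not part of evmix): Wald-type CI for the shape xi,
# assuming sd(H_{k,n}) is approximately xi / sqrt(k) with xi replaced by H
hill_ci <- function(H, k, alpha = 0.05) {
  z <- qnorm(1 - alpha / 2)
  se <- H / sqrt(k)
  data.frame(H = H, se = se, lower = H - z * se, upper = H + z * se)
}

ci_xi <- hill_ci(H = 0.7, k = 100)  # illustrative values only

# Reciprocal transform to the tail index scale: the bounds swap roles.
# This is only meaningful when the lower bound for xi is positive.
alpha_lower <- 1 / ci_xi$upper
alpha_upper <- 1 / ci_xi$lower
c(alpha_lower, 1 / ci_xi$H, alpha_upper)
```

The transformed interval is no longer symmetric about 1/H, which is one way the simple reciprocal transform can be sub-optimal.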

Author(s)

Carl Scarrott carl.scarrott@canterbury.ac.nz

References

Hill, B.M. (1975). A simple general approach to inference about the tail of a distribution. Annals of Statistics 3, 1163-1174.

Resnick, S. and Starica, C. (1997). Smoothing the Hill estimator. Advances in Applied Probability 29, 271-293.

Resnick, S. (1997). Discussion of the Danish Data of Large Fire Insurance Losses. Astin Bulletin 27, 139-151.

See Also

hill

Examples

## Not run: 
# Reproduce graphs from Figure 2.4 of Resnick (1997)
data(danish, package="evir")
par(mfrow = c(2, 2))

# Hill plot
hillplot(danish, y.alpha=TRUE, ylim=c(1.1, 2))

# AltHill plot
hillplot(danish, y.alpha=TRUE, x.theta=TRUE, ylim=c(1.1, 2))

# AltSmooHill plot
hillplot(danish, hill.type="SmooHill", r=3, y.alpha=TRUE, x.theta=TRUE, ylim=c(1.35, 1.85))

# AltHill and AltSmooHill plot (no CI's or legend)
hillout = hillplot(danish, hill.type="SmooHill", r=3, y.alpha=TRUE, 
 x.theta=TRUE, try.thresh = c(), alpha=NULL, ylim=c(1.1, 2), legend.loc=NULL, lty=2)
n = length(danish)
with(hillout[3:n,], lines(log(ks)/log(n), 1/H, type="s"))

## End(Not run)

evmix documentation built on Sept. 3, 2019, 5:07 p.m.