View source: R/family.extremes.R

gpd | R Documentation |

Maximum likelihood estimation of the 2-parameter generalized Pareto distribution (GPD).

gpd(threshold = 0, lscale = "loglink", lshape = logofflink(offset = 0.5), percentiles = c(90, 95), iscale = NULL, ishape = NULL, tolshape0 = 0.001, type.fitted = c("percentiles", "mean"), imethod = 1, zero = "shape")

`threshold` |
Numeric, values are recycled if necessary.
The threshold value(s), called |

`lscale` |
Parameter link function for the scale parameter |

`lshape` |
Parameter link function for the shape parameter For the shape parameter,
the default |

`percentiles` |
Numeric vector of percentiles used
for the fitted values. Values should be between 0 and 100.
See the example below for illustration.
This argument is ignored if |

`type.fitted` |
See |

`iscale, ishape` |
Numeric. Optional initial values for |

`tolshape0` |
Passed into |

`imethod` |
Method of initialization, either 1 or 2. The first is the method of
moments, and the second is a variant of this. If neither work, try
assigning values to arguments |

`zero` |
Can be an integer-valued vector specifying which
linear/additive predictors are modelled as intercepts only.
For one response, the value should be from the set {1,2} corresponding
respectively to |

The distribution function of the GPD can be written

*
G(y) = 1 - [1 + xi (y-mu)/ sigma ]_{+}^{- 1/ xi} *

where
*mu* is the location parameter
(known, with value `threshold`

),
*sigma > 0* is the scale parameter,
*xi* is the shape parameter, and
*h_+ = max(h,0)*.
The function *1-G* is known as the *survivor function*.
The limit *xi --> 0*
gives the *shifted exponential* as a special case:

*
G(y) = 1 - exp[-(y-mu)/ sigma]. *

The support is *y>mu* for *xi>0*,
and
*mu < y <mu-sigma / xi* for *xi<0*.

Smith (1985) showed that if *xi <= -0.5* then
this is known as the nonregular case and problems/difficulties
can arise both theoretically and numerically. For the (regular)
case *xi > -0.5* the classical asymptotic
theory of maximum likelihood estimators is applicable; this is
the default.

Although for *xi < -0.5* the usual asymptotic properties
do not apply, the maximum likelihood estimator generally exists and
is superefficient for *-1 < xi < -0.5*, so it is
“better” than normal.
When *xi < -1* the maximum
likelihood estimator generally does not exist as it effectively becomes
a two parameter problem.

The mean of *Y* does not exist unless *xi < 1*, and
the variance does not exist unless *xi < 0.5*. So if
you want to fit a model with finite variance use `lshape = "extlogitlink"`

.

An object of class `"vglmff"`

(see `vglmff-class`

).
The object is used by modelling functions such as `vglm`

and `vgam`

.
However, for this VGAM family function, `vglm`

is probably preferred over `vgam`

when there is smoothing.

Fitting the GPD by maximum likelihood estimation can be numerically
fraught. If *1 + xi*(y-mu)/sigma <=
0* then some crude evasive action is taken but the estimation process
can still fail. This is particularly the case if `vgam`

with `s`

is used. Then smoothing is best done with
`vglm`

with regression splines (`bs`

or `ns`

) because `vglm`

implements
half-stepsizing whereas `vgam`

doesn't. Half-stepsizing
helps handle the problem of straying outside the parameter space.

The response in the formula of `vglm`

and `vgam`

is *y*.
Internally, *y-mu* is computed.
This VGAM family function can handle a multiple
responses, which is inputted as a matrix.
The response stored on the object is the original uncentred data.

With functions `rgpd`

, `dgpd`

, etc., the
argument `location`

matches with the argument `threshold`

here.

T. W. Yee

Yee, T. W. and Stephenson, A. G. (2007).
Vector generalized linear and additive extreme value models.
*Extremes*, **10**, 1–19.

Coles, S. (2001).
*An Introduction to Statistical Modeling of Extreme Values*.
London: Springer-Verlag.

Smith, R. L. (1985).
Maximum likelihood estimation in a class of nonregular cases.
*Biometrika*, **72**, 67–90.

`rgpd`

,
`meplot`

,
`gev`

,
`paretoff`

,
`vglm`

,
`vgam`

,
`s`

.

# Simulated data from an exponential distribution (xi = 0) Threshold <- 0.5 gdata <- data.frame(y1 = Threshold + rexp(n = 3000, rate = 2)) fit <- vglm(y1 ~ 1, gpd(threshold = Threshold), data = gdata, trace = TRUE) head(fitted(fit)) summary(depvar(fit)) # The original uncentred data coef(fit, matrix = TRUE) # xi should be close to 0 Coef(fit) summary(fit) head(fit@extra$threshold) # Note the threshold is stored here # Check the 90 percentile ii <- depvar(fit) < fitted(fit)[1, "90%"] 100 * table(ii) / sum(table(ii)) # Should be 90% # Check the 95 percentile ii <- depvar(fit) < fitted(fit)[1, "95%"] 100 * table(ii) / sum(table(ii)) # Should be 95% ## Not run: plot(depvar(fit), col = "blue", las = 1, main = "Fitted 90% and 95% quantiles") matlines(1:length(depvar(fit)), fitted(fit), lty = 2:3, lwd = 2) ## End(Not run) # Another example gdata <- data.frame(x2 = runif(nn <- 2000)) Threshold <- 0; xi <- exp(-0.8) - 0.5 gdata <- transform(gdata, y2 = rgpd(nn, scale = exp(1 + 0.1*x2), shape = xi)) fit <- vglm(y2 ~ x2, gpd(Threshold), data = gdata, trace = TRUE) coef(fit, matrix = TRUE) ## Not run: # Nonparametric fits # Not so recommended: fit1 <- vgam(y2 ~ s(x2), gpd(Threshold), data = gdata, trace = TRUE) par(mfrow = c(2, 1)) plot(fit1, se = TRUE, scol = "blue") # More recommended: fit2 <- vglm(y2 ~ sm.bs(x2), gpd(Threshold), data = gdata, trace = TRUE) plot(as(fit2, "vgam"), se = TRUE, scol = "blue") ## End(Not run)

Embedding an R snippet on your website

Add the following code to your website.

For more information on customizing the embed code, read Embedding Snippets.