View source: R/family.extremes.R

gpd | R Documentation |

Maximum likelihood estimation of the 2-parameter generalized Pareto distribution (GPD).

```
gpd(threshold = 0, lscale = "loglink", lshape = logofflink(offset = 0.5),
percentiles = c(90, 95), iscale = NULL, ishape = NULL,
tolshape0 = 0.001, type.fitted = c("percentiles", "mean"),
imethod = 1, zero = "shape")
```

`threshold` |
Numeric, values are recycled if necessary.
The threshold value(s), called |

`lscale` |
Parameter link function for the scale parameter |

`lshape` |
Parameter link function for the shape parameter For the shape parameter,
the default |

`percentiles` |
Numeric vector of percentiles used
for the fitted values. Values should be between 0 and 100.
See the example below for illustration.
This argument is ignored if |

`type.fitted` |
See |

`iscale, ishape` |
Numeric. Optional initial values for |

`tolshape0` |
Passed into |

`imethod` |
Method of initialization, either 1 or 2. The first is the method of
moments, and the second is a variant of this. If neither work, try
assigning values to arguments |

`zero` |
Can be an integer-valued vector specifying which
linear/additive predictors are modelled as intercepts only.
For one response, the value should be from the set {1,2}
corresponding respectively to |

The distribution function of the GPD can be written

`G(y) = 1 - [1 + \xi (y-\mu) / \sigma ]_{+}^{- 1/ \xi} `

where
`\mu`

is the location parameter
(known, with value `threshold`

),
`\sigma > 0`

is the scale parameter,
`\xi`

is the shape parameter, and
`h_+ = \max(h,0)`

.
The function `1-G`

is known as the *survivor function*.
The limit `\xi \rightarrow 0`

gives the *shifted exponential* as a special case:

`G(y) = 1 - \exp[-(y-\mu)/ \sigma]. `

The support is `y>\mu`

for `\xi>0`

,
and
`\mu < y <\mu-\sigma / \xi`

for `\xi<0`

.

Smith (1985) showed that if `\xi <= -0.5`

then
this is known as the nonregular case and problems/difficulties
can arise both theoretically and numerically. For the (regular)
case `\xi > -0.5`

the classical asymptotic
theory of maximum likelihood estimators is applicable; this is
the default.

Although for `\xi < -0.5`

the usual asymptotic properties
do not apply, the maximum likelihood estimator generally exists and
is superefficient for `-1 < \xi < -0.5`

, so it is
“better” than normal.
When `\xi < -1`

the maximum
likelihood estimator generally does not exist as it effectively becomes
a two parameter problem.

The mean of `Y`

does not exist unless `\xi < 1`

, and
the variance does not exist unless `\xi < 0.5`

. So if
you want to fit a model with finite variance use `lshape = "extlogitlink"`

.

An object of class `"vglmff"`

(see `vglmff-class`

).
The object is used by modelling functions such as `vglm`

and `vgam`

.
However, for this VGAM family function, `vglm`

is probably preferred over `vgam`

when there is smoothing.

Fitting the GPD by maximum likelihood estimation can be numerically
fraught. If `1 + \xi (y-\mu)/ \sigma \leq 0`

then some crude evasive action is taken but the estimation process
can still fail. This is particularly the case if `vgam`

with `s`

is used. Then smoothing is best done with
`vglm`

with regression splines (`bs`

or `ns`

) because `vglm`

implements
half-stepsizing whereas `vgam`

doesn't. Half-stepsizing
helps handle the problem of straying outside the parameter space.

The response in the formula of `vglm`

and `vgam`

is `y`

.
Internally, `y-\mu`

is computed.
This VGAM family function can handle a multiple
responses, which is inputted as a matrix.
The response stored on the object is the original uncentred data.

With functions `rgpd`

, `dgpd`

, etc., the
argument `location`

matches with the argument `threshold`

here.

T. W. Yee

Yee, T. W. and Stephenson, A. G. (2007).
Vector generalized linear and additive extreme value models.
*Extremes*, **10**, 1–19.

Coles, S. (2001).
*An Introduction to Statistical Modeling of Extreme Values*.
London: Springer-Verlag.

Smith, R. L. (1985).
Maximum likelihood estimation in a class of nonregular cases.
*Biometrika*, **72**, 67–90.

`rgpd`

,
`meplot`

,
`gev`

,
`paretoff`

,
`vglm`

,
`vgam`

,
`s`

.

```
# Simulated data from an exponential distribution (xi = 0)
Threshold <- 0.5
gdata <- data.frame(y1 = Threshold + rexp(n = 3000, rate = 2))
fit <- vglm(y1 ~ 1, gpd(threshold = Threshold), data = gdata, trace = TRUE)
head(fitted(fit))
summary(depvar(fit)) # The original uncentred data
coef(fit, matrix = TRUE) # xi should be close to 0
Coef(fit)
summary(fit)
head(fit@extra$threshold) # Note the threshold is stored here
# Check the 90 percentile
ii <- depvar(fit) < fitted(fit)[1, "90%"]
100 * table(ii) / sum(table(ii)) # Should be 90%
# Check the 95 percentile
ii <- depvar(fit) < fitted(fit)[1, "95%"]
100 * table(ii) / sum(table(ii)) # Should be 95%
## Not run: plot(depvar(fit), col = "blue", las = 1,
main = "Fitted 90% and 95% quantiles")
matlines(1:length(depvar(fit)), fitted(fit), lty = 2:3, lwd = 2)
## End(Not run)
# Another example
gdata <- data.frame(x2 = runif(nn <- 2000))
Threshold <- 0; xi <- exp(-0.8) - 0.5
gdata <- transform(gdata, y2 = rgpd(nn, scale = exp(1 + 0.1*x2), shape = xi))
fit <- vglm(y2 ~ x2, gpd(Threshold), data = gdata, trace = TRUE)
coef(fit, matrix = TRUE)
## Not run: # Nonparametric fits
# Not so recommended:
fit1 <- vgam(y2 ~ s(x2), gpd(Threshold), data = gdata, trace = TRUE)
par(mfrow = c(2, 1))
plot(fit1, se = TRUE, scol = "blue")
# More recommended:
fit2 <- vglm(y2 ~ sm.bs(x2), gpd(Threshold), data = gdata, trace = TRUE)
plot(as(fit2, "vgam"), se = TRUE, scol = "blue")
## End(Not run)
```

Embedding an R snippet on your website

Add the following code to your website.

For more information on customizing the embed code, read Embedding Snippets.