gev: Generalized Extreme Value Regression Family Function

Description Usage Arguments Details Value Warning Note Author(s) References See Also Examples

View source: R/family.extremes.R

Description

Maximum likelihood estimation of the 3-parameter generalized extreme value (GEV) distribution.

Usage

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
gev(llocation = "identitylink", lscale = "loglink",
    lshape = logofflink(offset = 0.5), percentiles = c(95, 99),
    ilocation = NULL, iscale = NULL, ishape = NULL, imethod = 1,
    gprobs.y = (1:9)/10, gscale.mux = exp((-5:5)/6),
    gshape = (-5:5) / 11 + 0.01,
    iprobs.y = NULL, tolshape0 = 0.001,
    type.fitted = c("percentiles", "mean"),
    zero = c("scale", "shape"))
gevff(llocation = "identitylink", lscale = "loglink",
    lshape = logofflink(offset = 0.5), percentiles = c(95, 99),
    ilocation = NULL, iscale = NULL, ishape = NULL, imethod = 1,
    gprobs.y = (1:9)/10, gscale.mux = exp((-5:5)/6),
    gshape = (-5:5) / 11 + 0.01,
    iprobs.y = NULL, tolshape0 = 0.001,
    type.fitted = c("percentiles", "mean"), zero = c("scale", "shape"))

Arguments

llocation, lscale, lshape

Parameter link functions for mu, sigma and xi respectively. See Links for more choices.

For the shape parameter, the default logofflink link has an offset called A below; and then the linear/additive predictor is log(xi+A) which means that xi > -A. For technical reasons (see Details) it is a good idea for A = 0.5.

percentiles

Numeric vector of percentiles used for the fitted values. Values should be between 0 and 100. This argument is ignored if type.fitted = "mean".

type.fitted

See CommonVGAMffArguments for information. The default is to use the percentiles argument. If "mean" is chosen, then the mean mu + sigma * (gamma(1-xi)-1)/xi is returned as the fitted values, and these are only defined for xi<1.

ilocation, iscale, ishape

Numeric. Initial value for the location parameter, sigma and xi. A NULL means a value is computed internally. The argument ishape is more important than the other two. If a failure to converge occurs, or even to obtain initial values occurs, try assigning ishape some value (positive or negative; the sign can be very important). Also, in general, a larger value of iscale tends to be better than a smaller value.

imethod

Initialization method. Either the value 1 or 2. If both methods fail then try using ishape. See CommonVGAMffArguments for information.

gshape

Numeric vector. The values are used for a grid search for an initial value for xi. See CommonVGAMffArguments for information.

gprobs.y, gscale.mux, iprobs.y

Numeric vectors, used for the initial values. See CommonVGAMffArguments for information.

tolshape0

Passed into dgev when computing the log-likelihood.

zero

A specifying which linear/additive predictors are modelled as intercepts only. The values can be from the set {1,2,3} corresponding respectively to mu, sigma, xi. If zero = NULL then all linear/additive predictors are modelled as a linear combination of the explanatory variables. For many data sets having zero = 3 is a good idea. See CommonVGAMffArguments for information.

Details

The GEV distribution function can be written

G(y) = exp( -[ (y- mu)/ sigma ]_{+}^{- 1/ xi})

where sigma > 0, -Inf < mu < Inf, and 1 + xi*(y-mu)/sigma > 0. Here, x_+ = max(x,0). The mu, sigma, xi are known as the location, scale and shape parameters respectively. The cases xi>0, xi<0, xi = 0 correspond to the Frechet, reverse Weibull, and Gumbel types respectively. It can be noted that the Gumbel (or Type I) distribution accommodates many commonly-used distributions such as the normal, lognormal, logistic, gamma, exponential and Weibull.

For the GEV distribution, the kth moment about the mean exists if xi < 1/k. Provided they exist, the mean and variance are given by mu + sigma { Gamma(1-xi)-1} / xi and sigma^2 { Gamma(1-2 xi) - Gamma^2 (1- xi) } / xi^2 respectively, where Gamma is the gamma function.

Smith (1985) established that when xi > -0.5, the maximum likelihood estimators are completely regular. To have some control over the estimated xi try using lshape = logofflink(offset = 0.5), say, or lshape = extlogitlink(min = -0.5, max = 0.5), say.

Value

An object of class "vglmff" (see vglmff-class). The object is used by modelling functions such as vglm, and vgam.

Warning

Currently, if an estimate of xi is too close to 0 then an error may occur for gev() with multivariate responses. In general, gevff() is more reliable than gev().

Fitting the GEV by maximum likelihood estimation can be numerically fraught. If 1 + xi*(y-mu)/sigma <= 0 then some crude evasive action is taken but the estimation process can still fail. This is particularly the case if vgam with s is used; then smoothing is best done with vglm with regression splines (bs or ns) because vglm implements half-stepsizing whereas vgam doesn't (half-stepsizing helps handle the problem of straying outside the parameter space.)

Note

The VGAM family function gev can handle a multivariate (matrix) response, cf. multiple responses. If so, each row of the matrix is sorted into descending order and NAs are put last. With a vector or one-column matrix response using gevff will give the same result but be faster and it handles the xi = 0 case. The function gev implements Tawn (1988) while gevff implements Prescott and Walden (1980).

Function egev() has been replaced by the new family function gevff(). It now conforms to the usual VGAM philosophy of having M1 linear predictors per (independent) response. This is the usual way multiple responses are handled. Hence vglm(cbind(y1, y2)..., gevff, ...) will have 6 linear predictors and it is possible to constrain the linear predictors so that the answer is similar to gev(). Missing values in the response of gevff() will be deleted; this behaviour is the same as with almost every other VGAM family function.

The shape parameter xi is difficult to estimate accurately unless there is a lot of data. Convergence is slow when xi is near -0.5. Given many explanatory variables, it is often a good idea to make sure zero = 3. The range restrictions of the parameter xi are not enforced; thus it is possible for a violation to occur.

Successful convergence often depends on having a reasonably good initial value for xi. If failure occurs try various values for the argument ishape, and if there are covariates, having zero = 3 is advised.

Author(s)

T. W. Yee

References

Yee, T. W. and Stephenson, A. G. (2007). Vector generalized linear and additive extreme value models. Extremes, 10, 1–19.

Tawn, J. A. (1988). An extreme-value theory model for dependent observations. Journal of Hydrology, 101, 227–250.

Prescott, P. and Walden, A. T. (1980). Maximum likelihood estimation of the parameters of the generalized extreme-value distribution. Biometrika, 67, 723–724.

Smith, R. L. (1985). Maximum likelihood estimation in a class of nonregular cases. Biometrika, 72, 67–90.

See Also

rgev, gumbel, gumbelff, guplot, rlplot.gevff, gpd, weibullR, frechet, extlogitlink, oxtemp, venice, CommonVGAMffArguments.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
## Not run: 
# Multivariate example
fit1 <- vgam(cbind(r1, r2) ~ s(year, df = 3), gev(zero = 2:3),
             data = venice, trace = TRUE)
coef(fit1, matrix = TRUE)
head(fitted(fit1))
par(mfrow = c(1, 2), las = 1)
plot(fit1, se = TRUE, lcol = "blue", scol = "forestgreen",
     main = "Fitted mu(year) function (centered)", cex.main = 0.8)
with(venice, matplot(year, depvar(fit1)[, 1:2], ylab = "Sea level (cm)",
     col = 1:2, main = "Highest 2 annual sea levels", cex.main = 0.8))
with(venice, lines(year, fitted(fit1)[,1], lty = "dashed", col = "blue"))
legend("topleft", lty = "dashed", col = "blue", "Fitted 95 percentile")

# Univariate example
(fit <- vglm(maxtemp ~ 1, gevff, data = oxtemp, trace = TRUE))
head(fitted(fit))
coef(fit, matrix = TRUE)
Coef(fit)
vcov(fit)
vcov(fit, untransform = TRUE)
sqrt(diag(vcov(fit)))  # Approximate standard errors
rlplot(fit)

## End(Not run)

Example output

Loading required package: stats4
Loading required package: splines
VGAM  s.vam  loop  1 :  loglikelihood = -430.71161
VGAM  s.vam  loop  2 :  loglikelihood = -429.23562
VGAM  s.vam  loop  3 :  loglikelihood = -430.43616
VGAM  s.vam  loop  4 :  loglikelihood = -430.43683
VGAM  s.vam  loop  5 :  loglikelihood = -430.43491
VGAM  s.vam  loop  6 :  loglikelihood = -430.43465
VGAM  s.vam  loop  7 :  loglikelihood = -430.43392
VGAM  s.vam  loop  8 :  loglikelihood = -430.43417
VGAM  s.vam  loop  9 :  loglikelihood = -430.43399
VGAM  s.vam  loop  10 :  loglikelihood = -430.43408
VGAM  s.vam  loop  11 :  loglikelihood = -430.43403
VGAM  s.vam  loop  12 :  loglikelihood = -430.43406
VGAM  s.vam  loop  13 :  loglikelihood = -430.43404
                    location loge(scale) logoff(shape, offset = 0.5)
(Intercept)     -830.9021373    2.559678                  -0.7022749
s(year, df = 3)    0.4829147    0.000000                   0.0000000
       95%      99%
1 139.8791 160.5978
2 140.3475 161.0662
3 140.8160 161.5347
4 141.2712 161.9899
5 141.7022 162.4209
6 142.1071 162.8258
VGLM    linear loop  1 :  loglikelihood = -228.90576
VGLM    linear loop  2 :  loglikelihood = -228.89653
VGLM    linear loop  3 :  loglikelihood = -228.89652
VGLM    linear loop  4 :  loglikelihood = -228.89652

Call:
vglm(formula = maxtemp ~ 1, family = gevff, data = oxtemp, trace = TRUE)


Coefficients:
(Intercept):1 (Intercept):2 (Intercept):3 
    83.838467      1.449270     -1.547556 

Degrees of Freedom: 240 Total; 237 Residual
Log-likelihood: -228.8965 
       95%      99%
1 92.35044 94.71293
2 92.35044 94.71293
3 92.35044 94.71293
4 92.35044 94.71293
5 92.35044 94.71293
6 92.35044 94.71293
            location loge(scale) logoff(shape, offset = 0.5)
(Intercept) 83.83847     1.44927                   -1.547556
  location      scale      shape 
83.8384672  4.2600036 -0.2872327 
              (Intercept):1 (Intercept):2 (Intercept):3
(Intercept):1  0.2669002973 -0.0009746616   -0.04928180
(Intercept):2 -0.0009746616  0.0072373297   -0.01358841
(Intercept):3 -0.0492818027 -0.0135884143    0.07543028
             location        scale        shape
location  0.266900297 -0.004152062 -0.010485558
scale    -0.004152062  0.131340385 -0.012316398
shape    -0.010485558 -0.012316398  0.003414724
(Intercept):1 (Intercept):2 (Intercept):3 
    0.5166239     0.0850725     0.2746457 

VGAM documentation built on Jan. 16, 2021, 5:21 p.m.