eevd: Estimate Parameters of an Extreme Value (Gumbel) Distribution In EnvStats: Package for Environmental Statistics, Including US EPA Guidance

 eevd R Documentation

Estimate Parameters of an Extreme Value (Gumbel) Distribution

Description

Estimate the location and scale parameters of an extreme value distribution, and optionally construct a confidence interval for one of the parameters.

Usage

  eevd(x, method = "mle", pwme.method = "unbiased",
plot.pos.cons = c(a = 0.35, b = 0), ci = FALSE,
ci.parameter = "location", ci.type = "two-sided",
ci.method = "normal.approx", conf.level = 0.95)


Arguments

 x numeric vector of observations. method character string specifying the method of estimation. Possible values are "mle" (maximum likelihood; the default), "mme" (methods of moments), "mmue" (method of moments based on the unbiased estimator of variance), and "pwme" (probability-weighted moments). See the DETAILS section for more information on these estimation methods. pwme.method character string specifying what method to use to compute the probability-weighted moments when method="pwme". The possible values are "ubiased" (method based on the U-statistic; the default), or "plotting.position" (method based on the plotting position formula). See the DETAILS section in this help file and the help file for pwMoment for more information. This argument is ignored if method is not equal to "pwme". plot.pos.cons numeric vector of length 2 specifying the constants used in the formula for the plotting positions when method="pwme" and pwme.method="plotting.position". The default value is plot.pos.cons=c(a=0.35, b=0). If this vector has a names attribute with the value c("a","b") or c("b","a"), then the elements will be matched by name in the formula for computing the plotting positions. Otherwise, the first element is mapped to the name "a" and the second element to the name "b". See the DETAILS section in this help file and the help file for pwMoment for more information. This argument is ignored if method is not equal to "pwme" or if pwme.method="ubiased". ci logical scalar indicating whether to compute a confidence interval for the location or scale parameter. The default value is FALSE. ci.parameter character string indicating the parameter for which the confidence interval is desired. The possible values are "location" (the default) and "scale". This argument is ignored if ci=FALSE. ci.type character string indicating what kind of confidence interval to compute. The possible values are "two-sided" (the default), "lower", and "upper". This argument is ignored if ci=FALSE. ci.method character string indicating what method to use to construct the confidence interval for the location or scale parameter. Currently, the only possible value is "normal.approx" (the default). See the DETAILS section for more information. This argument is ignored if ci=FALSE. conf.level a scalar between 0 and 1 indicating the confidence level of the confidence interval. The default value is conf.level=0.95. This argument is ignored if ci=FALSE.

Details

If x contains any missing (NA), undefined (NaN) or infinite (Inf, -Inf) values, they will be removed prior to performing the estimation.

Let \underline{x} = (x_1, x_2, \ldots, x_n) be a vector of n observations from an extreme value distribution with parameters location=\eta and scale=\theta.

Estimation

Maximum Likelihood Estimation (method="mle")
The maximum likelihood estimators (mle's) of \eta and \theta are the solutions of the simultaneous equations (Forbes et al., 2011):

\hat{\eta}_mle = \hat{\theta}_mle \, log[\frac{1}{n} \sum_{i=1}^{n} exp(\frac{-x_i}{\hat{\theta}_mle})]

\hat{\theta}_mle = \bar{x} - \frac{\sum_{i=1}^{n} x_i exp(\frac{-x_i}{\hat{\theta}_mle})}{\sum_{i=1}^{n} exp(\frac{-x_i}{\hat{\theta}_mle})}

where

\bar{x} = \frac{1}{n} \sum_{i=1}^n x_i

.

Method of Moments Estimation (method="mme")
The method of moments estimators (mme's) of \eta and \theta are given by (Johnson et al., 1995, p.27):

\hat{\eta}_{mme} = \bar{x} - \epsilon \hat{\theta}_{mme}

\hat{\theta}_{mme} = \frac{\sqrt{6}}{\pi} s_m

where \epsilon denotes Euler's constant and s_m denotes the square root of the method of moments estimator of variance:

s_m^2 = \frac{1}{n} \sum_{i=1}^n (x_i - \bar{x})^2

Method of Moments Estimators Based on the Unbiased Estimator of Variance (method="mmue")
These estimators are the same as the method of moments estimators except that the method of moments estimator of variance is replaced with the unbiased estimator of variance:

s^2 = \frac{1}{n-1} \sum_{i=1}^n (x_i - \bar{x})^2

Probability-Weighted Moments Estimation (method="pwme")
Greenwood et al. (1979) show that the relationship between the distribution parameters \eta and \theta and the probability-weighted moments is given by:

\eta = M(1, 0, 0) - \epsilon \theta

\theta = \frac{M(1, 0, 0) - 2M(1, 0, 1)}{log(2)}

where M(i, j, k) denotes the ijk'th probability-weighted moment and \epsilon denotes Euler's constant. The probability-weighted moment estimators (pwme's) of \eta and \theta are computed by simply replacing the M(i,j,k)'s in the above two equations with estimates of the M(i,j,k)'s (and for the estimate of \eta, replacing \theta with its estimated value). See the help file for pwMoment for more information on how to estimate the M(i,j,k)'s. Also, see Landwehr et al. (1979) for an example of this method of estimation using the unbiased (U-statistic type) probability-weighted moment estimators. Hosking et al. (1985) note that this method of estimation using the U-statistic type probability-weighted moments is equivalent to Downton's (1966) linear estimates with linear coefficients.

Confidence Intervals
When ci=TRUE, an approximate (1-\alpha)100% confidence intervals for \eta can be constructed assuming the distribution of the estimator of \eta is approximately normally distributed. A two-sided confidence interval is constructed as:

[\hat{\eta} - t(n-1, 1-\alpha/2) \hat{\sigma}_{\hat{\eta}}, \, \hat{\eta} + t(n-1, 1-\alpha/2) \hat{\sigma}_{\hat{\eta}}]

where t(\nu, p) is the p'th quantile of Student's t-distribution with \nu degrees of freedom, and the quantity

\hat{\sigma}_{\hat{\eta}}

denotes the estimated asymptotic standard deviation of the estimator of \eta.

Similarly, a two-sided confidence interval for \theta is constructed as:

[\hat{\theta} - t(n-1, 1-\alpha/2) \hat{\sigma}_{\hat{\theta}}, \, \hat{\theta} + t(n-1, 1-\alpha/2) \hat{\sigma}_{\hat{\theta}}]

One-sided confidence intervals for \eta and \theta are computed in a similar fashion.

Maximum Likelihood (method="mle")
Downton (1966) shows that the estimated asymptotic variances of the mle's of \eta and \theta are given by:

\hat{\sigma}_{\hat{\eta}_mle}^2 = \frac{\hat{\theta}_mle^2}{n} [1 + \frac{6(1 - \epsilon)^2}{\pi^2}] = \frac{1.10867 \hat{\theta}_mle^2}{n}

\hat{\sigma}_{\hat{\theta}_mle}^2 = \frac{6}{\pi^2} \frac{\hat{\theta}_mle^2}{n} = \frac{0.60793 \hat{\theta}_mle^2}{n}

where \epsilon denotes Euler's constant.

Method of Moments (method="mme" or method="mmue")
Tiago de Oliveira (1963) and Johnson et al. (1995, p.27) show that the estimated asymptotic variance of the mme's of \eta and \theta are given by:

\hat{\sigma}_{\hat{\eta}_mme}^2 = \frac{\hat{\theta}_mme^2}{n} [\frac{\pi^2}{6} + \frac{\epsilon^2}{4}(\beta_2 - 1) - \frac{\pi \epsilon \sqrt{\beta_1}}{\sqrt{6}}] = \frac{1.1678 \hat{\theta}_mme^2}{n}

\hat{\sigma}_{\hat{\theta}_mme}^2 = \frac{\hat{\theta}_mle^2}{n} \frac{(\beta_2 - 1)}{4} = \frac{1.1 \hat{\theta}_mme^2}{n}

where the quantities

\sqrt{\beta_1}, \; \beta_2

denote the skew and kurtosis of the distribution, and \epsilon denotes Euler's constant.

The estimated asymptotic variances of the mmue's of \eta and \theta are the same, except replace the mme of \theta in the above equations with the mmue of \theta.

Probability-Weighted Moments (method="pwme")
As stated above, Hosking et al. (1985) note that this method of estimation using the U-statistic type probability-weighted moments is equivalent to Downton's (1966) linear estimates with linear coefficients. Downton (1966) provides exact values of the variances of the estimates of location and scale parameters for the smallest extreme value distribution. For the largest extreme value distribution, the formula for the estimate of scale is the same, but the formula for the estimate of location must be modified. Thus, Downton's (1966) equation (3.4) is modified to:

\hat{\eta}_pwme = \frac{(n-1)log(2) + (n+1)\epsilon}{n(n-1)log(2)} v - \frac{2 \epsilon}{n(n-1)log(2)} w

where \epsilon denotes Euler's constant, and v and w are defined in Downton (1966, p.8). Using Downton's (1966) equations (3.9)-(3.12), the exact variance of the pwme of \eta can be derived. Note that when method="pwme" and pwme.method="plotting.position", these are only the asymptotically correct variances.

Value

a list of class "estimate" containing the estimated parameters and other information. See
estimate.object for details.

Note

There are three families of extreme value distributions. The one described here is the Type I, also called the Gumbel extreme value distribution or simply Gumbel distribution. The name “extreme value” comes from the fact that this distribution is the limiting distribution (as n approaches infinity) of the greatest value among n independent random variables each having the same continuous distribution.

The Gumbel extreme value distribution is related to the exponential distribution as follows. Let Y be an exponential random variable with parameter rate=\lambda. Then X = \eta - log(Y) has an extreme value distribution with parameters location=\eta and scale=1/\lambda.

The distribution described above and assumed by eevd is the largest extreme value distribution. The smallest extreme value distribution is the limiting distribution (as n approaches infinity) of the smallest value among n independent random variables each having the same continuous distribution. If X has a largest extreme value distribution with parameters location=\eta and scale=\theta, then Y = -X has a smallest extreme value distribution with parameters location=-\eta and scale=\theta. The smallest extreme value distribution is related to the Weibull distribution as follows. Let Y be a Weibull random variable with parameters shape=\beta and scale=\alpha. Then X = log(Y) has a smallest extreme value distribution with parameters location=log(\alpha) and scale=1/\beta.

The extreme value distribution has been used extensively to model the distribution of streamflow, flooding, rainfall, temperature, wind speed, and other meteorological variables, as well as material strength and life data.

Author(s)

Steven P. Millard (EnvStats@ProbStatInfo.com)

References

Castillo, E. (1988). Extreme Value Theory in Engineering. Academic Press, New York, pp.184–198.

Downton, F. (1966). Linear Estimates of Parameters in the Extreme Value Distribution. Technometrics 8(1), 3–17.

Forbes, C., M. Evans, N. Hastings, and B. Peacock. (2011). Statistical Distributions. Fourth Edition. John Wiley and Sons, Hoboken, NJ.

Greenwood, J.A., J.M. Landwehr, N.C. Matalas, and J.R. Wallis. (1979). Probability Weighted Moments: Definition and Relation to Parameters of Several Distributions Expressible in Inverse Form. Water Resources Research 15(5), 1049–1054.

Hosking, J.R.M., J.R. Wallis, and E.F. Wood. (1985). Estimation of the Generalized Extreme-Value Distribution by the Method of Probability-Weighted Moments. Technometrics 27(3), 251–261.

Johnson, N. L., S. Kotz, and N. Balakrishnan. (1995). Continuous Univariate Distributions, Volume 2. Second Edition. John Wiley and Sons, New York.

Landwehr, J.M., N.C. Matalas, and J.R. Wallis. (1979). Probability Weighted Moments Compared With Some Traditional Techniques in Estimating Gumbel Parameters and Quantiles. Water Resources Research 15(5), 1055–1064.

Tiago de Oliveira, J. (1963). Decision Results for the Parameters of the Extreme Value (Gumbel) Distribution Based on the Mean and Standard Deviation. Trabajos de Estadistica 14, 61–81.

Extreme Value Distribution, Euler's Constant.

Examples

  # Generate 20 observations from an extreme value distribution with
# parameters location=2 and scale=1, then estimate the parameters
# and construct a 90% confidence interval for the location parameter.
# (Note: the call to set.seed simply allows you to reproduce this example.)

set.seed(250)
dat <- revd(20, location = 2)
eevd(dat, ci = TRUE, conf.level = 0.9)

#Results of Distribution Parameter Estimation
#--------------------------------------------
#
#Assumed Distribution:            Extreme Value
#
#Estimated Parameter(s):          location = 1.9684093
#                                 scale    = 0.7481955
#
#Estimation Method:               mle
#
#Data:                            dat
#
#Sample Size:                     20
#
#Confidence Interval for:         location
#
#Confidence Interval Method:      Normal Approximation
#                                 (t Distribution)
#
#Confidence Interval Type:        two-sided
#
#Confidence Level:                90%
#
#Confidence Interval:             LCL = 1.663809
#                                 UCL = 2.273009

#----------

#Compare the values of the different types of estimators:

eevd(dat, method = "mle")$parameters # location scale #1.9684093 0.7481955 eevd(dat, method = "mme")$parameters
# location     scale
#1.9575980 0.8339256

eevd(dat, method = "mmue")$parameters # location scale #1.9450932 0.8555896 eevd(dat, method = "pwme")$parameters
# location     scale
#1.9434922 0.8583633

#----------

# Clean up
#---------
rm(dat)


EnvStats documentation built on Aug. 22, 2023, 5:09 p.m.