expect.max.ostat: Compute the Expectation of a Maximum (or Minimum and others)...
In lmomco: L-Moments, Censored L-Moments, Trimmed L-Moments, L-Comoments, and Many Distributions

expect.max.ostat

R Documentation

Compute the Expectation of a Maximum (or Minimum and others) Order Statistic

Description

The maximum (or minimum) expectation of an order statistic can be directly used for L-moment computation through either of the following two equations (Hosking, 2006) as dictated by using the maximum (\mathrm{E}[X_{k:k}], expect.max.ostat) or minimum (\mathrm{E}[X_{1:k}], expect.min.ostat):

\lambda_r = (-1)^{r-1} \sum_{k=1}^r (-1)^{r-k}k^{-1}{r-1 \choose k-1}{r+k-2 \choose k-1}\mathrm{E}[X_{1:k}]\mbox{,}

and

\lambda_r = \sum_{k=1}^r (-1)^{r-k}k^{-1}{r-1 \choose k-1}{r+k-2 \choose k-1}\mathrm{E}[X_{k:k}]\mbox{.}

In terms of the quantile function qlmomco, the expectation of an order statistic (Asquith, 2011, p. 49) is

\mathrm{E}[X_{j:n}] = n {n-1 \choose j - 1}\int^1_0 \! x(F)\times F^{j-1} \times (1-F)^{n-j}\; \mathrm{d}F\mbox{,}

where x(F) is the quantile function, F is nonexceedance probability, n is sample size, and j is the jth order statistic.

In terms of the probability density function (PDF) dlmomco and cumulative density function (CDF) plmomco, the expectation of an order statistic (Asquith, 2011, p. 50) is

\mathrm{E}[X_{j:n}] = \frac{1}{\mathrm{B}(j,n-j+1)}\int_{-\infty}^{\infty} [F(x)]^{j-1}[1-F(x)]^{n-j} x\, f(x)\;\mathrm{d} x\mbox{,}

where F(x) is the CDF, f(x) is the PDF, and \mathrm{B}(j, n-j+1) is the complete Beta function, which in R is beta with the same argument order as shown above.

Usage

expect.max.ostat(n, para=NULL, cdf=NULL, pdf=NULL, qua=NULL,
                 j=NULL, lower=-Inf, upper=Inf, aslist=FALSE, ...)

Arguments

`n`	The sample size.
`para`	A distribution parameter list from a function such as `vec2par` or `lmom2par`.
`cdf`	cumulative distribution function of the distribution.
`pdf`	probability density function of the distribution.
`qua`	quantile function of the distribution. If this is defined, then `cdf` and `pdf` are ignored.
`j`	The `j`th value of the order statistic, which defaults to `n=j` (the maximum order statistic) if `j=NULL`.
`lower`	The lower limit for integration.
`upper`	The upper limit for integration.
`aslist`	A logically triggering whether an R `list` is returned instead of just the expection.
`...`	Additional arguments to pass to the three distribution functions.

Details

If qua != NULL, then the first order-statistic expectation equation above is used, and any function that might have been set in cdf and pdf is ignored. If the limits are infinite (default), then the limits of the integration will be set to F\!\downarrow = 0 and F\!\uparrow = 1. The user can replace these by setting the limits to something “near” zero and(or) “near” 1. Please consult the Note below concerning more information about the limits of integration.

If qua == NULL, then the second order-statistic expectation equation above is used and cdf and pdf must be set. The default \pm\infty limits are used unless the user knows otherwise for the distribution or through supervision provides their meaning of small and large.

This function requires the user to provide either the qua or the cdf and pdf functions, which is somewhat divergent from the typical flow of logic of lmomco. This has been done so that expect.max.ostat can be used readily for experimental distribution functions. It is suggested that the parameter object be left in the lmomco style (see vec2par) even if the user is providing their own distribution functions.

Last comments: This function is built around the idea that either (1) the cdf and pdf ensemble or (2) qua exist in some clean analytical form and therefore the qua=NULL is the trigger on which order statistic expectation integral is used. This precludes an attempt to compute the support of the distribution internally, and thus providing possibly superior (more refined) lower and upper limits. Here is a suggested re-implementation using the support of the Generalized Extreme Value distribution:

para <- vec2par(c(100, 23, -0.5), type="gev")
lo <- quagev(0, para) # The value 54
hi <- quagev(1, para) # Infinity
E22 <- expect.max.ostat(2, para=para,cdf=cdfgev, pdf=pdfgev,
                           lower=lo, upper=hi)
E21 <- expect.min.ostat(2, para=para,cdf=cdfgev, pdf=pdfgev,
                           lower=lo, upper=hi)
L2 <- (E22 - E21)/2 # definition of L-scale
cat("L-scale: ", L2, "(integration)",
    lmomgev(para)$lambdas[2], "(theory)\n")
# The results show 33.77202 as L-scale.

The design intent makes it possible for some arbitrary and(or) new quantile function with difficult cdf and pdf expressions (or numerical approximations) to not be needed as the L-moments are explored. Contrarily, perhaps some new pdf exists and simple integration of it is made to get the cdf but the qua would need more elaborate numerics to invert the cdf. The user could then still explore the L-moments with supervision on the integration limits or foreknowledge of the support of the distribution.

Value

The expectation of the maximum order statistic, unless j is specified and then the expectation of that order statistic is returned. This similarly holds if the expect.min.ostat function is used except “maximum” becomes the “minimum”.

Alternatively, an R list is returned.

`type`	The type of approach used: “bypdfcdf” means the PDF and CDF of the distribution were used, and alternatively “byqua” means that the quantile function was used.
`value`	See previous discussion of value.
`abs.error`	Estimate of the modulus of the absolute error from R function `integrate`.
`subdivisions`	The number of subintervals produced in the subdivision process from R function `integrate`.
`message`	“OK” or a character string giving the error message.

Note

A function such as this might be helpful for computations involving distribution mixtures. Mixtures are readily made using the algebra of quantile functions (Gilchrist, 2000; Asquith, 2011, sec. 2.1.5 “The Algebra of Quantile Functions”).

Last comments: Internally, judicious use of logarithms and exponents for the terms involving the F and 1-F and the quantities to the left of the intergrals shown above are made in an attempt to maximize stability of the function without the user having to become too invested in the lower and upper limits. For example, (1-F)^{n-j} \rightarrow \exp([n-j]\log(1-F)). Testing indicates that this coding practice is quite useful. But there will undoubtedly be times for which the user needs to be informed enough about the expected value on return to identify that tweaking to the integration limits is needed. Also use of R functions lbeta and lchoose is made to maximize operations in logarithmic space.

For lmomco v.2.1.+: Because of the extensive use of exponents and logarithms as described, enhanced deep tail estimation of the extrema for large n and large or small j results. This has come at the expense that expectations can be computed when the expectations actually do not exist. An error in the integration no longer occurs in lmomco. For example, the Cauchy distribution has infinite extrema but this function (for least for a selected parameter set and n=10) provides apparent values for the \mathrm{E}[X_{1:n}] and \mathrm{E}[X_{n:n}] when the cdf and pdf are used but not when the qua is used. Users are cautioned to not rely on expect.max.ostat “knowing” that a given distribution has undefined order statistic extrema. Now for the Cauchy case just described, the extrema for j = [1, n] are hugely(!) greater in magnitude than for j = [2, (n-1)], so some resemblance of infinity remains.

The alias eostat is a shorter name dispatching to expect.max.ostat all of the arguments.

Author(s)

W.H. Asquith

References

Asquith, W.H., 2011, Distributional analysis with L-moment statistics using the R environment for statistical computing: Createspace Independent Publishing Platform, ISBN 978–146350841–8.

Gilchrist, W.G., 2000, Statistical modelling with quantile functions: Chapman and Hall/CRC, Boca Raton.

Hosking, J.R.M., 2006, On the characterization of distributions by their L-moments: Journal of Statistical Planning and Inference, v. 136, no. 1, pp. 193–198.

Examples

para <- vec2par(c(10, 100), type="nor")
n <- 12
# The three outputted values from should be similar:
# (1) theoretical, (2) theoretical, and (3) simulation
expect.max.ostat(n, para=para, cdf=cdfnor, pdf=pdfnor)
expect.max.ostat(n, para=para, qua=quanor)
mean(sapply(seq_len(1000), function(x) { max(rlmomco(n, para))}))

eostat(8, j=5, qua=quagum, para=vec2par(c(1670, 1000), type="gum"))

## Not run: 
para <- vec2par(c(1280, 800), type="nor")
expect.max.ostat(10, j=9, para, qua=quanor)
[1] 2081.086      # SUCCESS ---------------------------
expect.max.ostat(10, j=9, para, pdf=pdfnor, cdf=cdfnor,
                                lower=-1E3, upper=1E6)
[1] 1.662701e-06  # FAILURE ---------------------------
expect.max.ostat(10, j=9, para, pdf=pdfnor, cdf=cdfnor,
                                lower=-1E3, upper=1E5)
[1] 2081.086      # SUCCESS ---------------------------
## End(Not run)

lmomco documentation built on May 29, 2024, 10:06 a.m.