# generalized_mean: Generalized mean

In gpindex: Generalized Price and Quantity Indexes

## Description

Calculate a generalized mean.

## Usage

```r
mean_generalized(r)

mean_arithmetic(x, w = rep(1, length(x)), na.rm = FALSE, scale = TRUE)

mean_geometric(x, w = rep(1, length(x)), na.rm = FALSE, scale = TRUE)

mean_harmonic(x, w = rep(1, length(x)), na.rm = FALSE, scale = TRUE)
```

## Arguments

| Argument | Description |
|---|---|
| `r` | A finite number giving the order of the generalized mean. |
| `x` | A strictly positive numeric vector. |
| `w` | A strictly positive numeric vector of weights, the same length as `x`. The default is to equally weight each element of `x`. |
| `na.rm` | Should missing values in `x` and `w` be removed? |
| `scale` | Should the weights be scaled to sum to 1? |

## Details

The function `mean_generalized()` returns a function to compute the generalized mean of `x` with weights `w` and exponent `r` (i.e., ∏ x^w when r = 0 and (∑ w x^r)^(1/r) otherwise). This is also called the power mean, Hölder mean, or l_p mean. See Bullen (2003, p. 175) for a definition, or https://en.wikipedia.org/wiki/Power_mean. The generalized mean is the solution to the optimal prediction problem: choose m to minimize ∑ w [log(x) - log(m)]^2 when r = 0, and ∑ w [x^r - m^r]^2 otherwise.
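The definition above can be sketched in base R. This is a minimal illustration rather than the package's implementation: `gen_mean_sketch()` is a hypothetical name, and it omits the input checks and missing-value handling of `mean_generalized()`. The last lines check numerically that, for `r = 2`, the closed form agrees with the minimizer of the prediction problem.

```r
# Hypothetical helper (not part of gpindex): returns a function that
# computes the generalized mean of order r.
gen_mean_sketch <- function(r) {
  stopifnot(is.numeric(r), length(r) == 1, is.finite(r))
  function(x, w = rep(1, length(x)), scale = TRUE) {
    if (scale) w <- w / sum(w)   # scale the weights to sum to 1
    if (r == 0) {
      exp(sum(w * log(x)))       # prod(x^w), computed on the log scale
    } else {
      sum(w * x^r)^(1 / r)       # (sum of w * x^r)^(1 / r)
    }
  }
}

x <- 1:3
w <- c(0.25, 0.25, 0.5)

# Closed-form quadratic mean (r = 2)
closed_form <- gen_mean_sketch(2)(x, w)

# The same value found by minimizing sum(w * (x^r - m^r)^2) over m
numeric_min <- optimize(
  function(m) sum(w * (x^2 - m^2)^2),
  interval = range(x)
)$minimum

c(closed_form = closed_form, numeric_min = numeric_min)
```

The two values should agree up to the tolerance of `optimize()`.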

The functions `mean_arithmetic()`, `mean_geometric()`, and `mean_harmonic()` compute the arithmetic, geometric, and harmonic (or subcontrary) means, also known as the Pythagorean means. These are the most useful means for making price indexes, and correspond to setting `r = 1`, `r = 0`, and `r = -1` in `mean_generalized()`.
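For reference, the three Pythagorean means can be written with base R alone. The sketch below uses only `stats::weighted.mean()` and makes no claim about how the package computes them; the data are made up for illustration.

```r
x <- c(1, 3, 5)
w <- c(0.25, 0.25, 0.5)

am <- stats::weighted.mean(x, w)            # arithmetic mean, r = 1
gm <- exp(stats::weighted.mean(log(x), w))  # geometric mean, r = 0
hm <- 1 / stats::weighted.mean(1 / x, w)    # harmonic mean, r = -1

c(harmonic = hm, geometric = gm, arithmetic = am)

# The generalized mean is increasing in r, so the means are ordered
hm <= gm && gm <= am
```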

Both `x` and `w` should be strictly positive (and finite), especially for the purpose of making a price index. This is not enforced, but the results may not make sense if the generalized mean is not defined. There are two exceptions to this.

1. The convention in Hardy et al. (1952, p. 13) is used in cases where `x` has zeros: the generalized mean is 0 whenever `w` is strictly positive and `r < 0`. (The analogous convention holds whenever at least one element of `x` is `Inf`: the generalized mean is `Inf` whenever `w` is strictly positive and `r > 0`.)

2. Some authors let `w` be non-negative and sum to 1 (e.g., Sydsaeter et al., 2005, p. 47). If `w` has zeros, then the corresponding element of `x` has no impact on the mean whenever `x` is strictly positive. Unlike `weighted.mean()`, however, zeros in `w` are not strong zeros, so infinite values in `x` will propagate even if the corresponding elements of `w` are zero.
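The difference between a strong zero and an ordinary zero weight can be seen with base R alone. In the sketch below, the plain weighted sum only illustrates the arithmetic of a zero weight that is not a strong zero; it is an assumption for illustration, not the package's code.

```r
x <- c(1, Inf)
w <- c(1, 0)

# weighted.mean() omits the elements of x that have a zero weight
stats::weighted.mean(x, w)

# A plain weighted sum does not: Inf * 0 is NaN, which propagates
sum(w * x) / sum(w)
```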

The weights should almost always be scaled to sum to 1 to satisfy the definition of a generalized mean, although there are certain types of price indexes where the weights should not be scaled (e.g., the Vartia-I index).

The underlying calculation returned by `mean_generalized()` is mostly identical to `weighted.mean()`, with one important exception: missing values in the weights are not treated differently than missing values in `x`. Setting `na.rm = TRUE` drops missing values in both `x` and `w`, not just `x`. This ensures that certain useful identities are satisfied with missing values in `x`. In most cases `mean_arithmetic()` is a drop-in replacement for `weighted.mean()`.
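The behaviour described above can be imitated in base R by dropping every pair in which either `x` or `w` is missing. This is a sketch of the idea with made-up data, not the package's internal code.

```r
x <- c(1, 2, 3)
w <- c(0.25, NA, 0.5)

# weighted.mean() only removes missing values in x, so the NA weight remains
stats::weighted.mean(x, w, na.rm = TRUE)

# Drop each observation where x or w is missing, then take the mean
keep <- !is.na(x) & !is.na(w)
stats::weighted.mean(x[keep], w[keep])
```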

## Value

`mean_generalized()` returns a function:

`function(x, w = rep(1, length(x)), na.rm = FALSE, scale = TRUE)`.

`mean_arithmetic()`, `mean_geometric()`, and `mean_harmonic()` each return a numeric value.

## Warning

Passing very small values for `r` can give misleading results, and a warning is given whenever `abs(r)` is sufficiently small. In general, `r` should not be a computed value.
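A short base R sketch shows why a tiny `r` is a problem. It evaluates the power formula directly (an assumption for illustration; the package's exact calculation and warning threshold are not shown here) and compares it with the `r = 0` limit, the geometric mean.

```r
x <- 1:3
w <- c(0.25, 0.25, 0.5)
r <- 1e-20

# x^r rounds to 1 in floating point, so the information in x is lost
# before the result is raised to the power 1 / r
naive <- sum(w * x^r)^(1 / r)

# The mathematically close value is the geometric mean (the r = 0 case)
limit <- exp(sum(w * log(x)))

c(naive = naive, limit = limit)
```

The two values differ substantially even though `r` is essentially zero, which is why `r` should be set directly rather than computed.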

## Note

`mean_generalized()` can be defined on the extended real line, so that `r = -Inf/Inf` returns `min()`/`max()`, to agree with the definition in, e.g., Bullen (2003). This is not implemented, and `r` must be finite.
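The limiting behaviour can be approximated with a large finite order. The sketch below uses a hypothetical unweighted helper, `power_mean_sketch()`, written in base R; it is not part of the package.

```r
# Hypothetical helper: unweighted generalized mean of order r (r != 0)
power_mean_sketch <- function(x, r) mean(x^r)^(1 / r)

x <- 1:3

power_mean_sketch(x, 500)   # close to max(x)
power_mean_sketch(x, -500)  # close to min(x)
```

Much larger orders would overflow or underflow `x^r`, so this only illustrates the limit and is not a robust way to compute it.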

There are a number of existing functions for calculating unweighted geometric and harmonic means, namely the `geometric.mean()` and `harmonic.mean()` functions in the `'psych'` package, the `geomean()` function in the `'FSA'` package, the `GMean()` and `HMean()` functions in the `'DescTools'` package, and the `geoMean()` function in the `'EnvStats'` package. Similarly, the `ci_generalized_mean()` function in the `'Compind'` package calculates an unweighted generalized mean.

## References

Bullen, P. S. (2003). Handbook of Means and Their Inequalities. Springer Science+Business Media.

Fisher, I. (1922). The Making of Index Numbers. Houghton Mifflin Company.

Hardy, G., Littlewood, J. E., and Polya, G. (1952). Inequalities (2nd edition). Cambridge University Press.

ILO, IMF, OECD, Eurostat, UN, and World Bank. (2004). Producer Price Index Manual: Theory and Practice. International Monetary Fund.

Lord, N. (2002). Does Smaller Spread Always Mean Larger Product? The Mathematical Gazette, 86(506): 273-274.

Sydsaeter, K., Strom, A., and Berck, P. (2005). Economists' Mathematical Manual (4th edition). Springer.

## See Also

`logmean_generalized` for the generalized logarithmic mean.

`mean_lehmer` for the Lehmer mean, an alternative to the generalized mean.

`weights_transmute` transforms the weights to turn an r-generalized mean into an s-generalized mean.

`weights_factor` calculates the weights to factor a mean of products into a product of means.

`price_index` and `quantity_index` for simple wrappers that use `mean_generalized()` to calculate common indexes.

`back_price`/`base_price` for a simple utility function to turn prices in a table into price relatives.

## Examples

```r
# Make some data

x <- 1:3
w <- c(0.25, 0.25, 0.5)

# Arithmetic mean

mean_arithmetic(x, w) # same as stats::weighted.mean(x, w)

# Geometric mean

mean_geometric(x, w) # same as prod(x^w)

# Using prod() to manually calculate the geometric mean can give misleading
# results

z <- 1:1000
prod(z)^(1 / length(z)) # overflow
mean_geometric(z)

z <- seq(0.0001, by = 0.0005, length.out = 1000)
prod(z)^(1 / length(z)) # underflow
mean_geometric(z)

# Harmonic mean

mean_harmonic(x, w) # same as 1 / stats::weighted.mean(1 / x, w)

# Quadratic mean / root mean square

mean_generalized(2)(x, w)

# Cubic mean
# Notice that this is larger than the other means so far because the
# generalized mean is increasing in r

mean_generalized(3)(x, w)

#--------------------

# The dispersion between the arithmetic, geometric, and harmonic mean usually
# increases as the variance of 'x' increases

x <- c(1, 3, 5)
y <- c(2, 3, 4)

var(x) > var(y)

mean_arithmetic(x) - mean_geometric(x)
mean_arithmetic(y) - mean_geometric(y)

mean_geometric(x) - mean_harmonic(x)
mean_geometric(y) - mean_harmonic(y)

# But the dispersion between these means is only bounded by
# the variance (Bullen, 2003, p. 156)

mean_arithmetic(x) - mean_geometric(x) >= 2 / 3 * var(x) / (2 * max(x))
mean_arithmetic(x) - mean_geometric(x) <= 2 / 3 * var(x) / (2 * min(x))

# Example from Lord (2002) where the dispersion decreases as the variance increases,
# counter to the claims in Fisher (1922, p. 108) and the PPI manual (p. 28)

x <- (5 + c(sqrt(5), -sqrt(5), -3)) / 4
y <- (16 + c(7 * sqrt(2), -7 * sqrt(2), 0)) / 16

var(x) > var(y)

mean_arithmetic(x) - mean_geometric(x)
mean_arithmetic(y) - mean_geometric(y)

mean_geometric(x) - mean_harmonic(x)
mean_geometric(y) - mean_harmonic(y)

# The "bias" in the arithmetic and harmonic indexes is also smaller in this case,
# counter to the claim in Fisher (1922, p. 108)

mean_arithmetic(x) * mean_arithmetic(1 / x) - 1
mean_arithmetic(y) * mean_arithmetic(1 / y) - 1

mean_harmonic(x) * mean_harmonic(1 / x) - 1
mean_harmonic(y) * mean_harmonic(1 / y) - 1

#--------------------

# Example of how missing values are handled

w <- replace(w, 2, NA)

mean_arithmetic(x, w)
mean_arithmetic(x, w, na.rm = TRUE) # drops the second observation
stats::weighted.mean(x, w, na.rm = TRUE) # still returns NA

#--------------------

# Sometimes it makes sense to calculate a generalized mean with
# negative inputs, so the warning can be ignored

mean_arithmetic(c(1, 2, -3))

# Other times it's less obvious

mean_harmonic(c(1, 2, -3))

#--------------------

# A function to make the superlative quadratic mean price index in chapter 17,
# section B.5.1, of the PPI manual as a product of generalized means

quadratic_index <- function(x, w0, w1, r) {
  x <- sqrt(x)
  mean_generalized(r)(x, w0) * mean_generalized(-r)(x, w1)
}

quadratic_index(1:3, 4:6, 7:9, 2)

# Same as the geometric mean of two generalized means (with the order halved)

quadratic_index2 <- function(x, w0, w1, r) {
  res <- c(mean_generalized(r)(x, w0), mean_generalized(-r)(x, w1))
  mean_geometric(res)
}

quadratic_index2(1:3, 4:6, 7:9, 1)
```

gpindex documentation built on Feb. 3, 2021, 1:06 a.m.