generalized_mean | R Documentation
Calculate a weighted generalized mean.
generalized_mean(r)
arithmetic_mean(x, w = NULL, na.rm = FALSE)
geometric_mean(x, w = NULL, na.rm = FALSE)
harmonic_mean(x, w = NULL, na.rm = FALSE)
r | A finite number giving the order of the generalized mean.
x | A strictly positive numeric vector.
w | A strictly positive numeric vector of weights, the same length as x.
na.rm | Should missing values in x and w be removed?
The function generalized_mean() returns a function to compute the
generalized mean of x with weights w and exponent r
(i.e., \prod_{i = 1}^{n} x_{i}^{w_{i}} when r = 0 and
\left(\sum_{i = 1}^{n} w_{i} x_{i}^{r}\right)^{1 / r} otherwise).
This is also called the power mean, Hölder mean, or l_p mean. See
Bullen (2003, p. 175) for a definition, or
https://en.wikipedia.org/wiki/Generalized_mean. The generalized mean
is the solution to the optimal prediction problem: choose m to minimize
\sum_{i = 1}^{n} w_{i} \left[\log(x_{i}) - \log(m) \right]^2 when r = 0,
and \sum_{i = 1}^{n} w_{i} \left[x_{i}^r - m^r \right]^2 otherwise.
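The definition above can be sketched in base R. This is a minimal
illustration, not the package's implementation: gen_mean_sketch() is a
hypothetical helper, and it assumes that x and w are strictly positive and
finite with no missing values. The last lines check numerically that the
result solves the prediction problem for r != 0.

```r
# Hypothetical helper (not part of the package): generalized mean of
# order r, assuming x and w are strictly positive and finite.
gen_mean_sketch <- function(x, w, r) {
  w <- w / sum(w)  # scale the weights to sum to 1
  if (r == 0) {
    prod(x^w)
  } else {
    sum(w * x^r)^(1 / r)
  }
}

x <- c(1, 2, 3)
w <- c(0.25, 0.25, 0.5)
r <- 2

m <- gen_mean_sketch(x, w, r)

# Weighted squared-error objective for r != 0
objective <- function(m) sum(w * (x^r - m^r)^2)

# Minimize over the range of x; the minimizer should agree with m
# up to the tolerance of optimize()
m_opt <- optimize(objective, interval = range(x))$minimum
c(closed_form = m, numerical = m_opt)
```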
The functions arithmetic_mean(), geometric_mean(), and harmonic_mean()
compute the arithmetic, geometric, and harmonic (or subcontrary) means,
also known as the Pythagorean means. These are the most useful means for
making price indexes, and correspond to setting r = 1, r = 0, and r = -1
in generalized_mean().
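As a cross-check of this correspondence, the three Pythagorean means can be
written with base R alone. This sketch does not call the package functions,
and the data are made up for illustration. Because the generalized mean is
increasing in r, the harmonic mean should not exceed the geometric mean,
which should not exceed the arithmetic mean.

```r
x <- c(1, 2, 3)
w <- c(0.25, 0.25, 0.5)  # weights already sum to 1

am <- weighted.mean(x, w)            # order r = 1
gm <- exp(weighted.mean(log(x), w))  # order r = 0
hm <- 1 / weighted.mean(1 / x, w)    # order r = -1

c(harmonic = hm, geometric = gm, arithmetic = am)
```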
Both x and w should be strictly positive (and finite), especially for the
purpose of making a price index. This is not enforced, but the results may
not make sense if the generalized mean is not defined. There are two
exceptions to this.
The convention in Hardy et al. (1952, p. 13) is used in cases where x has
zeros: the generalized mean is 0 whenever w is strictly positive and r < 0.
(The analogous convention holds whenever at least one element of x is Inf:
the generalized mean is Inf whenever w is strictly positive and r > 0.)
Some authors let w be non-negative and sum to 1 (e.g., Sydsaeter et al.,
2005, p. 47). If w has zeros, then the corresponding element of x has no
impact on the mean whenever x is strictly positive. Unlike weighted.mean(),
however, zeros in w are not strong zeros, so infinite values in x will
propagate even if the corresponding elements of w are zero.
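Both exceptions can be illustrated with base R arithmetic. This is only a
sketch using the textbook formula. Whether the package relies on the same
floating-point behavior is an assumption, and the value it returns for an
infinite x with a zero weight is not stated above.

```r
w <- c(0.5, 0.5)

# A zero in x with r = -1: 0^(-1) is Inf, and Inf^(-1) is 0
zero_case <- sum(w * c(0, 2)^(-1))^(-1)

# An Inf in x with r = 2: Inf^2 is Inf, and Inf^(1 / 2) is Inf
inf_case <- sum(w * c(2, Inf)^2)^(1 / 2)

# Strong zeros: weighted.mean() omits elements of x with zero weight
x <- c(1, Inf)
w0 <- c(1, 0)
strong <- weighted.mean(x, w0)

# Without strong zeros the product 0 * Inf is NaN, so the infinite value
# in x affects the result despite its zero weight
plain <- sum(w0 * x) / sum(w0)

c(zero_case = zero_case, inf_case = inf_case, strong = strong, plain = plain)
```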
The weights are scaled to sum to 1 to satisfy the definition of a
generalized mean. There are certain price indexes where the weights should
not be scaled (e.g., the Vartia-I index); use sum() for these cases.
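The difference between scaled and unscaled weights can be seen in a small
base R sketch for the geometric case. The price relatives and weights here
are made up, and the weights are not actual Vartia-I weights; the point is
only that sum() applies the weights as given.

```r
x <- c(1.1, 0.9, 1.2)  # hypothetical price relatives
w <- c(0.2, 0.3, 0.4)  # hypothetical weights that sum to 0.9, not 1

# Geometric mean with the weights scaled to sum to 1
scaled <- exp(sum(w / sum(w) * log(x)))

# Unscaled geometric-type aggregate, computed directly with sum()
unscaled <- exp(sum(w * log(x)))

c(scaled = scaled, unscaled = unscaled)
```

The two values differ whenever the weights do not sum to 1; here unscaled
equals scaled^sum(w).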
The underlying calculation returned by generalized_mean() is mostly
identical to weighted.mean(), with one important exception: missing values
in the weights are not treated differently than missing values in x.
Setting na.rm = TRUE drops missing values in both x and w, not just x. This
ensures that certain useful identities are satisfied with missing values in
x. In most cases arithmetic_mean() is a drop-in replacement for
weighted.mean().
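The mechanism can be sketched in base R by keeping only the observations
where both x and w are present. This illustrates the behavior described
above and is not the package's code.

```r
x <- c(1, 2, 3)
w <- c(0.25, NA, 0.5)

# weighted.mean() only removes missing values in x, so the NA weight remains
base_result <- weighted.mean(x, w, na.rm = TRUE)

# Sketch: drop observations where either x or w is missing
keep <- !is.na(x) & !is.na(w)
both_dropped <- weighted.mean(x[keep], w[keep])

c(base_result = base_result, both_dropped = both_dropped)
```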
generalized_mean() returns a function:
function(x, w = NULL, na.rm = FALSE){...}
This computes the generalized mean of order r of x with weights w.
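The function-returning design can be sketched as a closure. make_mean() is
a hypothetical stand-in with the same signature, not the package's actual
code; in particular, treating w = NULL as equal weights is an assumption.

```r
# Hypothetical factory mirroring the interface described above
make_mean <- function(r) {
  stopifnot(is.numeric(r), length(r) == 1, is.finite(r))
  function(x, w = NULL, na.rm = FALSE) {
    if (is.null(w)) w <- rep(1, length(x))  # assumed: NULL means equal weights
    if (na.rm) {
      keep <- !is.na(x) & !is.na(w)
      x <- x[keep]
      w <- w[keep]
    }
    w <- w / sum(w)  # scale the weights to sum to 1
    if (r == 0) exp(sum(w * log(x))) else sum(w * x^r)^(1 / r)
  }
}

# The order r is fixed once; the returned function is reused on new data
quadratic_mean <- make_mean(2)
quadratic_mean(c(1, 2, 3))
quadratic_mean(c(2, 4, 6), w = c(1, 1, 2))
```

Fixing r up front means the order is validated only once, and the resulting
function can be passed around like any other mean function.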
arithmetic_mean(), geometric_mean(), and harmonic_mean() each return a
numeric value for the generalized means of order 1, 0, and -1.
generalized_mean() can be defined on the extended real line, so that
r = -Inf returns min() and r = Inf returns max(), to agree with the
definition in, e.g., Bullen (2003). This is not implemented, and r must be
finite.
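The limiting behavior can be seen numerically with large finite orders.
This base R sketch uses the textbook formula with made-up data;
power_mean() is a hypothetical helper.

```r
x <- c(1, 2, 3)
w <- c(0.25, 0.25, 0.5)  # weights already sum to 1

# Textbook generalized mean for a finite, nonzero order r
power_mean <- function(r) sum(w * x^r)^(1 / r)

hi <- sapply(c(10, 50, 200), power_mean)     # approaches max(x) from below
lo <- sapply(c(-10, -50, -200), power_mean)  # approaches min(x) from above

rbind(hi, lo)
```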
There are a number of existing functions for calculating unweighted
geometric and harmonic means, namely the geometric.mean() and
harmonic.mean() functions in the psych package, the geomean() function in
the FSA package, the GMean() and HMean() functions in the DescTools
package, and the geoMean() function in the EnvStats package. Similarly, the
ci_generalized_mean() function in the Compind package calculates an
unweighted generalized mean.
Bullen, P. S. (2003). Handbook of Means and Their Inequalities. Springer Science+Business Media.
Fisher, I. (1922). The Making of Index Numbers. Houghton Mifflin Company.
Hardy, G., Littlewood, J. E., and Polya, G. (1952). Inequalities (2nd edition). Cambridge University Press.
IMF, ILO, Eurostat, UNECE, OECD, and World Bank. (2020). Consumer Price Index Manual: Concepts and Methods. International Monetary Fund.
Lord, N. (2002). Does Smaller Spread Always Mean Larger Product? The Mathematical Gazette, 86(506): 273-274.
Sydsaeter, K., Strom, A., and Berck, P. (2005). Economists' Mathematical Manual (4th edition). Springer.
transmute_weights() transforms the weights to turn a generalized mean of
order r into a generalized mean of order s.
factor_weights() calculates the weights to factor a mean of products into a
product of means.
price_indexes and quantity_index() for simple wrappers that use
generalized_mean() to calculate common indexes.
back_period()/base_period() for a simple utility function to turn prices in
a table into price relatives.
Other means: extended_mean(), lehmer_mean(), nested_mean()
x <- 1:3
w <- c(0.25, 0.25, 0.5)
#---- Common generalized means ----
# Arithmetic mean
arithmetic_mean(x, w) # same as weighted.mean(x, w)
# Geometric mean
geometric_mean(x, w) # same as prod(x^w)
# Harmonic mean
harmonic_mean(x, w) # same as 1 / weighted.mean(1 / x, w)
# Quadratic mean / root mean square
generalized_mean(2)(x, w)
# Cubic mean
# Notice that this is larger than the other means so far because
# the generalized mean is increasing in r
generalized_mean(3)(x, w)
#---- Comparing the Pythagorean means ----
# The dispersion between the arithmetic, geometric, and harmonic
# mean usually increases as the variance of 'x' increases
x <- c(1, 3, 5)
y <- c(2, 3, 4)
var(x) > var(y)
arithmetic_mean(x) - geometric_mean(x)
arithmetic_mean(y) - geometric_mean(y)
geometric_mean(x) - harmonic_mean(x)
geometric_mean(y) - harmonic_mean(y)
# But the dispersion between these means is only bounded by the
# variance (Bullen, 2003, p. 156)
arithmetic_mean(x) - geometric_mean(x) >= 2 / 3 * var(x) / (2 * max(x))
arithmetic_mean(x) - geometric_mean(x) <= 2 / 3 * var(x) / (2 * min(x))
# Example by Lord (2002) where the dispersion decreases as the variance
# increases, counter to the claims by Fisher (1922, p. 108) and the
# CPI manual (par. 1.14)
x <- (5 + c(sqrt(5), -sqrt(5), -3)) / 4
y <- (16 + c(7 * sqrt(2), -7 * sqrt(2), 0)) / 16
var(x) > var(y)
arithmetic_mean(x) - geometric_mean(x)
arithmetic_mean(y) - geometric_mean(y)
geometric_mean(x) - harmonic_mean(x)
geometric_mean(y) - harmonic_mean(y)
# The "bias" in the arithmetic and harmonic indexes is also smaller in
# this case, counter to the claim by Fisher (1922, p. 108)
arithmetic_mean(x) * arithmetic_mean(1 / x) - 1
arithmetic_mean(y) * arithmetic_mean(1 / y) - 1
harmonic_mean(x) * harmonic_mean(1 / x) - 1
harmonic_mean(y) * harmonic_mean(1 / y) - 1
#---- Missing values ----
w[2] <- NA
arithmetic_mean(x, w, na.rm = TRUE) # drop the second observation
weighted.mean(x, w, na.rm = TRUE) # still returns NA