View source: R/discrete-summaries.R
| entropy | R Documentation |
Normalized entropy, for measuring dispersion in draws from categorical distributions.
entropy(x)
## Default S3 method:
entropy(x)
## S3 method for class 'rvar'
entropy(x)
x | (multiple options) A vector to be interpreted as draws from a categorical distribution, such as a factor, a numeric vector, or an rvar. |
Calculates the normalized Shannon entropy of the draws in x. This value is
the entropy of x divided by the maximum entropy of a distribution with n
categories, where n is length(unique(x)) for numeric vectors and
length(levels(x)) for factors:
-\frac{\sum_{i = 1}^{n} p_i \log(p_i)}{\log(n)}
Here p_i is the proportion of draws in category i. Dividing by log(n)
scales the output to lie between 0 (all probability in one category)
and 1 (uniform). This form of normalized entropy is referred to as
H_\mathrm{REL} in Wilcox (1967).
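The formula can be sketched by hand in base R. This is only an illustration, not the package's implementation: `normalized_entropy()` is a hypothetical helper, and returning 0 when there is a single category (where log(n) is 0) is an assumption made here to avoid dividing by zero.

```r
# Hypothetical helper, not the package's implementation: normalized Shannon
# entropy computed directly from the formula.
normalized_entropy <- function(x) {
  counts <- table(x)  # for factors, unused levels appear with count 0
  n <- if (is.factor(x)) nlevels(x) else length(unique(x))
  if (n <= 1) return(0)  # assumption: a single category has no dispersion
  p <- as.numeric(counts) / sum(counts)
  p <- p[p > 0]  # convention: 0 * log(0) contributes 0
  -sum(p * log(p)) / log(n)
}

f <- factor(c("a", "a", "b", "b"), levels = c("a", "b", "c", "d"))
normalized_entropy(f)              # 0.5: log(2) / log(4)
normalized_entropy(as.integer(f))  # 1: both observed values equally likely
normalized_entropy(c(1, 1, 2, 3))  # 1.5 * log(2) / log(3), about 0.946
```

The first two calls show why the choice of n matters: the same draws score 0.5 as a factor with four declared levels, but 1 as a numeric vector in which only the two observed values count as categories.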
If x is a factor or numeric, returns a length-1 numeric vector with a value
between 0 and 1 (inclusive) giving the normalized Shannon entropy of x.
If x is an rvar, returns an array of the same shape as x, where each
cell is the normalized Shannon entropy of the draws in the corresponding cell of x.
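To picture the per-cell behavior without the rvar class, the sketch below uses a plain matrix as a stand-in, with one row per draw and one column per cell. The `draws` matrix and `cell_entropy()` are hypothetical names for illustration only; a real rvar handles this internally and preserves the array shape.

```r
# Hypothetical stand-in for an rvar: rows are draws, columns are cells.
set.seed(1)
draws <- cbind(
  a = sample(1:4, 1000, replace = TRUE),  # roughly uniform over 4 values
  b = rep(1L, 1000)                       # every draw identical
)

# Normalized entropy of one cell's draws (numeric case: n = number of
# distinct observed values; a single value is assumed to give 0).
cell_entropy <- function(d) {
  p <- as.numeric(prop.table(table(d)))
  n <- length(p)
  if (n == 1) 0 else -sum(p * log(p)) / log(n)
}

# One value per cell: close to 1 for `a`, exactly 0 for `b`.
res <- apply(draws, 2, cell_entropy)
res
```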
Allen R. Wilcox (1967). Indices of Qualitative Variation (No. ORNL-TM-1919). Oak Ridge National Lab., Tenn.
library(posterior)  # assumed here: the package providing rvar() and entropy()
set.seed(1234)
levels <- c("a", "b", "c", "d", "e")
# a uniform distribution: high normalized entropy
x <- factor(
  sample(levels, 4000, replace = TRUE, prob = c(0.2, 0.2, 0.2, 0.2, 0.2)),
  levels = levels
)
entropy(x)
# a unimodal distribution: low normalized entropy
y <- factor(
  sample(levels, 4000, replace = TRUE, prob = c(0.95, 0.02, 0.015, 0.01, 0.005)),
  levels = levels
)
entropy(y)
# both together, as an rvar
xy <- c(rvar(x), rvar(y))
xy
entropy(xy)