| shannon | R Documentation |
This function computes Shannon's entropy of a variable X with a finite number of categories. Shannon's entropy is a non-spatial measure.
shannon(data)
data |
A data matrix or vector, can be numeric, factor, character, ...
Alternatively, a marked |
Shannon's entropy measures the heterogeneity of a set of categorical data. It is computed as
H(X)=\sum_{i=1}^{I} p(x_i) \log(1/p(x_i))
where p(x_i) is the probability of occurrence of the i-th category, estimated here by its relative frequency.
This is both the nonparametric and the maximum likelihood estimator of entropy.
Shannon's entropy varies between 0 and \log(I), I being the number of categories of the variable under study.
The relative version of Shannon's entropy, i.e. the entropy divided by \log(I), is also computed, under the assumption that all data categories are present in the dataset.
The relative entropy is useful for comparison across datasets with different I.
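The computation described above can be sketched in base R, without the package. This is an illustration only: the helper name `shannon_by_hand` is hypothetical, the natural logarithm is assumed, and the code is not claimed to match the package's internal implementation.

```r
# Hypothetical helper (not part of the package): plug-in estimate of
# Shannon's entropy from relative frequencies, natural logarithm assumed.
shannon_by_hand <- function(x) {
  x <- x[!is.na(x)]                      # drop missing values
  p <- as.numeric(table(x)) / length(x)  # relative frequencies
  p <- p[p > 0]                          # keep observed categories only
  I <- length(p)                         # number of observed categories
  H <- sum(p * log(1 / p))               # H(X) = sum p(x_i) log(1/p(x_i))
  rel <- if (I > 1) H / log(I) else NA_real_  # undefined when log(I) = 0
  list(shann = H, range = c(0, log(I)), rel.shann = rel)
}

# Two equally frequent categories: entropy log(2), relative entropy 1
res <- shannon_by_hand(c("a", "a", "b", "b"))
```

With equally frequent categories the entropy reaches its upper bound \log(I), so the relative entropy equals 1.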
The function can handle lattice data with missing values, as long as they are coded as NA:
missing data are ignored in the computations.
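A small base R sketch of what ignoring NA cells means for the estimate. It mirrors the behaviour described here under the assumption of the natural logarithm, and is not taken from the package source; the object names are illustrative.

```r
# A 3 x 3 lattice with two missing cells
lat <- matrix(c("a", "b", NA, "a", "b", "b", NA, "a", "a"), nrow = 3)

obs <- lat[!is.na(lat)]                    # the 7 observed cells
p <- as.numeric(table(obs)) / length(obs)  # a: 4/7, b: 3/7
H <- sum(p * log(1 / p))                   # entropy over observed cells only
```

The relative frequencies are computed over the 7 observed cells rather than all 9, so the missing cells do not act as an extra category.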
A list of four elements:
shann Shannon's entropy
range the theoretical range of Shannon's entropy, from 0 to \log(I)
rel.shann Shannon's relative entropy
probabilities a table with absolute frequencies and estimated probabilities (relative frequencies) for all data categories
#NON SPATIAL DATA
shannon(sample(1:5, 50, replace=TRUE))
#POINT DATA
#requires marks with a finite number of categories
#runifpoint(), square() and marks() are provided by the spatstat package
library(spatstat)
data.pp=runifpoint(100, win=square(10))
marks(data.pp)=sample(c("a","b","c"), 100, replace=TRUE)
shannon(marks(data.pp))
#LATTICE DATA
data.lat=matrix(sample(c("a","b","c"), 100, replace=TRUE), nrow=10)
shannon(data.lat)