View source: R/entropy.Dirichlet.R
Description

freqs.Dirichlet computes the Bayesian estimates of the bin frequencies using the Dirichlet-multinomial pseudocount model.

entropy.Dirichlet estimates the Shannon entropy H of the random variable Y from the corresponding observed counts y by plug-in of the Bayesian estimates of the bin frequencies using the Dirichlet-multinomial pseudocount model.

KL.Dirichlet computes a Bayesian estimate of the Kullback-Leibler (KL) divergence from counts y1 and y2.

chi2.Dirichlet computes a Bayesian version of the chi-squared divergence from counts y1 and y2.

mi.Dirichlet computes a Bayesian estimate of the mutual information of two random variables.

chi2indep.Dirichlet computes a Bayesian version of the chi-squared divergence of independence from a table of counts y2d.
Usage

freqs.Dirichlet(y, a)
entropy.Dirichlet(y, a, unit=c("log", "log2", "log10"))
KL.Dirichlet(y1, y2, a1, a2, unit=c("log", "log2", "log10"))
chi2.Dirichlet(y1, y2, a1, a2, unit=c("log", "log2", "log10"))
mi.Dirichlet(y2d, a, unit=c("log", "log2", "log10"))
chi2indep.Dirichlet(y2d, a, unit=c("log", "log2", "log10"))
Arguments

y: vector of counts.

y1: vector of counts.

y2: vector of counts.

y2d: matrix of counts.

a: pseudocount per bin.

a1: pseudocount per bin for the first random variable.

a2: pseudocount per bin for the second random variable.

unit: the unit in which entropy is measured. The default is "nats" (natural units). For computing entropy in "bits" set unit="log2".
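For instance, since one bit equals log(2) nats, the two units can be cross-checked directly (a minimal sketch; the count vector is the one used in the Examples below):

library("entropy")
y = c(4, 2, 3, 0, 2, 4, 0, 0, 2, 1, 1)
H.nats = entropy.Dirichlet(y, a=1)               # default unit: nats
H.bits = entropy.Dirichlet(y, a=1, unit="log2")  # same estimate in bits
H.nats/log(2)  # should equal H.bits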
Details

The Dirichlet-multinomial pseudocount entropy estimator is a Bayesian plug-in estimator: in the definition of the Shannon entropy the bin probabilities are replaced by the respective Bayesian estimates of the frequencies, using a model with a Dirichlet prior and a multinomial likelihood.
The parameter a is a parameter of the Dirichlet prior, and in effect specifies the pseudocount per bin. Popular choices of a are:

a=0: maximum likelihood estimator (see entropy.empirical)
a=1/2: Jeffreys' prior; Krichevsky-Trofimov (1981) entropy estimator
a=1: Laplace's prior
a=1/length(y): Schurmann-Grassberger (1996) entropy estimator
a=sqrt(sum(y))/length(y): minimax prior
The pseudocount a can also be a vector, so that an individual pseudocount is added to each bin.
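The frequency estimate underlying these functions is the posterior mean of the Dirichlet-multinomial model. A minimal sketch spelling out the formula (the helper dirichlet.freqs is hypothetical, for illustration only):

library("entropy")
# posterior mean of a Dirichlet(a) prior combined with multinomial counts y:
# bin k is estimated as (y[k] + a[k]) / (sum(y) + sum(a))
dirichlet.freqs = function(y, a)
{
  a = rep(a, length.out=length(y))  # recycle a scalar pseudocount across bins
  (y + a)/(sum(y) + sum(a))
}
y = c(4, 2, 3, 0, 2, 4, 0, 0, 2, 1, 1)
dirichlet.freqs(y, a=1/2)  # expected to agree with freqs.Dirichlet(y, a=1/2)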
Value

freqs.Dirichlet returns the Bayesian estimates of the bin frequencies.

entropy.Dirichlet returns the Bayesian estimate of the Shannon entropy.

KL.Dirichlet returns the Bayesian estimate of the KL divergence.

chi2.Dirichlet returns the Bayesian version of the chi-squared divergence.

mi.Dirichlet returns the Bayesian estimate of the mutual information.

chi2indep.Dirichlet returns the Bayesian version of the chi-squared divergence of independence.
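These returned values plausibly correspond to the matching plug-in estimators evaluated at the Dirichlet frequency estimates, as suggested by the description above (a sketch; the correspondence is an assumption, not quoted from the package source):

library("entropy")
y1 = c(4, 2, 3, 1, 10, 4)
y2 = c(2, 3, 7, 1, 4, 3)
# plug Dirichlet frequency estimates into the plug-in KL divergence
p1 = freqs.Dirichlet(y1, a=1/6)
p2 = freqs.Dirichlet(y2, a=1/6)
KL.plugin(p1, p2)  # expected to match KL.Dirichlet(y1, y2, a1=1/6, a2=1/6)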
Author(s)

Korbinian Strimmer (https://strimmerlab.github.io).
References

Agresti, A., and D. B. Hitchcock. 2005. Bayesian inference for categorical data analysis. Stat. Methods Appl. 14:297-330.

Krichevsky, R. E., and V. K. Trofimov. 1981. The performance of universal encoding. IEEE Trans. Inf. Theory 27:199-207.

Schurmann, T., and P. Grassberger. 1996. Entropy estimation of symbol sequences. Chaos 6:414-427.
See Also

entropy, entropy.shrink, entropy.empirical, entropy.plugin, mi.plugin, KL.plugin, discretize.
Examples

# load entropy library
library("entropy")
# a single variable
# observed counts for each bin
y = c(4, 2, 3, 0, 2, 4, 0, 0, 2, 1, 1)
# Dirichlet estimate of frequencies with a=1/2
freqs.Dirichlet(y, a=1/2)
# Dirichlet estimate of entropy with a=0
entropy.Dirichlet(y, a=0)
# identical to empirical estimate
entropy.empirical(y)
# Dirichlet estimate with a=1/2 (Jeffreys' prior)
entropy.Dirichlet(y, a=1/2)
# Dirichlet estimate with a=1 (Laplace prior)
entropy.Dirichlet(y, a=1)
# Dirichlet estimate with a=1/length(y)
entropy.Dirichlet(y, a=1/length(y))
# Dirichlet estimate with a=sqrt(sum(y))/length(y)
entropy.Dirichlet(y, a=sqrt(sum(y))/length(y))
# example with two variables
# observed counts for two random variables
y1 = c(4, 2, 3, 1, 10, 4)
y2 = c(2, 3, 7, 1, 4, 3)
# Bayesian estimate of Kullback-Leibler divergence (a=1/6)
KL.Dirichlet(y1, y2, a1=1/6, a2=1/6)
# half of the corresponding chi-squared divergence
0.5*chi2.Dirichlet(y1, y2, a1=1/6, a2=1/6)
## joint distribution example
# contingency table with counts for two discrete variables
y2d = rbind( c(1,2,3), c(6,5,4) )
# Bayesian estimate of mutual information (a=1/6)
mi.Dirichlet(y2d, a=1/6)
# half of the Bayesian chi-squared divergence of independence
0.5*chi2indep.Dirichlet(y2d, a=1/6)
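# A further check of the plug-in relationship for mutual information
# (a sketch: the per-cell Dirichlet normalization below is an assumption
# about how mi.Dirichlet is computed, cf. mi.plugin).
y2d = rbind( c(1,2,3), c(6,5,4) )
# Dirichlet posterior-mean cell frequencies with pseudocount a=1/6 per cell
freqs2d = (y2d + 1/6)/sum(y2d + 1/6)
mi.plugin(freqs2d)  # expected to match mi.Dirichlet(y2d, a=1/6)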