Gstat computes the G statistic.
chi2stat computes the Pearson chi-squared statistic.
Gstatindep computes the G statistic between the empirical observed joint distribution and the product distribution obtained from its marginals.
chi2statindep computes the Pearson chi-squared statistic of independence.
y | observed vector of counts.
freqs | vector of expected frequencies (probability mass function). Alternatively, counts may be provided.
y2d | matrix of counts.
unit | the unit in which entropy is measured. The default is "nats" (natural units). For computing entropy in "bits" set
The observed counts in y and y2d are used to determine the total sample size.
The G statistic equals two times the sample size times the KL divergence between empirical observed frequencies and expected frequencies.
The Pearson chi-squared statistic equals sample size times chi-squared divergence between empirical observed frequencies and expected frequencies. It is a quadratic approximation of the G statistic.
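The two definitions above can be written out explicitly. The notation below is introduced here for illustration and is not taken from the package documentation: $y_k$ are the observed counts, $n = \sum_k y_k$ is the sample size, $\hat f_k = y_k / n$ are the empirical frequencies, and $p_k$ are the expected frequencies.

```latex
\begin{align*}
G      &= 2n \, \mathrm{KL}(\hat f \,\|\, p)
        = 2n \sum_k \hat f_k \log\frac{\hat f_k}{p_k}
        = 2 \sum_k y_k \log\frac{y_k}{n p_k}, \\
\chi^2 &= n \sum_k \frac{(\hat f_k - p_k)^2}{p_k}
        = \sum_k \frac{(y_k - n p_k)^2}{n p_k}.
\end{align*}
% Quadratic approximation: write \hat f_k = p_k (1 + \delta_k), so that \sum_k p_k \delta_k = 0.
% Using (1 + \delta)\log(1 + \delta) = \delta + \delta^2/2 + O(\delta^3):
\begin{align*}
\mathrm{KL}(\hat f \,\|\, p)
  &= \sum_k p_k \left( \delta_k + \tfrac{1}{2}\delta_k^2 + O(\delta_k^3) \right)
   \approx \tfrac{1}{2} \sum_k \frac{(\hat f_k - p_k)^2}{p_k},
\end{align*}
% and therefore G = 2n \, \mathrm{KL}(\hat f \,\|\, p) \approx \chi^2.
```

The linear term vanishes because both $\hat f$ and $p$ sum to one, which is why the leading term of the G statistic is the Pearson statistic.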
The G statistic between the empirical observed joint distribution and the product distribution obtained from its marginals is equal to two times the sample size times mutual information.
The Pearson chi-squared statistic of independence equals the Pearson chi-squared statistic between the empirical observed joint distribution and the product distribution obtained from its marginals. It is a quadratic approximation of the corresponding G statistic.
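As a rough illustration of the independence statistics described above, the following sketch computes them by hand in base R, without the entropy package. It is not the package's implementation; the variable names (y2d.expected, G.indep, X2.indep and so on) are chosen here for illustration, and the count matrix is the one used in the examples.

```r
# 3x2 contingency table of counts (same data as in the examples)
y2d = matrix(c(4, 5, 1, 2, 4, 4), ncol = 2)
n = sum(y2d)

# expected counts under independence: product of the marginals, scaled by n
y2d.expected = outer(rowSums(y2d), colSums(y2d)) / n

# empirical mutual information (in nats); cells with zero count contribute 0
f.joint = y2d / n
f.prod = y2d.expected / n
mi = sum(ifelse(f.joint > 0, f.joint * log(f.joint / f.prod), 0))

# G statistic of independence: two times sample size times mutual information
G.indep = 2 * n * mi

# Pearson chi-squared statistic of independence
X2.indep = sum((y2d - y2d.expected)^2 / y2d.expected)
```

Under these assumptions X2.indep should agree with the statistic reported by chisq.test(y2d), since the continuity correction in chisq.test only applies to 2x2 tables.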
The G statistic and the Pearson chi-squared statistic are asymptotically chi-squared distributed, which allows the corresponding p-values to be computed.
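A minimal base R sketch of how such p-values can be obtained from the upper tail of the chi-squared distribution is given below. It assumes the usual degrees of freedom for a fully specified expected distribution (number of classes minus one); whether the package uses exactly this choice is an assumption here, and the variable names are illustrative only.

```r
# observed counts and expected frequencies (same data as in the examples)
y = c(4, 2, 3, 1, 6, 4)
freqs.expected = c(0.10, 0.15, 0.35, 0.05, 0.20, 0.15)
n = sum(y)
y.expected = n * freqs.expected

# G statistic; classes with zero observed count contribute 0
G = 2 * sum(ifelse(y > 0, y * log(y / y.expected), 0))

# Pearson chi-squared statistic
X2 = sum((y - y.expected)^2 / y.expected)

# assumed degrees of freedom: number of classes minus one
df = length(y) - 1

# asymptotic p-values from the upper tail of the chi-squared distribution
pval.G = pchisq(G, df = df, lower.tail = FALSE)
pval.X2 = pchisq(X2, df = df, lower.tail = FALSE)
```

With expected counts this small the asymptotic approximation may be poor, which is also why chisq.test issues a warning for the same data.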
A list containing the test statistic stat, the degrees of freedom df used to calculate the p-value, and the p-value pval.
Korbinian Strimmer (https://strimmerlab.github.io).
KL.plugin, chi2.plugin, mi.plugin, chi2indep.plugin.
# load entropy library
library("entropy")
## one discrete random variable
# observed counts in each class
y = c(4, 2, 3, 1, 6, 4)
n = sum(y) # 20
# expected frequencies and counts
freqs.expected = c(0.10, 0.15, 0.35, 0.05, 0.20, 0.15)
y.expected = n*freqs.expected
# G statistic (with p-value)
Gstat(y, freqs.expected) # from expected frequencies
Gstat(y, y.expected) # alternatively from expected counts
# G statistic computed from empirical KL divergence
2*n*KL.empirical(y, y.expected)
## Pearson chi-squared statistic (with p-value)
# this can be viewed as an approximation of the G statistic
chi2stat(y, freqs.expected) # from expected frequencies
chi2stat(y, y.expected) # alternatively from expected counts
# computed from empirical chi-squared divergence
n*chi2.empirical(y, y.expected)
# compare with built-in function
chisq.test(y, p = freqs.expected)
## joint distribution of two discrete random variables
# contingency table with counts
y.mat = matrix(c(4, 5, 1, 2, 4, 4), ncol = 2) # 3x2 example matrix of counts
n.mat = sum(y.mat) # 20
# G statistic between empirical observed joint distribution and product distribution
Gstatindep( y.mat )
# computed from empirical mutual information
2*n.mat*mi.empirical(y.mat)
# Pearson chi-squared statistic of independence
chi2statindep( y.mat )
# computed from empirical chi-squared divergence
n.mat*chi2indep.empirical(y.mat)
# compare with built-in function
chisq.test(y.mat)