Pearson's chi-squared statistic for frequency comparisons (corpora)

Description

This function computes Pearson's chi-squared statistic (often written as X^2) for frequency comparison data, with or without Yates' continuity correction. The implementation is based on the formula given by Evert (2004, 82).

Usage

chisq(k1, n1, k2, n2, correct = TRUE, one.sided=FALSE)

Arguments

k1

frequency of a type in the first corpus (or an integer vector of type frequencies)

n1

the sample size of the first corpus (or an integer vector specifying the sizes of different samples)

k2

frequency of the type in the second corpus (or an integer vector of type frequencies, in parallel to k1)

n2

the sample size of the second corpus (or an integer vector specifying the sizes of different samples, in parallel to n1)

correct

if TRUE, apply Yates' continuity correction (default)

one.sided

if TRUE, compute the signed square root of X^2 as a statistic for a one-sided test (see details below; the default value is FALSE)

Details

The X^2 values returned by this function are identical to those computed by chisq.test for the corresponding 2x2 contingency table, given the same setting of correct. Unlike chisq.test, chisq accepts vector arguments, so a large number of frequency comparisons can be carried out with a single function call.
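
The equivalence with chisq.test can be checked in base R without the package. The sketch below is an illustration only: chisq_sketch is a hypothetical helper using the standard closed form of Pearson's X^2 for a 2x2 table, assumed (not confirmed here) to match the formula cited from Evert (2004), and the frequencies are invented.

```r
# Hypothetical helper, NOT the corpora implementation: Pearson's X^2 for the
# 2x2 table with columns (k1, n1 - k1) and (k2, n2 - k2), fully vectorised.
chisq_sketch <- function(k1, n1, k2, n2, correct = TRUE) {
  n1 <- as.numeric(n1)       # avoid integer overflow in the products below
  n2 <- as.numeric(n2)
  N  <- n1 + n2              # total sample size
  R1 <- k1 + k2              # row total: occurrences of the type
  R2 <- N - R1               # row total: all other tokens
  term <- abs(k1 * (n2 - k2) - k2 * (n1 - k1))
  if (correct) term <- pmax(term - N / 2, 0)  # Yates' correction, floored at 0
  N * term^2 / (R1 * R2 * n1 * n2)
}

# Invented example data: three frequency comparisons in parallel vectors
k1 <- c(10, 50, 200); n1 <- c(1000, 1000, 1000)
k2 <- c(30, 55, 150); n2 <- c(1000, 1200, 800)

# One vectorised call for all comparisons
x2_vec <- chisq_sketch(k1, n1, k2, n2)

# Reference values: one chisq.test call per comparison
x2_ref <- sapply(seq_along(k1), function(i) {
  tbl <- matrix(c(k1[i], n1[i] - k1[i], k2[i], n2[i] - k2[i]), nrow = 2)
  unname(chisq.test(tbl, correct = TRUE)$statistic)
})

print(all.equal(x2_vec, x2_ref))
```

The loop over chisq.test is what the vectorised interface is meant to replace when thousands of types are compared at once.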

The one-sided test statistic (for one.sided=TRUE) is the signed square root of X^2. It is positive for k1/n1 > k2/n2 and negative for k1/n1 < k2/n2. Note that under the null hypothesis of equal proportions, this statistic approximately follows a standard normal distribution rather than a chi-squared distribution.
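
As a rough illustration of how the signed statistic might be used, the base R sketch below computes it without continuity correction and converts it to p-values with pnorm. The helper signed_root is hypothetical, not the package's code, and the frequencies are invented.

```r
# Hypothetical helper, NOT the corpora code: signed square root of the
# uncorrected X^2 for a frequency comparison.
signed_root <- function(k1, n1, k2, n2) {
  n1 <- as.numeric(n1)
  n2 <- as.numeric(n2)
  N  <- n1 + n2
  R1 <- k1 + k2
  term <- k1 * (n2 - k2) - k2 * (n1 - k1)  # same sign as k1/n1 - k2/n2
  x2 <- N * term^2 / (R1 * (N - R1) * n1 * n2)
  sign(term) * sqrt(x2)
}

# Invented example: the type is relatively rarer in the first corpus
z <- signed_root(10, 1000, 30, 1000)

# One-sided p-values from the standard normal distribution
p_less    <- pnorm(z)                      # alternative: k1/n1 < k2/n2
p_greater <- pnorm(z, lower.tail = FALSE)  # alternative: k1/n1 > k2/n2

# The two-sided p-value agrees with the chi-squared test on z^2 (df = 1)
p_two_sided <- 2 * pnorm(-abs(z))
print(all.equal(p_two_sided, pchisq(z^2, df = 1, lower.tail = FALSE)))
```

The sign carries the direction of the difference, which the X^2 value alone discards.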

Value

The chi-squared statistic X^2 corresponding to the specified data (or a vector of X^2 values). Under the null hypothesis of equal proportions, this statistic has a chi-squared distribution with df=1. If one.sided=TRUE, the signed square root of X^2 is returned instead.
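
Because the reference distribution has df=1, returned values can be converted to p-values with base R's pchisq. The sketch below uses made-up X^2 values standing in for the output of chisq; the package's chisq.pval, listed under See Also, is presumably the more convenient route.

```r
# Made-up X^2 values standing in for the result of a vectorised chisq() call
x2 <- c(0.5, 3.84, 10.83)

# Upper-tail probabilities of the chi-squared distribution with df = 1
p <- pchisq(x2, df = 1, lower.tail = FALSE)

# Critical value for a two-sided test at the 5% level (about 3.84)
crit <- qchisq(0.95, df = 1)

# Flag comparisons whose statistic exceeds the critical value
significant <- x2 > crit
print(data.frame(x2 = x2, p = p, significant = significant))
```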

Author(s)

Stefan Evert

References

Evert, Stefan (2004). The Statistics of Word Cooccurrences: Word Pairs and Collocations. Ph.D. thesis, Institut für maschinelle Sprachverarbeitung, University of Stuttgart. Published in 2005, URN urn:nbn:de:bsz:93-opus-23714. Available from http://www.collocations.de/phd.html.

See Also

chisq.pval, chisq.test, cont.table