This function computes Pearson's chi-squared statistic (often written as X^2) for frequency comparison data, with or without Yates' continuity correction. The implementation is based on the formula given by Evert (2004, 82).
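For orientation, the statistic for a 2x2 frequency comparison can be written in the usual textbook form below. This is a sketch of the standard formula rather than a quotation from Evert (2004); the cell labels O_ij, the marginals R_i and C_j, and the total N are notation assumed here for illustration.

```latex
% Assumed notation: O_{11} = k_1, O_{12} = k_2,
%                   O_{21} = n_1 - k_1, O_{22} = n_2 - k_2
% Row sums R_1 = O_{11} + O_{12}, R_2 = O_{21} + O_{22};
% column sums C_1 = n_1, C_2 = n_2; total N = n_1 + n_2.
\[
  X^2 = \frac{N \,\bigl( O_{11} O_{22} - O_{12} O_{21} \bigr)^2}
             {R_1 R_2 C_1 C_2}
\]
% With Yates' continuity correction (numerator term floored at zero):
\[
  X^2_{\text{Yates}} =
    \frac{N \,\max\bigl( \lvert O_{11} O_{22} - O_{12} O_{21} \rvert - N/2,\; 0 \bigr)^2}
         {R_1 R_2 C_1 C_2}
\]
```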
k1 
frequency of a type in the first corpus (or an integer vector of type frequencies) 
n1 
the sample size of the first corpus (or an integer vector specifying the sizes of different samples) 
k2 
frequency of the type in the second corpus (or an integer vector of type frequencies, in parallel to k1)
n2 
the sample size of the second corpus (or an integer vector specifying the sizes of different samples, in parallel to n1)
correct 
if TRUE, apply Yates' continuity correction
one.sided 
if TRUE, return the signed square root of X^2 as the statistic for a one-sided test
The X^2 values returned by this function are identical to those computed by chisq.test. Unlike the latter, chisq accepts vector arguments so that a large number of frequency comparisons can be carried out with a single function call.
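The agreement with chisq.test can be checked in base R without the package installed. The sketch below uses a hypothetical helper, chisq_sketch, which is an illustrative re-implementation of the textbook 2x2 formula and not the package's own code; the counts are made up.

```r
# Hypothetical helper (NOT the package's implementation): vectorised
# Pearson X^2 for 2x2 frequency comparisons, optionally Yates-corrected.
chisq_sketch <- function(k1, n1, k2, n2, correct = TRUE) {
  # Convert to double so products of large counts cannot overflow integers
  k1 <- as.numeric(k1); n1 <- as.numeric(n1)
  k2 <- as.numeric(k2); n2 <- as.numeric(n2)
  N  <- n1 + n2            # total sample size
  R1 <- k1 + k2            # occurrences of the type in both corpora
  R2 <- N - R1             # all other tokens
  term <- abs(k1 * (n2 - k2) - k2 * (n1 - k1))
  if (correct) term <- pmax(term - N / 2, 0)  # Yates' correction, floored at 0
  N * term^2 / (R1 * R2 * n1 * n2)
}

# Illustrative counts (made up): three comparisons in one call
k1 <- c(10, 50, 3);  n1 <- c(1000, 1000, 1000)
k2 <- c(20, 45, 12); n2 <- c(1500, 1200, 900)
x2 <- chisq_sketch(k1, n1, k2, n2)

# Reference values: one chisq.test call per comparison
ref <- mapply(function(a, b, c, d) {
  tab <- matrix(c(a, b - a, c, d - c), nrow = 2)  # columns = corpora
  unname(chisq.test(tab, correct = TRUE)$statistic)
}, k1, n1, k2, n2)

print(all.equal(x2, ref))
```

The vectorised form needs a single pass over the input vectors, whereas chisq.test has to be called once per contingency table.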
The one-sided test statistic (for one.sided=TRUE) is the signed square root of X^2. It is positive for k_1/n_1 > k_2/n_2 and negative for k_1/n_1 < k_2/n_2. Note that this statistic has a standard normal distribution rather than a chi-squared distribution under the null hypothesis of equal proportions.
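As a minimal base-R sketch of the signed statistic described above, the following computes X^2 from the textbook 2x2 formula (an assumption of this example, not the package's code), takes the signed square root, and derives p-values from the standard normal distribution. All counts and variable names are made up for illustration.

```r
# Illustrative counts (made up for this sketch)
k1 <- c(10, 50, 3);  n1 <- c(1000, 1000, 1000)
k2 <- c(20, 45, 12); n2 <- c(1500, 1200, 900)

# Yates-corrected X^2 from the textbook 2x2 formula
N    <- n1 + n2
term <- pmax(abs(k1 * (n2 - k2) - k2 * (n1 - k1)) - N / 2, 0)
x2   <- N * term^2 / ((k1 + k2) * (N - k1 - k2) * n1 * n2)

# Signed square root: positive when the type is relatively more
# frequent in the first corpus, negative when it is less frequent
z <- sign(k1 / n1 - k2 / n2) * sqrt(x2)

# One-sided p-value for the alternative k1/n1 > k2/n2
p_upper <- pnorm(z, lower.tail = FALSE)

# The two-sided normal p-value matches the chi-squared p-value with df = 1
p_two <- 2 * pnorm(-abs(z))
print(all.equal(p_two, pchisq(x2, df = 1, lower.tail = FALSE)))
```

Because the square of a standard normal variable follows a chi-squared distribution with one degree of freedom, the two-sided p-values from both routes coincide.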
The chi-squared statistic X^2 corresponding to the specified data (or a vector of X^2 values). This statistic has a chi-squared distribution with df=1 under the null hypothesis of equal proportions.
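Since the statistic has one degree of freedom under the null hypothesis, a p-value can be obtained from the upper tail of the chi-squared distribution. The sketch below uses only base R (chisq.test and pchisq) with made-up counts.

```r
# Illustrative 2x2 table (made-up counts): columns are the two corpora,
# rows are "occurrences of the type" and "all other tokens"
tab <- matrix(c(10, 990, 20, 1480), nrow = 2)

res <- chisq.test(tab, correct = TRUE)
x2  <- unname(res$statistic)

# Upper-tail probability of a chi-squared distribution with df = 1
p <- pchisq(x2, df = 1, lower.tail = FALSE)

# Agrees with the p-value reported by chisq.test itself
print(all.equal(p, res$p.value))
```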
Stefan Evert
Evert, Stefan (2004). The Statistics of Word Cooccurrences: Word Pairs and Collocations. Ph.D. thesis, Institut für maschinelle Sprachverarbeitung, University of Stuttgart. Published in 2005, URN urn:nbn:de:bsz:93-opus-23714. Available from http://www.collocations.de/phd.html.
chisq.pval, chisq.test, cont.table