CollocateR is a package for the statistical programming language R. Albeit imperfectly, the package increasingly uses functions and workflows from the tidyverse and tidytext packages.
CollocateR serves a simple purpose. It finds the collocates of keywords in context in text files and calculates their significance, based on tests set out in Barnbrook et al.'s Collocation: Applications and Implications (Palgrave, 2013) and formulae explained on the British National Corpus home page.
- ~~save_collocates: Return a list containing a tokenised version of the original document, a record of the node in original and hashed format, lists of left and right collocate locations, and document word_length.~~
- get_freqs: a frequency count for collocates, both in context and in the document as a whole.
- pmi: a 'pointwise mutual information' significance test that compares the probability of the node and collocate occurring together with the probability of their occurring independently.
- npmi: as above, but normalised so that all results fall between 1 (perfect collocation) and -1 (the terms never collocate).
- z-score: a significance test comparing the probability of a collocate occurring near the node with its probability of occurring across the text.
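For reference, the three measures listed above can be written out as follows. This is a sketch using the standard textbook definitions, not a transcription of the package's code. The symbols are assumptions introduced here: $N$ is the number of tokens in the document, $f(n)$ and $f(c)$ are the document frequencies of the node and the collocate, $f(n,c)$ is the number of times the collocate appears inside the node's context window, and $W$ is the total number of window positions around all occurrences of the node. The z-score is shown in its binomial form; the package's exact estimators, log base and window handling may differ.

```latex
% Probabilities estimated from counts (assumed notation, see lead-in)
P(n) = \frac{f(n)}{N}, \qquad
P(c) = \frac{f(c)}{N}, \qquad
P(n,c) = \frac{f(n,c)}{N}

% Pointwise mutual information: joint probability vs. independence
\mathrm{PMI}(n,c) = \log_2 \frac{P(n,c)}{P(n)\,P(c)}

% Normalised PMI: 1 for perfect collocation, 0 for independence,
% tending to -1 as the terms never co-occur
\mathrm{NPMI}(n,c) = \frac{\mathrm{PMI}(n,c)}{-\log_2 P(n,c)}

% z-score: observed co-occurrences vs. the count E expected if the
% collocate were spread evenly across the text
p = \frac{f(c)}{N}, \qquad
E = p\,W, \qquad
z = \frac{f(n,c) - E}{\sqrt{E\,(1 - p)}}
```

Read this way, PMI and NPMI are positive when the node and collocate co-occur more often than independence would predict, and the z-score expresses the same surplus in units of standard deviations.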
README generated with readme2tex.