wordcounts_remove_rare | R Documentation |
Filter out the words in a wordcounts dataframe whose overall frequency is below a threshold.
wordcounts_remove_rare(counts, n)
counts |
The dataframe from |
n |
The maximum rank to keep: all words with frequency rank below
|
It's often useful to prune documents of one-off words (many of which are OCR errors) before building MALLET instances. This is a convenience function for doing so.
A filtered word-counts dataframe. Because of ties, do not expect it to have exactly n distinct words.
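The rank-based filtering described above can be illustrated with a minimal base-R sketch. This is not the package's implementation: the helper name `remove_rare_sketch`, the column names `id`, `word`, and `weight`, the toy data, the inclusive cutoff (`rank <= n`), and the tie-handling rule are all assumptions made for illustration.

```r
# Toy word-counts data frame. The column names (id, word, weight) are
# assumed for this sketch and may differ from the real input.
counts <- data.frame(
  id = c("d1", "d1", "d1", "d2", "d2", "d3"),
  word = c("the", "whale", "tbe", "the", "whale", "the"),
  weight = c(5, 2, 1, 4, 1, 3),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep only words whose overall frequency rank is
# at most n. Not the package function itself.
remove_rare_sketch <- function(counts, n) {
  # Total frequency of each word across all documents
  totals <- tapply(counts$weight, counts$word, sum)
  # Rank by descending total; tied words all receive the smallest rank
  ranks <- rank(-totals, ties.method = "min")
  keep <- names(totals)[ranks <= n]
  counts[counts$word %in% keep, , drop = FALSE]
}

# Drops the one-off "tbe" (a plausible OCR error for "the")
filtered <- remove_rare_sketch(counts, 2)
unique(filtered$word)
```

Under this tie rule, words tied at the cutoff rank are all retained, which is one way the result can end up with a number of distinct words other than n.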