Stem corpus with annotation.
Given a VCorpus of original text, returns a VCorpus of stemmed text with '+' appended to all stemmed words.
TRUE means display a progress bar so that progress can be monitored.
This code is not optimized and is expensive to run. First, the stemmer chops words. Then the method makes one pass that adds a "+" to all chopped words and builds a list of stems. Finally, it makes a second pass that adds a "+" to any word that already matches a stem in that list but had no suffix to chop.
So, e.g., goblins and goblin will both be "goblin+".
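The passes described above can be sketched on a plain character vector. This is only an illustration under stated assumptions, not the package's implementation: the helper name annotate_stems is hypothetical, it works on individual words rather than on a VCorpus, and it calls SnowballC::wordStem directly instead of going through tm.

```r
library(SnowballC)  # provides wordStem()

# Hypothetical helper for illustration only; not part of the package.
annotate_stems <- function(words) {
  # Pass 1: stem every word.
  stemmed <- wordStem(words, language = "english")

  # Words the stemmer changed, and the list of stems they produced.
  chopped <- stemmed != words
  stems <- unique(stemmed[chopped])

  # Pass 2: mark chopped words, plus unchanged words that match a known stem.
  mark <- chopped | (stemmed %in% stems)
  ifelse(mark, paste0(stemmed, "+"), stemmed)
}

result <- annotate_stems(c("goblins", "goblin", "the"))
print(result)
```

Here "goblin" is marked only because "goblins" appears in the same input and contributes the stem; a word whose stem never arises from chopping, such as "the", is left unmarked.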
Code based on code from Kevin Wu, UC Berkeley Undergrad Thesis 2014.
Requires, via the tm package, the SnowballC package.