inst/tag/shiny/help/data/filter.md

Filtering and Excluding

Stopwords

"Stopwords" are words in a natural language that are usually filtered out of a corpus before text mining because they occur very frequently and carry little meaning on their own. There is no hard rule as to what is and is not a stopword, and the choice can depend on context. Articles (such as "the" and "a") and prepositions are examples that people generally agree are stopwords.

For example, if an analyst is interested in finding words that are frequently used together in an English-language corpus, they may well find that "the" has the highest correlation with the word of interest. This would be a true but useless fact, and removing English stopwords from the corpus first would make the analyst's job much easier.
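
The effect described above can be illustrated with a minimal Python sketch. It is not how TAG implements filtering; the stopword list, the sample documents, and the function names are all hypothetical, and simple per-document co-occurrence counts stand in for a real correlation measure.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list; real lists are much longer
# and should be chosen to suit the corpus.
STOPWORDS = {"the", "a", "an", "of", "in", "on", "to", "and", "is"}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def remove_stopwords(tokens, stopwords=STOPWORDS):
    """Return only the tokens that are not in the stopword set."""
    return [t for t in tokens if t not in stopwords]


def cooccurring_words(documents, target, stopwords=frozenset()):
    """Count, per word, the documents in which it appears alongside `target`."""
    counts = Counter()
    for doc in documents:
        tokens = remove_stopwords(tokenize(doc), stopwords)
        if target in tokens:
            # Use a set so each word is counted once per document.
            counts.update(t for t in set(tokens) if t != target)
    return counts


# Made-up sample corpus, for illustration only.
documents = [
    "The gateway submits the job to the cluster",
    "The cluster runs the job in the queue",
    "A gateway is a portal to the cluster",
]

raw = cooccurring_words(documents, "cluster")
filtered = cooccurring_words(documents, "cluster", STOPWORDS)

print(raw.most_common(3))
print(filtered.most_common(3))
```

Without filtering, "the" co-occurs with "cluster" more often than any other word; with the stopwords removed, only content words such as "job" and "gateway" remain in the counts.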

Exclude List



XSEDEScienceGateways/TAG documentation built on May 9, 2019, 11:05 p.m.