This function takes an existing stopword list and finds words that are used frequently but appear infrequently in keyphrases. It then adds these non-keywords to your stopword list. A word is selected if it occurs adjacent to keywords more often than it occurs within keywords.
fortify_stopwords(x, stopwords = smart_stop_words(), n = 0.97,
  sample_frac = 1)
x | The vector of texts used to generate additional stopwords.
stopwords | The list of stopwords you want to enrich.
n | The proportion of the total number of words to consider when looking for common words. It ranges from 0 to 1 and should be set relatively high so that only commonly used words are added to the stop list.
sample_frac | The fraction of the documents in x to consider. Provided for large datasets.
Returns a vector of fortified stopwords
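A minimal usage sketch, based only on the signature shown in the usage section. The sample texts are invented for illustration. The example assumes that the package exporting `fortify_stopwords()` and `smart_stop_words()` is already attached, and the guard skips the call if it is not.

```r
# Hypothetical sample corpus; replace with your own character vector of texts.
texts <- c(
  "The proposed method improves keyword extraction for short documents.",
  "We evaluate the proposed method on a corpus of news articles.",
  "Keyword extraction for news articles benefits from a tuned stopword list."
)

# Assumes the package providing fortify_stopwords() and smart_stop_words()
# has been attached with library(); otherwise the call is skipped.
if (exists("fortify_stopwords") && exists("smart_stop_words")) {
  fortified <- fortify_stopwords(
    texts,
    stopwords   = smart_stop_words(),  # the list to enrich (documented default)
    n           = 0.97,                # documented default
    sample_frac = 1                    # documented default: all documents in x
  )
  print(head(fortified))
}
```

With a large corpus, lowering `sample_frac` (for example to 0.1) should reduce the number of documents examined, as the argument description suggests.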