View source: R/Personal_Functions.R
Stopword_Maker | R Documentation |
This function finds the $N$ most frequently used words in a corpus. The result can serve as a list of stop words for pruning data sets before training.
Stopword_Maker(titles, cutoff = 20)
titles |
The documents to search for the most frequent words. |
cutoff |
The number $N$ of most frequently used words to keep as stop words. |
output |
A vector of the $N$ most frequent words. |
Travis Barton
test_set = c('this is a testset', 'I am searching for a list of words', 'I like turtles', 'A rocket would be a fast way of getting to work, but I do not think it is very practical')
res = Stopword_Maker(test_set, 4)
print(res)
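For readers who want to see the idea behind a top-$N$ word count, the following is a minimal base R sketch. It is not the source of Stopword_Maker: the helper name `top_words_sketch`, the lowercasing, and the split-on-non-letters tokenization are all assumptions made for illustration, and the package function may tokenize or break ties differently.

```r
# Hypothetical helper for illustration only; not the package's implementation.
top_words_sketch <- function(docs, cutoff = 20) {
  # Assumption: lowercase everything and split on any run of characters
  # that is not a letter or an apostrophe.
  tokens <- unlist(strsplit(tolower(docs), "[^a-z']+"))
  # Drop empty strings produced by leading separators.
  tokens <- tokens[nzchar(tokens)]
  # Count each word and sort from most to least frequent.
  counts <- sort(table(tokens), decreasing = TRUE)
  # Keep at most `cutoff` words.
  names(counts)[seq_len(min(cutoff, length(counts)))]
}

docs <- c("the cat sat", "the dog sat", "the bird flew")
top_words_sketch(docs, 2)
# "the" "sat"
```

Here "the" appears three times and "sat" twice, so they are the two most frequent words. When several words share the same count, which of them survive the cutoff depends on the sort order, so results near the cutoff should be checked by hand.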