ngrams | R Documentation |
Build n-grams from multiple text inputs and keep only the most frequent combinations (as many as set by top).
ngrams(text, ngram = c(2, 3), top = 10, stop_words = NULL, ...)
text | Character vector. |
ngram | Integer vector. Number of contiguous items (n) in each n-gram; pass several values, such as c(2, 3), to build n-grams of several sizes. |
top | Integer. Keep only this many of the most frequent n-grams. |
stop_words | Character vector. Words to exclude from text. Matching is on whole words: excluding "a" removes the standalone word "a" but leaves the letter "a" inside other words. |
... | Additional parameters passed to |
data.frame with ngrams and counters, sorted by frequency.
Other Text Mining: cleanText(), remove_stopwords(), replaceall(), sentimentBreakdown(), textCloud(), textFeats(), textTokenizer(), topics_rake()
# You must have the "tidytext" package installed to use this auxiliary function:
## Not run:
women <- read.csv("https://bit.ly/3mXJOOi")
x <- women$description
ngrams(x, ngram = c(2, 3), top = 3)
ngrams(x, ngram = 2, top = 6, stop_words = c("a", "is", "of", "the"))
## End(Not run)
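To make the roles of ngram, top, and stop_words concrete, here is a minimal base R sketch of n-gram counting. It is not the implementation of ngrams(): the helper name count_ngrams, the output column names (ngram, n), and the tokenization rule (lowercase, split on anything that is not a letter, digit, or apostrophe) are all assumptions made for illustration. It also assumes stop words are removed before the windows are built, so words on either side of a removed stop word become neighbours.

```r
# Hypothetical helper, not part of the package: count word n-grams in base R.
count_ngrams <- function(text, n = 2, top = 10, stop_words = NULL) {
  grams <- unlist(lapply(text, function(doc) {
    # Assumed tokenization: lowercase, split on non-alphanumeric characters
    words <- strsplit(tolower(doc), "[^a-z0-9']+")[[1]]
    words <- words[nzchar(words)]
    # Whole-word exclusion only: "a" is dropped, but "cat" keeps its "a"
    words <- words[!words %in% tolower(stop_words)]
    if (length(words) < n) return(character(0))
    # Slide a window of n consecutive words over the document
    starts <- seq_len(length(words) - n + 1)
    vapply(
      starts,
      function(i) paste(words[i:(i + n - 1)], collapse = " "),
      character(1)
    )
  }))
  if (length(grams) == 0) {
    return(data.frame(ngram = character(0), n = integer(0)))
  }
  # Tabulate and sort by frequency, most frequent first
  counts <- sort(table(grams), decreasing = TRUE)
  out <- data.frame(
    ngram = names(counts),
    n = as.integer(counts),
    stringsAsFactors = FALSE
  )
  head(out, top)
}

docs <- c("the cat sat on the mat", "the cat ate the fish")
count_ngrams(docs, n = 2, top = 3)
count_ngrams(docs, n = 2, top = 3, stop_words = "the")
```

In the first call the bigram "the cat" appears in both documents and so ranks first with a count of 2. In the second call "the" is removed before windowing, so no bigram contains it and every remaining bigram has a count of 1.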