View source: R/frequent_ngrams.R
Find important ngram (2-3) collocations. Wraps collocations
to provide stopword filtering, min/max character limits, and stemming,
with a generic plot function.
text.var | A vector of character strings.
n | The number of rows to include.
gram.length | The length of ngram to generate (2-3).
stopwords | A vector of stopwords to exclude.
min.char | The minimum number of characters a word must be (including apostrophes) for inclusion.
max.char | The maximum number of characters a word must be (including apostrophes) for inclusion.
order.by | The name of the measure column to order by:
stem | logical. If
language | The stem language to use (see
... | Other arguments passed to
Returns a data.frame of terms and frequencies.
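To make the effect of stopwords, min.char, and max.char concrete, the following base R sketch counts adjacent word pairs after filtering. It is only an illustration of the idea, not the package's implementation: count_bigrams is a hypothetical helper, the frequency column name is an assumption, and it uses raw counts rather than the collocation measures that frequent_ngrams can order by.

```r
# Hypothetical helper (not part of the package): count adjacent word pairs
# after dropping stopwords and words outside the character limits.
count_bigrams <- function(text, stopwords = character(),
                          min.char = 1, max.char = Inf) {
    # Lowercase, then split on anything that is not a letter or apostrophe
    tokens <- strsplit(tolower(text), "[^a-z']+")
    bigrams <- unlist(lapply(tokens, function(words) {
        words <- words[nzchar(words)]
        keep <- !(words %in% stopwords) &
            nchar(words) >= min.char & nchar(words) <= max.char
        # Simplification: pairs are formed after filtering, so words that
        # were separated by a removed word become neighbours
        words <- words[keep]
        if (length(words) < 2) return(character())
        paste(head(words, -1), tail(words, -1))
    }))
    if (length(bigrams) == 0) {
        return(data.frame(collocation = character(), frequency = integer()))
    }
    counts <- sort(table(bigrams), decreasing = TRUE)
    # "frequency" is an assumed column name for this sketch
    data.frame(
        collocation = names(counts),
        frequency = as.integer(counts),
        stringsAsFactors = FALSE
    )
}

toy <- c("We will cut taxes now", "They will cut taxes later")
result <- count_bigrams(toy, stopwords = c("we", "they", "will"), min.char = 3)
print(result)
```

With this toy input, "cut taxes" is the only pair that occurs in both strings, so it sorts to the top of the result.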
## Not run:
x <- presidential_debates_2012[["dialogue"]]
frequent_ngrams(x)
frequent_ngrams(x, n = 50)
frequent_ngrams(x, stopwords = c(stopwords::stopwords("english"), "american", "governor"))
frequent_ngrams(x, gram.length = 3)
frequent_ngrams(x, gram.length = 3, stem = TRUE)
frequent_ngrams(x, order.by = "lambda")
plot(frequent_ngrams(x))
plot(frequent_ngrams(x, n = 40))
plot(frequent_ngrams(x, order.by = "lambda"))
plot(frequent_ngrams(x, gram.length = 3))
## End(Not run)
## Not run:
## ngram feature extraction
if (!require("pacman")) install.packages("pacman")
pacman::p_load(termco, dplyr, textshape, magrittr)
ngrams <- presidential_debates_2012 %$%
    frequent_ngrams(dialogue, n = 10) %>%
    pull(collocation) %>%
    as_term_list()
ngram_features <- presidential_debates_2012 %>%
    with(term_count(dialogue, person, ngrams)) %>%
    as_dtm()
ngram_features
## tidied features
ngram_features %>%
    textshape::tidy_dtm()
## End(Not run)