List the most (or least) frequently occurring features in a dfm, either as a whole or separated by document.
topfeatures(
  x,
  n = 10,
  decreasing = TRUE,
  scheme = c("count", "docfreq"),
  groups = NULL
)
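x | the object whose features will be returned
n | how many top features should be returned
decreasing | if TRUE, return the n most frequent features; otherwise return the n least frequent features
scheme | one of "count" (total frequency of each feature) or "docfreq" (number of documents in which each feature occurs)
groups | either: a character vector containing the names of document variables to be used for grouping; or a factor or object that can be coerced into a factor equal in length or rows to the number of documents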
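The difference between the two scheme values and the effect of decreasing can be illustrated without the package. The following is a minimal base-R sketch, not the package's implementation: a plain matrix stands in for a dfm, and the toy counts and names (m, count_freq, doc_freq) are made up for illustration.

```r
# Toy document-feature counts: rows are documents, columns are features.
# A plain base-R matrix stands in for a dfm here (illustrative assumption).
m <- matrix(
  c(3, 0, 1,
    2, 0, 0,
    4, 5, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("doc1", "doc2", "doc3"), c("the", "tax", "peace"))
)

# scheme = "count": total occurrences of each feature across all documents
count_freq <- colSums(m)

# scheme = "docfreq": number of documents in which each feature appears
doc_freq <- colSums(m > 0)

# decreasing = TRUE ranks the most frequent first; head(, n) keeps the top n
top_count   <- head(sort(count_freq, decreasing = TRUE), 2)
top_docfreq <- head(sort(doc_freq, decreasing = TRUE), 2)

# decreasing = FALSE ranks the least frequent first
least_count <- head(sort(count_freq, decreasing = FALSE), 2)
```

Here "tax" ranks second by count because it occurs five times, but it appears in only one document, so under the document-frequency view it is no more prominent than "peace".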
A named numeric vector of feature counts, where the names are the feature labels, or a list of these if groups is given.
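The shape of the grouped return value can be sketched in base R as well. This is a hedged illustration only: it assumes counts are summed within each group before ranking, and the matrix m and the grouping factor grp are hypothetical stand-ins for a dfm and its document variable.

```r
# Toy counts: rows are documents, columns are features (stand-in for a dfm)
m <- matrix(
  c(3, 0, 1,
    2, 0, 0,
    4, 5, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("doc1", "doc2", "doc3"), c("the", "tax", "peace"))
)

# Hypothetical grouping: doc1 and doc2 belong to "A", doc3 to "B"
grp <- factor(c("A", "A", "B"))

# Sum feature counts within each group (one row per group)
grouped <- rowsum(m, group = grp)

# Rank features separately for each group, keeping the top 2;
# the result is a list of named numeric vectors, one element per group
group_names <- rownames(grouped)
top_by_group <- lapply(
  setNames(group_names, group_names),
  function(g) head(sort(grouped[g, ], decreasing = TRUE), 2)
)
```

Each list element is a named numeric vector of the same form as the ungrouped result, so top_by_group$A and top_by_group$B can be inspected or compared directly.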
dfmat1 <- corpus_subset(data_corpus_inaugural, Year > 1980) %>%
    dfm(remove_punct = TRUE)
dfmat2 <- dfm_remove(dfmat1, stopwords("english"))

# most frequent features
topfeatures(dfmat1)
topfeatures(dfmat2)

# least frequent features
topfeatures(dfmat2, decreasing = FALSE)

# top features of individual documents
topfeatures(dfmat2, n = 5, groups = docnames(dfmat2))

# grouping by president last name
topfeatures(dfmat2, n = 5, groups = "President")

# features by document frequencies
tail(topfeatures(dfmat1, scheme = "docfreq", n = 200))