plot_tfidf_ngrams: Plot the _n_-grams with the highest TF-IDFs


View source: R/plot_tfidf_ngrams.R

Description

Plots the n-grams with the highest TF-IDF values as a bar chart.
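For readers unfamiliar with the metric, the base R sketch below works through TF-IDF on a tiny made-up corpus. It uses the common textbook definition (term frequency times the natural log of inverse document frequency) and is not taken from this package's source, so the exact computation in calc_tfidf_ngrams may differ. The object names (docs, idf, tfidf_a) and the toy bigrams are illustrative assumptions.

```r
# Two toy "documents", each already split into bigrams (made-up data)
docs <- list(
  a = c("miss woodhouse", "miss woodhouse", "of the"),
  b = c("mr darcy", "of the")
)

# Term frequency in document a: count of each bigram / total bigrams
tf_a_table <- table(docs$a) / length(docs$a)
tf_a <- setNames(as.numeric(tf_a_table), names(tf_a_table))

# Document frequency: in how many documents does each bigram occur?
terms <- unique(unlist(docs))
doc_freq <- sapply(terms, function(term) {
  sum(sapply(docs, function(d) term %in% d))
})

# Inverse document frequency (natural log, an assumption of this sketch)
idf <- log(length(docs) / doc_freq)

# TF-IDF for document a: bigrams shared by every document score 0
tfidf_a <- tf_a * idf[names(tf_a)]
print(sort(tfidf_a, decreasing = TRUE))
```

A bigram that appears in both documents ("of the") gets a score of zero, while a bigram specific to one document ("miss woodhouse") scores highest, which is why the top TF-IDF n-grams tend to characterise a single class of text.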

Usage

plot_tfidf_ngrams(tfidf_ngrams, title = NULL)

Arguments

tfidf_ngrams

A data frame from calc_tfidf_ngrams.

title

Optional plot title. Defaults to NULL.

Value

A ggplot object (a bar chart drawn with ggplot2::geom_col).
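Because the return value is an ordinary ggplot object, it can in principle be extended with further ggplot2 layers. The sketch below illustrates the idea on a hand-built bar chart rather than on the function's real output. The data values and the column names ngram and tf_idf are made up for illustration and may not match what calc_tfidf_ngrams returns.

```r
library(ggplot2)

# Made-up toy data; column names `ngram` and `tf_idf` are assumptions
toy <- data.frame(
  ngram = c("miss woodhouse", "frank churchill", "miss fairfax"),
  tf_idf = c(0.004, 0.003, 0.002)
)

# A horizontal bar chart built with geom_col, bars ordered by TF-IDF
p <- ggplot(toy, aes(x = tf_idf, y = reorder(ngram, tf_idf))) +
  geom_col() +
  labs(x = "TF-IDF", y = NULL, title = "Toy example")

# Any ggplot object accepts additional layers, themes, or labels
p <- p + theme_minimal()
```

The same pattern (adding a theme or labels with +) should apply to the object returned by plot_tfidf_ngrams, assuming it is a standard ggplot.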

Examples

library(experienceAnalysis)
library(magrittr) # Provides %>% in case it is not already attached

books <- janeaustenr::austen_books() # Jane Austen books

# Collapse only the text column. Pasting whole rows would also paste the
# book factor's integer codes and produce spurious bigrams such as "4 4".
emma <- paste(books$text[books$book == "Emma"], collapse = " ")
pp <- paste(books$text[books$book == "Pride & Prejudice"], collapse = " ")

# Make data frame with books Emma and Pride & Prejudice
x <- data.frame(
  text = c(emma, pp),
  book = c("Emma", "Pride & Prejudice")
)

# Compute the top bigrams for Emma and plot them
calc_tfidf_ngrams(x, target_col_name = "book", text_col_name = "text",
                  filter_class = "Emma",
                  ngrams_type = "Bigrams",
                  number_of_ngrams = 5
) %>%
  plot_tfidf_ngrams(title = "Bigrams with highest TF-IDFs in Emma")

CDU-data-science-team/experienceAnalysis documentation built on Dec. 17, 2021, 12:53 p.m.