word_freqs_by_category: Plot or get top n word frequencies using English grammar in a...

Description Usage Arguments See Also Examples

Description

Tokenises, lemmatises, POS-tags and dependency-parses raw text using udpipe as a backend, then counts word occurrences within each level of a categorical variable. Either plots the top words or returns the tokenised data frame.

Usage

word_freqs_by_category(df, text_col, categories_col,
  number_of_words_to_plot = 10, plot = TRUE, grammer_phrase = "NOUN",
  word_type = lemma)

Arguments

df

a data.frame or a tibble

text_col

the name of the text column within df

categories_col

the name of the factor/categorical column whose levels define the categories for the word counts

number_of_words_to_plot

the number of words/terms to plot within each level of categories_col. The top 10 words in each category are plotted by default

plot

if TRUE (the default), returns a ggplot2 plot; if FALSE, returns the tokenised/lemmatised data frame created by the udpipe model

grammer_phrase

the universal part-of-speech tag to filter on, e.g. NOUN, VERB, ADJ, PRON, AUX or NUM. Defaults to "NOUN". More info here: https://polyglot.readthedocs.io/en/latest/POS.html

word_type

whether to plot token or lemma frequencies; one of token or lemma (lemma by default)
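The counting step behind this function can be illustrated with a minimal base-R sketch. It assumes the text has already been tokenised (e.g. by a udpipe model) into one word per row alongside its category; the object and column names (`tokens`, `category`, `lemma`, `top_n_words`) are illustrative, not the package's internals.

```r
# Hypothetical pre-tokenised input: one lemma per row with its category.
tokens <- data.frame(
  category = c("Promoter", "Promoter", "Detractor", "Detractor", "Detractor"),
  lemma    = c("service",  "service",  "wait",      "wait",      "service")
)

# Count lemma frequencies per category and keep the n most frequent
# lemmas within each category (mirroring number_of_words_to_plot).
top_n_words <- function(tokens, n = 10) {
  counts <- aggregate(freq ~ category + lemma,
                      data = transform(tokens, freq = 1), FUN = sum)
  do.call(rbind, lapply(split(counts, counts$category), function(d) {
    head(d[order(-d$freq), ], n)
  }))
}

top_n_words(tokens, n = 2)
```

In the package itself these per-category counts would then either be returned as a data frame or drawn as a faceted ggplot2 bar chart, depending on the `plot` argument.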

See Also

word_frequencies_by_category

Examples

## Not run: 
data("text_data")
word_freqs_by_category(verbatim, text_col = text, categories_col = NPS_RATING)
word_freqs_by_category(verbatim, text_col = text, categories_col = NPS_RATING, word_type = token)
word_freqs_by_category(verbatim, text, Qtr, number_of_words_to_plot = 20)

## End(Not run)

fahadshery/textsummary documentation built on May 6, 2019, 7:02 p.m.