tf_idf_by_category: tf-idf analysis of text verbatim, segmented by categorical...


Description

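Performs a tf-idf analysis of a text column, segmented by a categorical variable, and plots the top n terms for each level of that variable. Because tf-idf down-weights terms shared across levels, the result highlights the terms that distinguish one category from another, which makes it particularly useful for comparing how two categories differ.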
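To make the scoring concrete, here is a minimal base R sketch of tf-idf computed per category. It is an illustration under stated assumptions, not the package's implementation: the toy data, the column names `category` and `text`, and the simple letter-based tokenizer are all hypothetical. Each category is treated as one "document".

```r
# Toy data: one text per row, each tagged with a category (hypothetical values)
df <- data.frame(
  category = c("promoter", "promoter", "detractor"),
  text     = c("fast delivery", "fast friendly service", "slow delivery"),
  stringsAsFactors = FALSE
)

# Split every text into lower-case words and keep the category alongside
words  <- strsplit(tolower(df$text), "[^a-z]+")
tokens <- data.frame(
  category = rep(df$category, lengths(words)),
  word     = unlist(words),
  stringsAsFactors = FALSE
)

# Count each word within each category (categories act as "documents")
counts <- as.data.frame(table(category = tokens$category, word = tokens$word),
                        stringsAsFactors = FALSE)
counts <- counts[counts$Freq > 0, ]

# tf: share of the category's words taken up by this word
counts$tf <- counts$Freq / ave(counts$Freq, counts$category, FUN = sum)

# idf: log(number of categories / number of categories containing the word)
n_categories <- length(unique(counts$category))
counts$idf   <- log(n_categories / ave(counts$Freq, counts$word, FUN = length))

counts$tf_idf <- counts$tf * counts$idf
counts[order(counts$category, -counts$tf_idf), ]
```

In this toy example "delivery" appears in both categories, so its idf (and therefore its tf-idf) is zero, while "fast" and "slow" score highest in their respective categories.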

Usage

tf_idf_by_category(df, text_col, categories_col, number_of_words = 1,
  number_of_words_to_plot = 10, clean_text = FALSE, plot = TRUE)

Arguments

df

a data frame or tibble.

text_col

the name of the text column within df

categories_col

the name of the factor/categorical column in df used to segment the text, i.e. the facets of the plot

number_of_words

size of the terms to analyse within each category: 1 for single words, 2 for bigrams, 3 for trigrams. Defaults to 1 (single words)

number_of_words_to_plot

how many terms to plot within each level of categories_col. Defaults to 10 (the top 10 terms in each category)

clean_text

pre-process the text? FALSE by default. When TRUE, lemmatizes the words and removes extra spaces before counting

plot

return a ggplot2 plot rather than a data frame? TRUE by default
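The single word, bigram and trigram options of number_of_words can be pictured with a small base R sketch. `make_ngrams` is a hypothetical helper written only for illustration; it is not part of the package, and the package may tokenize differently.

```r
# Hypothetical helper: build n-word terms from a single text
make_ngrams <- function(text, n = 1) {
  # Lower-case, trim and split on whitespace
  words <- strsplit(tolower(trimws(text)), "\\s+")[[1]]
  # A text shorter than n words yields no terms
  if (length(words) < n) return(character(0))
  # Slide a window of n words across the text
  starts <- seq_len(length(words) - n + 1)
  vapply(starts, function(i) paste(words[i:(i + n - 1)], collapse = " "),
         character(1))
}

make_ngrams("The delivery was very fast", n = 1)  # single words
make_ngrams("The delivery was very fast", n = 2)  # bigrams
make_ngrams("The delivery was very fast", n = 3)  # trigrams
```

Longer terms are rarer, so with bigrams or trigrams more of the top terms tend to be unique to a single category.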

See Also

bind_tf_idf
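For orientation, a comparable analysis can be sketched directly with `tidytext::bind_tf_idf()` and dplyr. This is an assumption about the general approach, not a copy of what tf_idf_by_category does internally; the toy data and the column names `segment` and `text` are hypothetical, and the sketch requires the dplyr (1.0.0 or later) and tidytext packages.

```r
library(dplyr)
library(tidytext)

# Toy data; column names are hypothetical
df <- tibble(
  segment = c("Q1", "Q1", "Q2"),
  text    = c("fast delivery", "fast friendly service", "slow delivery")
)

top_terms <- df %>%
  unnest_tokens(word, text) %>%        # one row per lower-cased word
  count(segment, word) %>%             # word counts per category, in column n
  bind_tf_idf(word, segment, n) %>%    # adds tf, idf and tf_idf columns
  group_by(segment) %>%
  slice_max(tf_idf, n = 10) %>%        # top terms per category (ties kept)
  ungroup()

top_terms
```

For bigrams, the tokenizing step would become `unnest_tokens(bigram, text, token = "ngrams", n = 2)`, with `bigram` replacing `word` in the later steps.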

Examples

## Not run: 
data("text_data")
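# Compare terms across NPS rating groups
tf_idf_by_category(verbatim, text_col = text, categories_col = NPS_RATING)
# Compare terms across quarters
tf_idf_by_category(verbatim, text_col = text, categories_col = Qtr)
# Trigrams per quarter on pre-processed text; arguments are named because
# the positional order is (df, text_col, categories_col)
tf_idf_by_category(verbatim, text_col = text, categories_col = Qtr,
                   number_of_words = 3, clean_text = TRUE)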

## End(Not run)

fahadshery/textsummary documentation built on May 6, 2019, 7:02 p.m.