Determine the most common n-grams used in a column of text responses, optionally broken down by one or more demographic columns.
data | dataframe or tibble with a row per survey response
column | name of a character column in the data frame to be tabulated
... | optional column(s) to split into groups
words | number of words in each n-gram to return (2 for bigrams, 3 for trigrams, ...), defaults to 2 (bigrams)
filter_word | optional word to filter results by (i.e. only show n-grams containing this word)
remove | optional vector of words to exclude (i.e. remove all n-grams containing at least one of these words)
n | number of n-grams to show for each group, defaults to 3
min | number indicating the minimum number of times a word needs to appear for it to be included in output, defaults to 3
stop_thresh | numeric indicating the threshold to remove stopwords (i.e. maximum proportion of stopwords to words allowed). 1 includes all n-grams regardless of stopwords, 0 excludes all n-grams containing one or more stopwords. Defaults to 0.7.
proportion | logical indicating whether to include the proportion of responses containing this n-gram, defaults to FALSE
pretty | one of 'no', 'plot' or 'return'. Defaults to 'no'. 'plot' will end the function call by applying the prettify() function to the output with plot = TRUE. 'return' will apply the prettify() function with plot = FALSE.
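The stop_thresh rule can be illustrated with a minimal base R sketch. This is not the package's implementation: the helpers stopword_share() and keep_ngram() and the three-word stopword list are hypothetical, and the comparison (keep an n-gram when its share of stopwords is at most stop_thresh) is an assumption inferred from the description of the 0 and 1 endpoints.

```r
# Hypothetical helper (not part of the documented package):
# share of the words in one n-gram that are stopwords.
stopword_share <- function(ngram, stopwords) {
  tokens <- strsplit(ngram, " ", fixed = TRUE)[[1]]
  mean(tokens %in% stopwords)
}

# Assumed rule: keep an n-gram when its stopword share does not exceed stop_thresh.
keep_ngram <- function(ngram, stopwords, stop_thresh = 0.7) {
  stopword_share(ngram, stopwords) <= stop_thresh
}

# Toy stopword list, for illustration only.
stopwords <- c("the", "of", "a")
ngrams <- c("of the", "the teacher", "great teacher")

# Default threshold of 0.7: drops bigrams made up entirely of stopwords.
kept_default <- ngrams[vapply(ngrams, keep_ngram, logical(1),
                              stopwords = stopwords, USE.NAMES = FALSE)]

# Threshold of 0: drops any n-gram containing a stopword.
kept_strict <- ngrams[vapply(ngrams, keep_ngram, logical(1),
                             stopwords = stopwords, stop_thresh = 0,
                             USE.NAMES = FALSE)]

# Threshold of 1: keeps everything.
kept_all <- ngrams[vapply(ngrams, keep_ngram, logical(1),
                          stopwords = stopwords, stop_thresh = 1,
                          USE.NAMES = FALSE)]
```

Under this reading, a bigram can only have a stopword share of 0, 0.5 or 1, so the default of 0.7 keeps bigrams with at most one stopword.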
Table of n-grams with the number of times they appear
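To make the shape of the result concrete, here is a rough base R sketch of counting bigrams across a column of responses and keeping the most frequent ones. It only approximates the idea behind the words, min and n arguments; the data, the column names and the make_ngrams() helper are invented for illustration, and the package's actual tokenisation and output format may differ.

```r
# Toy survey data; the column names are made up for this example.
responses <- data.frame(
  role = c("teacher", "teacher", "parent"),
  comment = c("More planning time please",
              "Need more planning time",
              "More planning time"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: split one response into overlapping n-grams of `words` words.
make_ngrams <- function(text, words = 2) {
  tokens <- strsplit(tolower(text), "[^a-z']+")[[1]]
  tokens <- tokens[nzchar(tokens)]
  if (length(tokens) < words) return(character(0))
  starts <- seq_len(length(tokens) - words + 1)
  vapply(starts,
         function(i) paste(tokens[i:(i + words - 1)], collapse = " "),
         character(1))
}

# Count every bigram across all responses, most frequent first.
all_ngrams <- unlist(lapply(responses$comment, make_ngrams, words = 2))
counts <- sort(table(all_ngrams), decreasing = TRUE)

# Keep n-grams seen at least `min_count` times, then the top `top_n` of those.
min_count <- 3
top_n <- 3
top <- head(counts[counts >= min_count], top_n)
```

Splitting by a grouping column such as role would repeat the same counting within each group.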