freq_terms: Find Frequent Terms


View source: R/freq_terms.R

Description

Find the most frequently occurring terms in a text vector.

Usage

freq_terms(
  text.var,
  top = 20,
  at.least = 1,
  stopwords = NULL,
  extend = TRUE,
  ...
)

Arguments

text.var

The text variable.

top

The number of top terms to show.

at.least

An integer giving the minimum number of letters a word must have to be included in the output.

stopwords

A character vector of words to remove from the text. qdap has a number of data sets that can be used as stop words including: Top200Words, Top100Words, Top25Words. For the tm package's traditional English stop words use tm::stopwords("english").

extend

logical. If TRUE, the output is extended beyond top to include any word that has the same frequency as the last of the top words.

...

Other arguments passed to all_words.
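
To illustrate how top, at.least, stopwords and extend interact, here is a minimal base R sketch. It is not qdap's implementation: freq_terms_sketch is a hypothetical helper, its tokenizer (lowercase, split on anything that is not a letter or apostrophe) is an assumption, and the column names WORD and FREQ are chosen for illustration only.

```r
# Hypothetical base R helper for illustration; NOT qdap's implementation.
freq_terms_sketch <- function(text.var, top = 20, at.least = 1,
                              stopwords = NULL, extend = TRUE) {
  # Assumed tokenizer: lowercase, split on non-letter/apostrophe runs
  words <- unlist(strsplit(tolower(text.var), "[^a-z']+"))
  words <- words[nzchar(words)]

  # Drop stop words, then words shorter than at.least characters
  if (!is.null(stopwords)) words <- words[!words %in% tolower(stopwords)]
  words <- words[nchar(words) >= at.least]
  if (length(words) == 0) {
    return(data.frame(WORD = character(0), FREQ = integer(0)))
  }

  # Count words and sort by descending frequency (ties alphabetical)
  counts <- table(words)
  out <- data.frame(WORD = names(counts), FREQ = as.integer(counts),
                    stringsAsFactors = FALSE)
  out <- out[order(-out$FREQ, out$WORD), ]

  top <- min(top, nrow(out))
  if (extend) {
    # Keep every word tied in frequency with the word in position top
    out <- out[out$FREQ >= out$FREQ[top], ]
  } else {
    # Hard cut at exactly top rows
    out <- out[seq_len(top), ]
  }
  rownames(out) <- NULL
  out
}

txt <- c("the cat sat on the mat", "the dog sat on a log")
freq_terms_sketch(txt, top = 2)                   # ties at the cutoff are kept
freq_terms_sketch(txt, top = 2, extend = FALSE)   # exactly two rows
freq_terms_sketch(txt, top = 2, at.least = 3, stopwords = "the")
```

In this sketch "on" and "sat" both occur twice, so with extend = TRUE a request for the top 2 returns three rows, while extend = FALSE cuts the result at exactly two.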

Value

Returns a data frame of the most frequently occurring words.

See Also

word_list, all_words

Examples

## Not run: 
freq_terms(DATA$state, 5)
freq_terms(DATA$state)
freq_terms(DATA$state, extend = FALSE)
freq_terms(DATA$state, at.least = 4)
(out <- freq_terms(pres_debates2012$dialogue, stopwords = Top200Words))
plot(out)

## All words by sentence (row)
library(qdapTools)
x <- raj$dialogue
list_df2df(setNames(lapply(x, freq_terms, top=Inf), seq_along(x)), "row")
list_df2df(setNames(lapply(x, freq_terms, top=10, stopwords = Dolch), 
    seq_along(x)), "Title")


## All words by person
FUN <- function(x, n=Inf) freq_terms(paste(x, collapse=" "), top=n)
list_df2df(lapply(split(x, raj$person), FUN), "person")

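## Plot it
library(ggplot2)  # provides ggtitle()
out <- lapply(split(x, raj$person), FUN, n=10)
pdf("Freq Terms by Person.pdf", width=13) 
invisible(lapply(seq_along(out), function(i) {
    ## dev.new()
    ## print() so each plot is drawn to the PDF even when run non-interactively
    print(plot(out[[i]], plot=FALSE) + ggtitle(names(out)[i]))
}))
dev.off()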

## Keep spaces
freq_terms(space_fill(DATA$state, "are you"), 500, char.keep="~~")

## End(Not run)

trinker/qdap documentation built on Sept. 30, 2020, 6:28 p.m.