important_terms: Top Min-Max Scaled TF-IDF terms

Description

View the top n min-max scaled tf-idf weighted terms in a text.

Usage

important_terms(
  text.var,
  n = 20,
  stopwords = stopwords::stopwords("english"),
  stem = FALSE,
  language = "porter",
  strip = TRUE,
  strip.regex = "[^A-Za-z' ]",
  ...
)

Arguments

text.var

A vector of character strings.

n

The number of rows to print. If an integer, the frequency at the nth row is selected and all rows with a value >= that frequency are printed. If a proportion (less than 1), the frequency at the nth% row is selected and all rows with a value >= that frequency are printed.

stopwords

A vector of stopwords to exclude.

stem

logical. If TRUE, wordStem is used to stem the terms, with language = "porter" as the default. Note that the stopwords will be stemmed as well.

language

The stem language to use (see wordStem).

strip

logical. If TRUE, all characters that are not letters, apostrophes, or spaces are stripped. The regex can be changed via the strip.regex argument.

strip.regex

A regular expression used for stripping undesired characters.

...

Other arguments; see remove_stopwords.
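
The selection rule described for n can be sketched in base R. This is an illustration only, not termco's implementation: select_top is a hypothetical helper, the scores are made up, and rounding the proportional row up with ceiling() is an assumption.

```r
# Hypothetical helper (not part of termco): keep every term whose score is
# >= the score found at the row that n points to.
select_top <- function(scores, n) {
  scores <- sort(scores, decreasing = TRUE)
  # n < 1 is read as a proportion of rows; otherwise as a row number.
  # Rounding up with ceiling() is an assumption made for this sketch.
  row <- if (n < 1) ceiling(n * length(scores)) else min(n, length(scores))
  row <- max(row, 1)
  scores[scores >= scores[row]]
}

scores <- c(a = 5, b = 3, c = 3, d = 1)  # made-up values

top_int  <- select_top(scores, 2)     # threshold is the 2nd row's score
top_prop <- select_top(scores, 0.25)  # threshold is the score at the 25% row
```

Because the cut is made on the value rather than on the row count, tied terms are kept together, so more than n rows can be returned.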

Value

Returns a data.frame of terms and min-max scaled tf-idf weights.
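
For readers unfamiliar with min-max scaling, the following base R sketch shows the usual definition, which maps the smallest weight to 0 and the largest to 1. The helper min_max and the example weights are hypothetical, and the exact tf-idf formula termco uses is not shown on this page, so this should be read as an assumption about the scaling step only.

```r
# Hypothetical helper (not part of termco): rescale a numeric vector to [0, 1].
min_max <- function(w) {
  rng <- range(w)
  # Guard against division by zero when all weights are equal.
  if (rng[1] == rng[2]) return(rep(0, length(w)))
  (w - rng[1]) / (rng[2] - rng[1])
}

weights <- c(economy = 2.0, jobs = 1.5, taxes = 0.5)  # made-up tf-idf values
scaled  <- min_max(weights)
```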

Examples

## Not run: 
x <- presidential_debates_2012[["dialogue"]]

frequent_terms(x)
important_terms(x)
important_terms(x, n=899)
important_terms(x, n=.1)
important_terms(x, min.char = 7)
important_terms(x, min.char = 6, stem=TRUE)

plot(important_terms(x))
plot(important_terms(x, n = .02))
plot(important_terms(x, n = 40))
plot(important_terms(x, n = 100), as.cloud = TRUE)

## End(Not run)

trinker/termco documentation built on Jan. 7, 2022, 3:32 a.m.