hierarchical_coverage_term: Hierarchical Coverage of Terms


View source: R/hierarchical_coverage_term.R

Description

The unique coverage of a text vector by a term after partitioning out the elements matched by previous terms.
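The idea can be illustrated with a minimal base R sketch. It is not the termco implementation: `hier_cov_sketch` is a hypothetical helper, it skips the `bound` and `sort` options, and it assumes coverage is expressed as the proportion of elements in the text vector that a term matches.

```r
# Hypothetical helper for illustration only; not part of termco.
# Assumes coverage = proportion of elements matched.
hier_cov_sketch <- function(text.var, terms, ignore.case = TRUE) {
  remaining <- rep(TRUE, length(text.var))  # elements not yet claimed by a term
  unique_cov <- numeric(length(terms))
  for (i in seq_along(terms)) {
    # Count only matches among elements that earlier terms did not match
    hit <- grepl(terms[i], text.var, ignore.case = ignore.case) & remaining
    unique_cov[i] <- sum(hit) / length(text.var)
    remaining <- remaining & !hit           # partition out the matched elements
  }
  data.frame(
    terms = terms,
    unique = unique_cov,
    cumulative = cumsum(unique_cov)         # running total of unique coverage
  )
}

txt <- c("the cat sat", "a dog ran", "the dog sat", "birds fly")
res <- hier_cov_sketch(txt, c("the", "dog", "sat"))
res
```

In this toy input "sat" contributes no unique coverage, because both elements that contain it were already matched by "the".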

Usage

hierarchical_coverage_term(
  text.var,
  terms,
  bound = TRUE,
  ignore.case = TRUE,
  sort = FALSE,
  ...
)

Arguments

text.var

A text vector (vector of strings).

terms

A vector of regular expressions to match against text.var.

bound

logical. If TRUE the terms are bound with boundary markers to ensure that "read" matches "read" but not "ready".

ignore.case

logical. Should case be ignored in matching the terms against text.var?

sort

logical. If TRUE the output is sorted by highest unique gain. If FALSE the input order of the terms is retained.

...

ignored.
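The effect of bound and ignore.case can be previewed with base R's grepl(). This is only a sketch: the exact boundary markers the function adds are not documented here, so the use of \b below is an assumption.

```r
x <- c("I read it", "Are you ready?", "READ this")

# Unbounded, case-insensitive: "read" also matches inside "ready"
unbound <- grepl("read", x, ignore.case = TRUE, perl = TRUE)

# Word boundaries (assumed markers) exclude "ready"
bound <- grepl("\\bread\\b", x, ignore.case = TRUE, perl = TRUE)

# Bounded and case-sensitive: "READ" no longer matches
bound_cs <- grepl("\\bread\\b", x, perl = TRUE)

unbound   # TRUE  TRUE  TRUE
bound     # TRUE FALSE  TRUE
bound_cs  # TRUE FALSE FALSE
```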

Value

Returns a data.frame with 3 columns:

terms

the search term

unique

the unique coverage of the term

cumulative

the cumulative coverage of the term

Author(s)

Steve T. Simpson and Tyler Rinker <tyler.rinker@gmail.com>.

See Also

Other hierarchical_coverage functions: hierarchical_coverage_regex()

Examples

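library(termco)

# Text vector: the dialogue column of the presidential_debates_2012 data
x <- presidential_debates_2012[["dialogue"]]

# Take the first element (the terms) of the frequent_terms() output
terms <- frequent_terms(x)[[1]]
(out <- hierarchical_coverage_term(x, terms))
plot(out)

# Repeat with a longer term list
(out2 <- hierarchical_coverage_term(x, frequent_terms(x, 30)[[1]]))
plot(out2, use.terms = TRUE)
plot(out2, use.terms = TRUE, mark.one = TRUE)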

trinker/termco documentation built on Jan. 7, 2022, 3:32 a.m.