hierarchical_coverage_term: Hierarchical Coverage of Terms
In trinker/termco: Counts of Terms and Substrings


View source: R/hierarchical_coverage_term.R

Computes the unique coverage of a text vector by each term, after partitioning out the elements already matched by previous terms.

hierarchical_coverage_term(
  text.var,
  terms,
  bound = TRUE,
  ignore.case = TRUE,
  sort = FALSE,
  ...
)

`text.var`	A text vector (vector of strings).
`terms`	A vector of regular expressions to match against `text.var`.
`bound`	logical. If `TRUE` the terms are bound with boundary markers to ensure `"read"` matches `"read"` but not `"ready"`.
`ignore.case`	logical. Should case be ignored in matching the `terms` against `text.var`?
`sort`	logical. If `TRUE` the output is sorted by highest unique gain. If `FALSE` order of term input is retained.
`...`	ignored.
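
The effect of `bound` and `ignore.case` can be illustrated with base R alone. This is a minimal sketch, assuming the boundary markers are regex word boundaries (`\\b`); the package's internal matching may differ in detail. The `texts` vector and variable names are made up for illustration.

```r
# Assumption: bound = TRUE behaves like wrapping each term in \\b markers.
texts <- c("read", "ready", "READ it")

# Bounded, case-insensitive: "ready" is excluded, "READ it" is kept.
bounded <- grepl("\\bread\\b", texts, ignore.case = TRUE, perl = TRUE)

# Unbounded: the substring "read" inside "ready" also matches.
unbounded <- grepl("read", texts, ignore.case = TRUE, perl = TRUE)

# Bounded, case-sensitive: "READ it" no longer matches.
case_sensitive <- grepl("\\bread\\b", texts, ignore.case = FALSE, perl = TRUE)

print(data.frame(texts, bounded, unbounded, case_sensitive))
```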

Returns a data.frame with 3 columns:

terms: the search term
unique: the coverage contributed by the term alone, counting only elements not matched by any previous term
cumulative: the running coverage of the term together with all previous terms
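
The relationship between the `unique` and `cumulative` columns can be sketched in base R. The helper below, `sketch_hierarchical_coverage()`, is hypothetical and not part of termco; it assumes coverage is expressed as a proportion of the elements in `text.var` and that `bound = TRUE` means wrapping terms in `\\b`. It is meant to show the partitioning idea, not to reproduce the package's implementation.

```r
# Hypothetical helper (not termco's code): hierarchical coverage in base R.
# Assumes coverage is a proportion of all elements in text.var.
sketch_hierarchical_coverage <- function(text.var, terms, bound = TRUE,
                                         ignore.case = TRUE) {
  n <- length(text.var)
  patterns <- if (bound) paste0("\\b", terms, "\\b") else terms
  remaining <- rep(TRUE, n)            # elements not yet claimed by a term
  unique_cov <- numeric(length(terms))

  for (i in seq_along(terms)) {
    hit <- grepl(patterns[i], text.var, ignore.case = ignore.case, perl = TRUE)
    # Only elements not matched by an earlier term count toward this term.
    unique_cov[i] <- sum(hit & remaining) / n
    remaining <- remaining & !hit
  }

  data.frame(
    terms = terms,
    unique = unique_cov,
    cumulative = cumsum(unique_cov),
    stringsAsFactors = FALSE
  )
}

toy <- c("I read the book", "Are you ready", "Read it and weep", "Nothing here")
res <- sketch_hierarchical_coverage(toy, c("read", "you", "it"))
print(res)
```

In this toy input, `"it"` matches only an element already claimed by `"read"`, so its unique coverage is zero and the cumulative value does not move. Term order therefore matters.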

Steve T. Simpson and Tyler Rinker <tyler.rinker@gmail.com>.

Other hierarchical_coverage functions: hierarchical_coverage_regex()

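library(termco)

# Text vector: dialogue from the 2012 presidential debates data set
x <- presidential_debates_2012[["dialogue"]]

# Use the most frequent terms (first column of frequent_terms output) as search terms
terms <- frequent_terms(x)[[1]]
(out <- hierarchical_coverage_term(x, terms))
plot(out)

# Same analysis with the 30 most frequent terms
(out2 <- hierarchical_coverage_term(x, frequent_terms(x, 30)[[1]]))
plot(out2, use.terms = TRUE)
plot(out2, use.terms = TRUE, mark.one = TRUE)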