as_term_list: Coerce to Named List

View source: R/as_term_list.R

Description

Convenience function to convert various data forms of terms into a named list. For vectors, each element's name is the same as the term itself.

Usage

as_term_list(x, add.boundary = FALSE, collapse = FALSE, test.regex = TRUE, ...)

Arguments

x

A vector of strings or a quanteda dictionary.

add.boundary

logical. If TRUE a word boundary is placed at the beginning and end of the strings. Note that this is ignored by as_term_list.list.

collapse

logical. If TRUE vectors of regexes are collapsed with a regex OR (|) symbol and wrapped in parentheses.

test.regex

logical. If TRUE the regular expressions created will be tested for validity with stringi.

...

Other arguments passed to flatten if the method is as_term_list.dictionary2; otherwise ignored.
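
The effect of add.boundary, collapse and test.regex can be illustrated with a minimal base R sketch. The helpers below (add_boundary, collapse_or, is_valid_regex) are hypothetical names invented for illustration and are not termco's implementation. The validity check uses base R's regex engine in place of stringi, so it may accept or reject slightly different patterns than the package does.

```r
# Hypothetical helpers for illustration only; NOT termco's internals.

# add.boundary: place a word boundary at the start and end of each string
add_boundary <- function(terms) paste0("\\b", terms, "\\b")

# collapse: join a vector of regexes with OR (|) and wrap in parentheses
collapse_or <- function(terms) paste0("(", paste(terms, collapse = "|"), ")")

# test.regex: check that a pattern compiles (base R engine, not stringi)
is_valid_regex <- function(pattern) {
    tryCatch({
        suppressWarnings(grepl(pattern, ""))
        TRUE
    }, error = function(e) FALSE)
}

terms <- c("tax", "jobs")

bounded   <- add_boundary(terms)   # one bounded pattern per term
collapsed <- collapse_or(terms)    # a single alternation pattern

print(bounded)
print(collapsed)
print(is_valid_regex(collapsed))
print(is_valid_regex("(tax"))      # unbalanced parenthesis
```

Whether the package applies the boundary before or after collapsing is not stated here, so the sketch keeps the two steps separate.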

Value

Returns a named list.
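
For a character vector, the shape of the returned value can be approximated in base R as below. This is only a sketch of the documented behavior (names equal to the terms), not the package's code, and the example terms are arbitrary.

```r
# Base R approximation of the documented return shape for a character
# vector: a list in which each element is named after the term it holds.
terms <- c("Alabama", "Alaska", "Arizona")

term_list <- as.list(stats::setNames(terms, terms))

str(term_list)
```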

Examples

as_term_list(state.name)

## Not run: 
if (!require("pacman")) install.packages("pacman")
pacman::p_load(tidyverse, textshape)


x <- presidential_debates_2012[["dialogue"]]

ngrams <- frequent_ngrams(x, n=10) %>%
    transmute(ngram = collocation) %>%
    unlist() %>%
    as_term_list()


presidential_debates_2012 %>%
    with(term_count(dialogue, person, ngrams))

## dictionary from quanteda
require(quanteda)
mfdict <- textreadr::download("https://provalisresearch.com/Download/LaverGarry.zip") %>%
    unzip(exdir = tempdir()) %>%
    `[`(1) %>%
    dictionary(file = .)

as_term_list(mfdict)
as_term_list(mfdict, add.boundary = TRUE)
as_term_list(mfdict, sep = ' -> ')

as_term_list(mfdict) %>%
    tidy_list('category', 'regex')

## End(Not run)

## Writing a term list to a .json file for others (including non-R users) to use:
## Not run: 
as_term_list(mfdict, TRUE) %>%
    jsonlite::toJSON(pretty=TRUE) %>%
    stringi::stri_unescape_unicode() %>%
    cat(file = 'testing.json')

## End(Not run)

trinker/termco documentation built on Jan. 7, 2022, 3:32 a.m.