View source: R/tokens_annotate.R
| tokens_annotate | R Documentation |
Insert dictionary keys as tags in a tokens object where the dictionary patterns are found.
tokens_annotate(
  x,
  dictionary,
  levels = 1:5,
  valuetype = c("glob", "regex", "fixed"),
  case_insensitive = TRUE,
  marker = "#",
  capkeys = TRUE,
  nested_scope = c("key", "dictionary"),
  apply_if = NULL,
  verbose = quanteda_options("verbose")
)
x |
the tokens object to which the dictionary will be applied |
dictionary |
the dictionary-class object that will be applied to x |
levels |
integers specifying the levels of entries in a hierarchical
dictionary that will be applied. The top level is 1, and subsequent levels
describe lower nesting levels. Values may be combined, even if these
levels are not contiguous, e.g. |
valuetype |
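the type of pattern matching: "glob" for glob-style wildcard expressions; "regex" for regular expressions; or "fixed" for exact matching |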
case_insensitive |
logical; if TRUE, ignore case when matching patterns |
marker |
characters that are added before and after the dictionary keys to create tags. |
capkeys |
if TRUE, convert dictionary keys to uppercase |
nested_scope |
how to treat matches from different dictionary keys that
are nested. When one value is nested within another, such as "a b" being
nested within "a b c", then |
apply_if |
logical vector of length |
verbose |
if |
tokens_lookup
txt <- c(d1 = "The United States has the Atlantic Ocean and the Pacific Ocean.",
         d2 = "Britain and Ireland have the Irish Sea and the English Channel.")
toks <- tokens(txt)

# nested dictionary: regions at level 1, categories at level 2, west/east at level 3
dict <- dictionary(list(US = list(Countries = c("States"),
                                  oceans = c("Atlantic", "Pacific")),
                        Europe = list(Countries = c("Britain", "Ireland"),
                                      oceans = list(west = "Irish Sea",
                                                    east = "English Channel"))))

# insert the matching dictionary keys as tags, using the default arguments
tokens_annotate(toks, dict)
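The arguments described above can also be varied from their defaults. The following is a minimal sketch, not taken from the package's own examples. It assumes the installed version of quanteda exports tokens_annotate() with the signature shown in the usage section. The texts, the dictionaries, and the object names (dict_demo, dict_rx, anno_top, anno_rx) are made up for illustration, and no output is shown.

```r
library("quanteda")

# two short illustrative documents (hypothetical, not from the package examples)
toks <- tokens(c(d1 = "The Atlantic Ocean and the Pacific Ocean.",
                 d2 = "The Irish Sea and the English Channel."))

# hypothetical two-level dictionary used only for this sketch
dict_demo <- dictionary(list(water = list(oceans = c("atlantic", "pacific"),
                                          seas = "irish sea")))

# apply only the top dictionary level, use "@" instead of the default "#"
# as the marker, and switch off capkeys
anno_top <- tokens_annotate(toks, dict_demo, levels = 1,
                            marker = "@", capkeys = FALSE)

# the same idea with regular-expression patterns, matched without regard to case
dict_rx <- dictionary(list(oceans = "^(atlantic|pacific)$"))
anno_rx <- tokens_annotate(toks, dict_rx, valuetype = "regex",
                           case_insensitive = TRUE)
```

Printing anno_top and anno_rx side by side is a quick way to check how levels, marker, and valuetype affect the inserted tags.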