object2id | R Documentation |
Developer function to match patterns in quanteda objects against token types.
object2id(
  x,
  types,
  valuetype = c("glob", "fixed", "regex"),
  case_insensitive = TRUE,
  concatenator = "_",
  levels = 1,
  remove_unigram = FALSE,
  keep_nomatch = FALSE
)
object2fixed(
  x,
  types,
  valuetype = c("glob", "fixed", "regex"),
  case_insensitive = TRUE,
  concatenator = "_",
  levels = 1,
  remove_unigram = FALSE,
  keep_nomatch = FALSE
)
x |
a list of character vectors, dictionary or collocations object |
types |
token types against which patterns are matched |
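valuetype |
the type of pattern matching: "glob" for glob-style wildcard expressions, "regex" for regular expressions, or "fixed" for exact matching |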
case_insensitive |
logical; if TRUE, ignore case when matching patterns |
concatenator |
the concatenation character that joins multi-word
expressions in types |
levels |
integers specifying the levels of entries in a hierarchical
dictionary that will be applied. The top level is 1, and subsequent levels
describe lower nesting levels. Values may be combined, even if these
levels are not contiguous. |
remove_unigram |
logical; if TRUE, ignore single-word patterns |
keep_nomatch |
keep patterns that did not match |
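To picture how valuetype and case_insensitive interact, here is a minimal base-R sketch. It is not quanteda's implementation: match_types() is a hypothetical helper written only for this illustration, and it assumes that "fixed" means whole-type equality and that a glob is translated to an anchored regular expression with utils::glob2rx().

```r
types <- c("A", "AA", "B", "BB", "B_B", "C", "C-C")

# Hypothetical helper (not part of quanteda): match one pattern against types.
match_types <- function(pattern, types,
                        valuetype = c("glob", "fixed", "regex"),
                        case_insensitive = TRUE) {
  valuetype <- match.arg(valuetype)
  if (valuetype == "fixed") {
    # assumed semantics: exact comparison against the whole type
    if (case_insensitive) {
      hit <- tolower(types) == tolower(pattern)
    } else {
      hit <- types == pattern
    }
  } else {
    # a glob is first translated to an anchored regular expression
    rx <- if (valuetype == "glob") utils::glob2rx(pattern) else pattern
    hit <- grepl(rx, types, ignore.case = case_insensitive)
  }
  types[hit]
}

match_types("a*", types)                            # "A" "AA"
match_types("a*", types, case_insensitive = FALSE)  # character(0)
match_types("^B", types, valuetype = "regex")       # "B" "BB" "B_B"
match_types("bb", types, valuetype = "fixed")       # "BB"
```

With the default case_insensitive = TRUE, the lower-case glob "a*" still finds the upper-case types; switching it off leaves nothing to match in this vocabulary.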
object2fixed() returns a list of character vectors of matched types.
object2id() returns a list of indices of matched types with attributes.
The "pattern" attribute records the indices of the matched patterns in x;
the "key" attribute records the keys of the matched patterns when x is a
dictionary.
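The difference between the two return values (types versus indices into types) and the role of concatenator can be sketched in base R. This is an illustration under stated assumptions, not quanteda's code: it assumes a multi-word pattern is split on a single space and its words are joined with the concatenator before a case-insensitive lookup. The variable names are made up for the example.

```r
types <- c("A", "AA", "B", "BB", "B_B", "C", "C-C")
concatenator <- "_"

# A multi-word pattern, as it might appear in a dictionary or phrase() object
pattern <- "b b"

# Join the words of the pattern with the concatenator: "b_b"
words  <- strsplit(pattern, " ", fixed = TRUE)[[1]]
joined <- paste(words, collapse = concatenator)

# Index of the matching type (the kind of value object2id() deals in)
id <- which(tolower(types) == tolower(joined))
id         # 5

# The matching type itself (the kind of value object2fixed() deals in)
types[id]  # "B_B"
```

An index is only meaningful relative to the types vector it was computed from, whereas the matched type strings can be read on their own.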
See also: pattern2id()
library(quanteda)

types <- c("A", "AA", "B", "BB", "B_B", "C", "C-C")
# dictionary
dict <- dictionary(list(A = c("a", "aa"),
                        B = c("BB", "B B"),
                        C = c("C", "C-C")))
object2fixed(dict, types)
object2fixed(dict, types, remove_unigram = TRUE)
# phrase
pats <- phrase(c("a", "aa", "zz", "bb", "b b"))
object2fixed(pats, types)
object2fixed(pats, types, keep_nomatch = TRUE)