valuetype | R Documentation |
Pattern matching in quanteda using the valuetype argument.
valuetype |
the type of pattern matching: |
case_insensitive |
logical; if |
Pattern matching in quanteda uses "glob"-style pattern matching as the default, because this is simpler than regular expression matching while addressing most users' needs. It also has the advantage of being identical to fixed pattern matching when the wildcard characters (* and ?) are not used. Finally, most dictionary formats use glob matching.
"glob"
"glob"-style wildcard expressions, the quanteda default. The implementation used in quanteda uses * to match any number of any characters including none, and ? to match any single character. See also utils::glob2rx() and References below.
"regex"
Regular expression matching.
"fixed"
Fixed (literal) pattern matching.
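The three match types can be compared on a small feature vector. The sketch below uses only base R (grepl() and utils::glob2rx()) to approximate each behaviour; it is an illustration of the semantics described above, not quanteda's own implementation, and the features vector is made up for the example.

```r
# Hypothetical feature vector, for illustration only.
features <- c("tax", "taxes", "taxation", "Tax", "syntax", "t.x")

# "glob": * matches any run of characters (including none), ? exactly one.
# glob2rx() translates the glob into an anchored regular expression.
glob_star  <- grepl(utils::glob2rx("tax*"), features)
glob_qmark <- grepl(utils::glob2rx("ta?"), features)

# "regex": the pattern is used as a regular expression, so . is a wildcard.
regex_hit <- grepl("^t.x$", features)

# "fixed": the pattern is compared literally against the whole feature,
# so . has no special meaning.
fixed_hit <- features == "t.x"

features[glob_star]   # "tax" "taxes" "taxation"
features[glob_qmark]  # "tax"
features[regex_hit]   # "tax" "t.x"
features[fixed_hit]   # "t.x"
```

All four comparisons here are case-sensitive, which is why "Tax" is never matched; case handling is governed separately by case_insensitive.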
If "fixed" is used with case_insensitive = TRUE, features will typically be lowercased internally prior to matching. Also, glob matches are converted to regular expressions (using utils::glob2rx()) when they contain wildcard characters, and to fixed pattern matches when they do not.
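The dispatch described in this paragraph can be sketched in a few lines of base R. The helper match_features() below is hypothetical and written only for this illustration; it is not a quanteda function, and quanteda's internal code may differ in detail.

```r
# Hypothetical helper (not part of quanteda): returns a logical vector
# marking which features match a single pattern.
match_features <- function(features, pattern,
                           valuetype = c("glob", "regex", "fixed"),
                           case_insensitive = TRUE) {
  valuetype <- match.arg(valuetype)

  if (valuetype == "glob") {
    if (grepl("[*?]", pattern)) {
      # Wildcards present: convert the glob to a regular expression.
      pattern <- utils::glob2rx(pattern)
      valuetype <- "regex"
    } else {
      # No wildcards: a glob is identical to a fixed pattern.
      valuetype <- "fixed"
    }
  }

  if (valuetype == "fixed") {
    if (case_insensitive) {
      # Lowercase both sides before the literal comparison.
      features <- tolower(features)
      pattern <- tolower(pattern)
    }
    return(features == pattern)
  }

  grepl(pattern, features, ignore.case = case_insensitive)
}

feats <- c("Tax", "taxes", "syntax")
match_features(feats, "tax*")                          # TRUE TRUE FALSE
match_features(feats, "tax")                           # TRUE FALSE FALSE
match_features(feats, "tax", case_insensitive = FALSE) # FALSE FALSE FALSE
```

Routing wildcard-free globs to the literal comparison avoids regular expression overhead in the common case of plain dictionary words.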
utils::glob2rx(), glob pattern matching (Wikipedia), stringi::stringi-search-regex(), stringi::stringi-search-fixed()