stringi-search-charclass: Character Classes in 'stringi'


Description

This man page describes how character classes are declared in the stringi package so that you can search for their occurrences in text.

Details

All stri_*_charclass functions in stringi perform search-based operations on single characters (i.e. Unicode code points).

There are two separate ways to specify character classes in stringi: with Unicode General Categories and with Unicode Binary Properties.

Both of them provide access to ICU's Unicode Character Database and are described in detail in the sections below.

Additionally, each class identifier may be preceded by '^' to request the complement of the given character class, i.e. to match characters not in the class.
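As an illustration, the sketch below counts the characters inside and outside a class. It assumes a stringi release in which stri_*_charclass patterns are written in ICU's \p{...} set notation, where the class identifier goes inside the braces and the complement is spelled \P{...} or [^...] instead of a bare '^' prefix. If your installed version expects a different spelling, adjust the patterns accordingly.

```r
library(stringi)

x <- "stRRRingi"

# Assumption: patterns use ICU set notation, e.g. \p{Lu} for uppercase letters.
n_upper      <- stri_count_charclass(x, "\\p{Lu}")    # characters in class Lu
n_not_upper  <- stri_count_charclass(x, "\\P{Lu}")    # complement of Lu
n_not_upper2 <- stri_count_charclass(x, "[^\\p{Lu}]") # same complement, bracket form

print(c(n_upper, n_not_upper, n_not_upper2))
```

A class and its complement partition the string, so the two counts should add up to the number of code points in x.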

Please note that some classes may seem to overlap. However, e.g. General Category Z (some space) and Binary Property WHITE_SPACE match different character sets.
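The difference can be checked on a tab character, which Unicode classifies as a control character (General Category Cc) that nevertheless has the White_Space property. This is a minimal sketch that assumes a stringi release accepting ICU's \p{...} pattern notation; the variable names are illustrative only.

```r
library(stringi)

tab   <- "\t"  # U+0009, General Category Cc (control), has White_Space
space <- " "   # U+0020, General Category Zs, has White_Space

# Assumption: identifiers are written inside \p{...}.
in_Z  <- stri_detect_charclass(c(tab, space), "\\p{Z}")
in_ws <- stri_detect_charclass(c(tab, space), "\\p{White_Space}")

print(rbind(in_Z, in_ws))
```

Only the space is a member of Z, while both characters have the White_Space property.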

Unicode General Categories

The Unicode General Category property of a code point provides the most general classification of that code point. Each code point falls into one and only one Category.
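The "one and only one" rule can be observed by testing a character against every two-letter General Category value. The sketch below is a hedged illustration: it assumes a stringi release accepting ICU's \p{...} pattern notation, and category_of is a hypothetical helper, not part of stringi.

```r
library(stringi)

# The 30 two-letter General Category values defined by Unicode.
cats <- c("Lu", "Ll", "Lt", "Lm", "Lo", "Mn", "Mc", "Me", "Nd", "Nl", "No",
          "Pc", "Pd", "Ps", "Pe", "Pi", "Pf", "Po", "Sm", "Sc", "Sk", "So",
          "Zs", "Zl", "Zp", "Cc", "Cf", "Cs", "Co", "Cn")
patterns <- paste0("\\p{", cats, "}")

# Hypothetical helper: return every category that a single character matches.
category_of <- function(ch) cats[stri_detect_charclass(ch, patterns)]

found <- lapply(c("a", "A", "7", " ", "+"), category_of)
print(found)
```

Each element of found should contain exactly one category name.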

Unicode Binary Properties

Binary property identifiers are matched case-insensitively and are slightly normalized. Each character may have many Binary Properties at a time.
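Both points can be sketched as follows, assuming a stringi release that accepts ICU's \p{...} pattern notation and that ALPHABETIC and UPPERCASE are among the supported identifiers (both are ICU binary properties).

```r
library(stringi)

# Same property, different letter case in the identifier.
ws_upper <- stri_detect_charclass("\t", "\\p{WHITE_SPACE}")
ws_lower <- stri_detect_charclass("\t", "\\p{white_space}")

# One character, several binary properties at once.
a_alpha <- stri_detect_charclass("A", "\\p{ALPHABETIC}")
a_upper <- stri_detect_charclass("A", "\\p{UPPERCASE}")

print(c(ws_upper, ws_lower, a_alpha, a_upper))
```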

Here is the complete list of supported Binary Properties:

References

The Unicode Character Database – Unicode Standard Annex #44, http://www.unicode.org/reports/tr44/

See Also

Other search_charclass: stri_count_charclass; stri_detect_charclass; stri_extract_all_charclass, stri_extract_first_charclass, stri_extract_last_charclass; stri_locate_all_charclass, stri_locate_first_charclass, stri_locate_last_charclass; stri_replace_all_charclass, stri_replace_first_charclass, stri_replace_last_charclass; stri_split_charclass; stri_trim, stri_trim_both, stri_trim_left, stri_trim_right; stringi-search

Other stringi_general_topics: stringi-arguments; stringi-encoding; stringi-locale; stringi-package; stringi-search-fixed; stringi-search-regex; stringi-search


stringi documentation built on May 2, 2019, 4:54 p.m.