stringi-search-regex: Regular Expressions in 'stringi'


Description

A regular expression is a pattern that describes, possibly in a very abstract way, a part of a text. The many regex functions in stringi make regular expressions a powerful tool for string searching, substring extraction, string splitting, and similar tasks.
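
As a rough illustration of the three tasks named above, the following sketch uses stri_detect_regex, stri_extract_first_regex and stri_split_regex. The sample strings and variable names are made up for this example.

```r
library(stringi)

# Hypothetical sample data, used only for illustration
x <- c("apple pie", "banana split", "cherry tart")

# Searching: does each string contain "a", one or more "n", then "a"?
detected <- stri_detect_regex(x, "an+a")
print(detected)    # FALSE TRUE FALSE

# Extraction: the first run of lowercase ASCII letters in each string
first_word <- stri_extract_first_regex(x, "[a-z]+")
print(first_word)  # "apple" "banana" "cherry"

# Splitting: break a string at every run of digits
pieces <- stri_split_regex("a1b22c333d", "[0-9]+")
print(pieces[[1]]) # "a" "b" "c" "d"
```

Note that stri_split_regex returns a list with one character vector per input string, hence the `[[1]]`.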

Details

All stri_*_regex functions in stringi use the ICU regex engine, whose settings may be tuned (for example, to perform a case-insensitive search) with the stri_opts_regex function.
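
A minimal sketch of tuning the engine for a case-insensitive search follows. It assumes that the settings are passed through the opts_regex argument of the search function; the input vector is invented for this example.

```r
library(stringi)

# Hypothetical input, used only for illustration
x <- c("Regex", "REGEX", "fixed")

# Default settings: matching is case-sensitive
default_match <- stri_detect_regex(x, "regex")
print(default_match)  # FALSE FALSE FALSE

# Case-insensitive matching requested via stri_opts_regex
ci_match <- stri_detect_regex(
  x, "regex",
  opts_regex = stri_opts_regex(case_insensitive = TRUE)
)
print(ci_match)       # TRUE TRUE FALSE
```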

Regular expression patterns in ICU are quite similar in form and behavior to Perl's regexes. Their implementation is loosely based on the JDK 1.4 package java.util.regex. ICU Regular Expressions conform to the Unicode Technical Standard #18 (see the References section), and their features are summarized in the ICU User Guide (see below). A good general introduction to regexes is given in (Friedl, 2002). Some topics are also covered in the R manual; see regex.
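
One practical consequence of the Unicode support is that Unicode property classes such as \p{L} (letters) and \p{N} (numbers) can be used in patterns. The sketch below is only an illustration; the sample text is made up and written with \u escapes so that the code itself stays ASCII.

```r
library(stringi)

# Hypothetical text: a Polish word, a number, and three Greek letters
x <- "za\u017c\u00f3\u0142\u0107 42 \u03b1\u03b2\u03b3"

# Runs of letters from any script, not only a-z
words <- stri_extract_all_regex(x, "\\p{L}+")[[1]]
print(words)

# Runs of characters with the Unicode "number" property
numbers <- stri_extract_all_regex(x, "\\p{N}+")[[1]]
print(numbers)
```

With an ASCII-only class such as [a-z]+, the first word would be broken up at each accented letter.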

Regexes in stringi

Note that if a given regex pattern is empty, all functions in stringi return NA and generate a warning. On a syntax error, a fairly informative error message is shown.
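
Both behaviors can be observed with a short sketch like the one below. The exact wording of the warning and error messages depends on the stringi and ICU versions, so it is printed rather than assumed; the unbalanced pattern is made up for this example.

```r
library(stringi)

# Empty pattern: the result is NA; the warning is suppressed here
# so that only the returned value is inspected
empty_result <- suppressWarnings(stri_detect_regex("abc", ""))
print(empty_result)

# Malformed pattern (unbalanced parenthesis): capture the error
# message instead of letting the error stop the script
syntax_msg <- tryCatch(
  stri_detect_regex("abc", "(unclosed"),
  error = function(e) conditionMessage(e)
)
print(syntax_msg)
```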

If you would like to search for a fixed pattern, refer to stringi-search-fixed. Fixed-pattern search lets you perform a locale-aware text lookup or a very fast exact-byte search.
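
The difference between the two kinds of search shows up as soon as a pattern contains a regex metacharacter. The sketch below assumes stri_detect_fixed is called with only a string and a pattern; the inputs are invented for this example.

```r
library(stringi)

# Hypothetical inputs, used only for illustration
x <- c("a.b", "axb")

# As a regex, "." matches any character, so both strings match
regex_match <- stri_detect_regex(x, "a.b")
print(regex_match)  # TRUE TRUE

# As a fixed pattern, "." is a literal dot, so only the first matches
fixed_match <- stri_detect_fixed(x, "a.b")
print(fixed_match)  # TRUE FALSE
```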

References

Regular expressions – ICU User Guide, http://userguide.icu-project.org/strings/regexp

J.E.F. Friedl, Mastering Regular Expressions, O'Reilly, 2002

Unicode Regular Expressions – Unicode Technical Standard #18, http://www.unicode.org/reports/tr18/

Unicode Regular Expressions – Regex tutorial, http://www.regular-expressions.info/unicode.html

See Also

Other search_regex: stri_count_regex; stri_detect_regex; stri_extract_all_regex, stri_extract_first_regex, stri_extract_last_regex; stri_locate_all_regex, stri_locate_first_regex, stri_locate_last_regex; stri_match_all_regex, stri_match_first_regex, stri_match_last_regex; stri_opts_regex; stri_replace_all_regex, stri_replace_first_regex, stri_replace_last_regex; stri_split_regex; stringi-search

Other stringi_general_topics: stringi-arguments; stringi-encoding; stringi-locale; stringi-package; stringi-search-charclass; stringi-search-fixed; stringi-search


stringi documentation built on May 2, 2019, 4:54 p.m.