stri_opts_collator: Generate a List with Collator Options

Description

A convenience function to tune the Collator's behavior, e.g. in stri_compare, stri_order, stri_detect_fixed, and other stringi-search-fixed functions.

Usage

stri_opts_collator(locale = NULL, strength = 3L,
  alternate_shifted = FALSE, french = FALSE, uppercase_first = NA,
  case_level = FALSE, normalization = FALSE, numeric = FALSE)

Arguments

locale

single string, NULL or "" for default locale

strength

single integer in {1,2,3,4}, which defines collation strength; 1 for the most permissive collation rules, 4 for the most strict ones

alternate_shifted

single logical value; FALSE treats all the code points with non-ignorable primary weights in the same way, TRUE causes code points with primary weights that are equal to or below the variable top value to be ignored on the primary level and moved to the quaternary level

french

single logical value; used in Canadian French; TRUE results in secondary weights being considered backwards

uppercase_first

single logical value; NA orders upper and lower case letters in accordance with their tertiary weights, TRUE forces upper case letters to sort before lower case letters, FALSE does the opposite

case_level

single logical value; controls whether an extra case level (positioned before the third level) is generated or not

normalization

single logical value; if TRUE, an incremental check is performed to see whether the input data is in the FCD form. If the data is not in the FCD form, incremental NFD normalization is performed

numeric

single logical value; when turned on, this attribute generates a collation key for the numeric value of substrings of digits. This is a way to get '100' to sort after '2'.
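
As a rough illustration of the numeric option, the sketch below sorts a few made-up file names with and without it. It assumes stringi is installed and that stri_sort accepts the options list through a named opts_collator argument; the "en_US" locale is chosen only to make the result independent of the system default.

```r
library(stringi)

# Hypothetical file names, used only for illustration.
files <- c("file100", "file2", "file10")

# Default collation compares digit by digit, so "file100" precedes "file2".
sorted_default <- stri_sort(files,
  opts_collator = stri_opts_collator(locale = "en_US"))
print(sorted_default)

# numeric = TRUE compares runs of digits by their numeric value instead.
sorted_numeric <- stri_sort(files,
  opts_collator = stri_opts_collator(locale = "en_US", numeric = TRUE))
print(sorted_numeric)
```

Passing the list by name rather than by position keeps the call unambiguous if other arguments are added to stri_sort.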

Details

ICU's collator performs locale-aware, natural-language-like string comparison. This is a more reliable way of establishing relationships between strings than the one provided by base R, and it is considerably more complex than ordinary byte-wise comparison.

A note on collation strength: generally, strength set to 4 is the least permissive. Set to 2 to ignore case differences. Set to 1 to also ignore diacritical differences.
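
The effect of the strength levels described above can be sketched as follows. The helper cmp_at and the sample words are hypothetical and only serve the illustration; the "en_US" locale is an assumption made so that the accent difference is a secondary one.

```r
library(stringi)

# Hypothetical helper: compare two strings at a given collation strength.
cmp_at <- function(a, b, strength) {
  stri_cmp(a, b,
    opts_collator = stri_opts_collator(locale = "en_US", strength = strength))
}

# Strength 3 (the default) distinguishes case; strength 2 ignores it.
case_default <- cmp_at("hello", "HELLO", 3L)   # nonzero
case_ignored <- cmp_at("hello", "HELLO", 2L)   # 0

# Strength 2 still distinguishes accents; strength 1 ignores them as well.
accent_kept    <- cmp_at("resume", "r\u00e9sum\u00e9", 2L)   # nonzero
accent_ignored <- cmp_at("resume", "r\u00e9sum\u00e9", 1L)   # 0

print(c(case_default, case_ignored, accent_kept, accent_ignored))
```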

The strings are Unicode-normalized before the comparison.

Value

Returns a named list object; missing options are left with default values.
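
A minimal sketch of inspecting the returned list and reusing it across calls, assuming stringi is installed; the variable name opts is arbitrary.

```r
library(stringi)

# Build the options once and reuse them in several calls.
opts <- stri_opts_collator(locale = "en_US", numeric = TRUE)

# The result is an ordinary named list.
print(is.list(opts))
print(names(opts))

# With numeric collation, "zupa100" sorts after "zupa2".
cmp <- stri_cmp("zupa100", "zupa2", opts_collator = opts)
print(cmp)
```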

References

Collation – ICU User Guide, http://userguide.icu-project.org/collation

ICU Collation Service Architecture – ICU User Guide, http://userguide.icu-project.org/collation/architecture

icu::Collator Class Reference – ICU4C API Documentation, http://www.icu-project.org/apiref/icu4c/classicu_1_1Collator.html

See Also

Other locale_sensitive: stri_cmp, stri_compare; stri_count_fixed; stri_detect_fixed; stri_enc_detect2; stri_locate_all_fixed, stri_locate_first_fixed, stri_locate_last_fixed; stri_order, stri_sort; stri_replace_all_fixed, stri_replace_first_fixed, stri_replace_last_fixed; stri_split_fixed; stri_trans_tolower, stri_trans_totitle, stri_trans_toupper; stringi-locale; stringi-search-fixed

Other search_fixed: stri_count_fixed; stri_detect_fixed; stri_extract_all_fixed, stri_extract_first_fixed, stri_extract_last_fixed; stri_locate_all_fixed, stri_locate_first_fixed, stri_locate_last_fixed; stri_replace_all_fixed, stri_replace_first_fixed, stri_replace_last_fixed; stri_split_fixed; stringi-search-fixed; stringi-search

Examples

## Not run: 
stri_cmp("zupa100", "zupa2") != stri_cmp("zupa100", "zupa2", stri_opts_collator(numeric=TRUE))
stri_cmp("above mentioned", "above-mentioned")
stri_cmp("above mentioned", "above-mentioned", stri_opts_collator(alternate_shifted=TRUE))

## End(Not run)

Example output

[1] TRUE
[1] -1
[1] -1

stringi documentation built on May 2, 2019, 4:54 p.m.