stri_unique: Extract Unique Elements

View source: R/sort.R

stri_uniqueR Documentation

Extract Unique Elements

Description

This function returns a character vector like str, but with duplicate elements removed.

Usage

stri_unique(str, ..., opts_collator = NULL)

Arguments

str

a character vector

...

additional settings for opts_collator

opts_collator

a named list with ICU Collator's options, see stri_opts_collator, NULL for default collation options

Details

As usual in stringi, no attributes are copied. Unlike unique, this function tests for canonical equivalence of strings (and not whether the strings are just bytewise equal). Such an operation is locale-dependent. Hence, stri_unique is significantly slower (but much better suited for natural language processing) than its base R counterpart.

See also stri_duplicated for indicating non-unique elements.

Value

Returns a character vector.

Author(s)

Marek Gagolewski and other contributors

References

Collation - ICU User Guide, https://unicode-org.github.io/icu/userguide/collation/

See Also

The official online manual of stringi at https://stringi.gagolewski.com/

Gagolewski M., stringi: Fast and portable character string processing in R, Journal of Statistical Software 103(2), 2022, 1-59, \Sexpr[results=rd]{tools:::Rd_expr_doi("10.18637/jss.v103.i02")}

Other locale_sensitive: %s<%(), about_locale, about_search_boundaries, about_search_coll, stri_compare(), stri_count_boundaries(), stri_duplicated(), stri_enc_detect2(), stri_extract_all_boundaries(), stri_locate_all_boundaries(), stri_opts_collator(), stri_order(), stri_rank(), stri_sort_key(), stri_sort(), stri_split_boundaries(), stri_trans_tolower(), stri_wrap()

Examples

# normalized and non-Unicode-normalized version of the same code point:
stri_unique(c('\u0105', stri_trans_nfkd('\u0105')))
unique(c('\u0105', stri_trans_nfkd('\u0105')))

stri_unique(c('gro\u00df', 'GROSS', 'Gro\u00df', 'Gross'), strength=1)


stringi documentation built on May 29, 2024, 8:16 a.m.

Related to stri_unique in stringi...