
sort: Sort Strings

Description

The sort method for objects of class character (sort.character) uses the locale-sensitive Unicode collation algorithm to arrange the strings in a vector with respect to a chosen lexicographic order.

xtfrm2 and [DEPRECATED] xtfrm generate an integer vector that sorts in the same way as its input, and hence can be used in conjunction with order or rank.

Usage

xtfrm2(x, ...)

## Default S3 method:
xtfrm2(x, ...)

## S3 method for class 'character'
xtfrm2(
  x,
  ...,
  locale = NULL,
  strength = 3L,
  alternate_shifted = FALSE,
  french = FALSE,
  uppercase_first = NA,
  case_level = FALSE,
  normalisation = FALSE,
  numeric = FALSE
)

xtfrm(x)

## Default S3 method:
xtfrm(x)

## S3 method for class 'character'
xtfrm(x)

## S3 method for class 'character'
sort(
  x,
  ...,
  decreasing = FALSE,
  na.last = NA,
  locale = NULL,
  strength = 3L,
  alternate_shifted = FALSE,
  french = FALSE,
  uppercase_first = NA,
  case_level = FALSE,
  normalisation = FALSE,
  numeric = FALSE
)

Arguments

| | |
|----|----|
| x | character vector whose elements are to be sorted |
| ... | further arguments passed to other methods |
| locale | NULL or "" for the default locale (see stri_locale_get) or a single string with a locale identifier, see stri_locale_list |
| strength | see stri_opts_collator |
| alternate_shifted | see stri_opts_collator |
| french | see stri_opts_collator |
| uppercase_first | see stri_opts_collator |
| case_level | see stri_opts_collator |
| normalisation | see stri_opts_collator |
| numeric | see stri_opts_collator |
| decreasing | single logical value; if FALSE, the ordering is nondecreasing (weakly increasing) |
| na.last | single logical value; if TRUE, then missing values are placed at the end; if FALSE, they are put at the beginning; if NA, then they are removed from the output altogether |
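
The effect of decreasing and na.last can be illustrated with a minimal sketch. It assumes that stringx is installed and attached; the input vector is hypothetical, and the locale is fixed to "en_US" only to make the result independent of the session's default locale.

```r
library("stringx")  # assumption: stringx is installed and attached

x <- c("b", NA, "a", "c")  # hypothetical input with one missing value

sort(x, locale="en_US")                   # default na.last=NA: the NA is removed
sort(x, na.last=TRUE, locale="en_US")     # the NA is placed at the end
sort(x, na.last=FALSE, locale="en_US")    # the NA is placed at the beginning
sort(x, decreasing=TRUE, locale="en_US")  # nonincreasing order, NA removed
```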

Details

What 'xtfrm' stands for the current author does not know, but would appreciate someone's enlightening him.

Value

sort.character returns a character vector, with only the names attribute preserved. Note that the output vector may be shorter than the input one.
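
As a small sketch of the statement above, the following compares the lengths of a named input and its sorted output. The element names are hypothetical, and stringx is assumed to be installed and attached.

```r
library("stringx")  # assumption: stringx is installed and attached

x <- c(first="b", second=NA, third="a")  # hypothetical named input
y <- sort(x, locale="en_US")             # default na.last=NA

length(x)  # 3
length(y)  # 2, because the missing value was removed
names(y)   # inspect the names attribute of the result
```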

xtfrm2.character and xtfrm.character return an integer vector; most attributes are preserved.

Differences from Base R

Replacements for the default S3 methods sort and xtfrm for character vectors, implemented with stri_sort and stri_rank.
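
stri_sort and stri_rank come from the stringi package, so the behaviour can also be explored by calling them directly. The sketch below is an illustration only: it is an assumption, not something stated on this page, that these calls mirror exactly what sort.character and xtfrm2.character do internally. The input vector is hypothetical.

```r
library("stringi")  # assumption: stringi is installed

x <- c("a1", "a100", "a10", "a11", "a99", "a1")  # hypothetical input

# collation options such as numeric and locale are passed on to stri_opts_collator
stri_sort(x, numeric=TRUE, locale="en_US")
## [1] "a1"   "a1"   "a10"  "a11"  "a99"  "a100"

# integer ranks; tied strings share the smallest rank
stri_rank(x, numeric=TRUE, locale="en_US")
## [1] 1 6 3 4 5 1
```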

Author(s)

Marek Gagolewski

See Also

The official online manual of stringx at https://stringx.gagolewski.com/

Related function(s): strcoll

Examples

x <- c("a1", "a100", "a101", "a1000", "a10", "a10", "a11", "a99", "a10", "a1")
base::sort.default(x)   # lexicographic sort
##  [1] "a1"    "a1"    "a10"   "a10"   "a10"   "a100"  "a1000" "a101"  "a11"  
## [10] "a99"
sort(x, numeric=TRUE)   # calls stringx:::sort.character
##  [1] "a1"    "a1"    "a10"   "a10"   "a10"   "a11"   "a99"   "a100"  "a101" 
## [10] "a1000"
xtfrm2(x, numeric=TRUE)  # calls stringx:::xtfrm2.character
##  [1]  1  8  9 10  3  3  6  7  3  1
rank(xtfrm2(x, numeric=TRUE), ties.method="average")  # ranks with averaged ties
##  [1]  1.5  8.0  9.0 10.0  4.0  4.0  6.0  7.0  4.0  1.5
order(xtfrm2(x, numeric=TRUE))    # ordering permutation
##  [1]  1 10  5  6  9  7  8  2  3  4
x[order(xtfrm2(x, numeric=TRUE))] # equivalent to sort()
##  [1] "a1"    "a1"    "a10"   "a10"   "a10"   "a11"   "a99"   "a100"  "a101" 
## [10] "a1000"
# order a data frame w.r.t. decreasing ids and increasing vals
d <- data.frame(vals=round(runif(length(x)), 1), ids=x)
d[order(-xtfrm2(d[["ids"]], numeric=TRUE), d[["vals"]]), ]
##    vals   ids
## 4   0.9 a1000
## 3   0.4  a101
## 2   0.8  a100
## 8   0.9   a99
## 7   0.5   a11
## 6   0.0   a10
## 9   0.6   a10
## 5   0.9   a10
## 1   0.3    a1
## 10  0.5    a1


gagolews/stringx documentation built on Jan. 15, 2025, 9:46 p.m.