xtfrm2 | R Documentation |
The sort method for objects of class character (sort.character) uses the locale-sensitive Unicode collation algorithm to arrange strings in a vector with regard to a chosen lexicographic order.
xtfrm2 and [DEPRECATED] xtfrm generate an integer vector that sorts in the same way as its input, and hence can be used in conjunction with order or rank.
xtfrm2(x, ...)
## Default S3 method:
xtfrm2(x, ...)
## S3 method for class 'character'
xtfrm2(
x,
...,
locale = NULL,
strength = 3L,
alternate_shifted = FALSE,
french = FALSE,
uppercase_first = NA,
case_level = FALSE,
normalisation = FALSE,
numeric = FALSE
)
xtfrm(x)
## Default S3 method:
xtfrm(x)
## S3 method for class 'character'
xtfrm(x)
## S3 method for class 'character'
sort(
x,
...,
decreasing = FALSE,
na.last = NA,
locale = NULL,
strength = 3L,
alternate_shifted = FALSE,
french = FALSE,
uppercase_first = NA,
case_level = FALSE,
normalisation = FALSE,
numeric = FALSE
)
x |
character vector whose elements are to be sorted |
... |
further arguments passed to other methods |
locale |
|
strength |
see |
alternate_shifted |
see |
french |
see |
uppercase_first |
see |
case_level |
see |
normalisation |
see |
numeric |
see |
decreasing |
single logical value; if |
na.last |
single logical value; if |
The current author does not know what 'xtfrm' stands for, but would appreciate being enlightened.
sort.character returns a character vector, with only the names attribute preserved. Note that the output vector may be shorter than the input one.
xtfrm2.character and xtfrm.character return an integer vector; most attributes are preserved.
Replacements for the default S3 methods sort and xtfrm for character vectors implemented with stri_sort and stri_rank.
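As a rough illustration of the two stringi functions named above, the following sketch calls them directly. It assumes the stringi package is installed, and it shows only the underlying calls, not how stringx wraps them internally.

```r
library(stringi)  # assumption: the stringi package is installed

x <- c("a1", "a100", "a10", "a2")

# numeric = TRUE is a collator option: runs of digits are compared by value
sorted <- stri_sort(x, numeric = TRUE)
ranks <- stri_rank(x, numeric = TRUE)

print(sorted)
print(ranks)

# the integer ranks sort in the same way as the strings themselves,
# so they can be handed over to order()
print(x[order(ranks)])
```

Because the ranks are plain integers, they can be combined with other keys in a single call to order, which is the pattern the Examples rely on.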
Collation in different locales is difficult and non-portable across platforms [fixed here – using services provided by ICU]
Overloading xtfrm.character has no effect in R, because S3 method dispatch is done internally with hard-coded support for character arguments. Thus, we needed to replace the generic xtfrm with one that calls UseMethod [fixed here]
xtfrm does not support customisation of the linear ordering relation it is based upon [fixed by introducing the ... argument to the new generic, xtfrm2]
Neither order, rank, nor sort.list is a generic, therefore they would have to be rewritten from scratch to allow the inclusion of our patches; interestingly, order even calls xtfrm, but only for classed objects [not fixed here – see Examples for a workaround]
xtfrm for objects of type character does not preserve the names attribute (but does so for numeric) [fixed here]
sort seems to preserve only the names attribute, which makes sense if na.last is NA, because the resulting vector might be shorter [not fixed here as it would break compatibility with other sorting methods]
Note that sort by default removes missing values altogether, whereas order has na.last=TRUE [not fixed here as it would break compatibility with other sorting methods]
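Two of the base R behaviours listed above can be checked without stringx. The sketch below uses base R only. The class name natural_id and its xtfrm method are hypothetical, introduced purely for illustration.

```r
# (1) order() consults xtfrm() only for classed objects.
# 'natural_id' is a made-up class; its method sorts by the numeric suffix.
xtfrm.natural_id <- function(x) as.integer(sub("^a", "", unclass(x)))

ids <- c("a10", "a2", "a1")
classed <- structure(ids, class = "natural_id")

ord_plain <- base::order(ids)        # plain character vector: method not consulted
ord_classed <- base::order(classed)  # classed object: xtfrm.natural_id is used

print(ord_plain)
print(ord_classed)

# (2) sort() drops missing values by default (na.last = NA),
# whereas order() keeps them and puts them last (na.last = TRUE).
y <- c("b", NA, "a")
sorted_y <- base::sort(y)
ordered_y <- y[base::order(y)]

print(sorted_y)
print(ordered_y)
```

The first part suggests one possible alternative to wrapping the input in xtfrm2 by hand: attaching a class makes order pick up a custom key, at the cost of carrying that class around.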
The official online manual of stringx at https://stringx.gagolewski.com/
Related function(s): strcoll
library("stringx")  # the calls below rely on the stringx methods being attached
x <- c("a1", "a100", "a101", "a1000", "a10", "a10", "a11", "a99", "a10", "a1")
base::sort.default(x) # lexicographic sort
sort(x, numeric=TRUE) # calls stringx:::sort.character
xtfrm2(x, numeric=TRUE) # calls stringx:::xtfrm2.character
rank(xtfrm2(x, numeric=TRUE), ties.method="average") # ranks with averaged ties
order(xtfrm2(x, numeric=TRUE)) # ordering permutation
x[order(xtfrm2(x, numeric=TRUE))] # equivalent to sort()
# order a data frame w.r.t. decreasing ids and increasing vals
d <- data.frame(vals=round(runif(length(x)), 1), ids=x)
d[order(-xtfrm2(d[["ids"]], numeric=TRUE), d[["vals"]]), ]