| mixedSort | R Documentation |
Sort alphanumeric values, keeping numeric values in proper order
mixedSort(
x,
blanksFirst = TRUE,
na.last = NAlast,
keepNegative = FALSE,
keepInfinite = FALSE,
keepDecimal = FALSE,
ignore.case = TRUE,
useCaseTiebreak = TRUE,
honorFactor = FALSE,
sortByName = FALSE,
verbose = FALSE,
NAlast = TRUE,
...
)
x |
|
blanksFirst |
|
na.last |
|
keepNegative |
|
keepInfinite |
|
keepDecimal |
|
ignore.case |
|
useCaseTiebreak |
|
honorFactor |
|
sortByName |
|
verbose |
|
NAlast |
|
... |
additional parameters are sent to |
This function is a refactor of mixedsort(), a clever bit of
R coding from the gtools package. It was extended to make it slightly
faster and to handle special cases slightly differently.
It was driven by the need to sort gene symbols, miRNA symbols, chromosome
names, all with proper numeric order, for example:
test set: miR-12,miR-1,miR-122,miR-1b,mir-1a
gtools::mixedsort(): miR-122,miR-12,miR-1,miR-1a,mir-1b
mixedSort(): miR-1,miR-1a,miR-1b,miR-12,miR-122
By default, the function does not recognize negative numbers as negative;
instead it treats '-' as a delimiter, unless keepNegative=TRUE.
With keepDecimal=TRUE, this function attempts to maintain '.' as part of a decimal number, which can be problematic when sorting IP addresses, for example.
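The effect of treating '-' as a delimiter rather than a minus sign can be illustrated with a small base R sketch. The helper first_number() is hypothetical and is not part of jamba or gtools; it only extracts the first numeric chunk of each string, with or without a leading '-', to show why the two interpretations give opposite orders.

```r
# Hypothetical helper (not part of jamba): extract the first numeric chunk.
# keepNegative = TRUE treats a leading '-' as a minus sign;
# keepNegative = FALSE treats '-' as a delimiter and ignores it.
first_number <- function(x, keepNegative = FALSE) {
  pattern <- if (keepNegative) "-?[0-9]+" else "[0-9]+"
  as.numeric(regmatches(x, regexpr(pattern, x)))
}

x <- c("miR-12", "miR-1", "miR-122")

# '-' as delimiter: values are 12, 1, 122
x[order(first_number(x))]
# "miR-1" "miR-12" "miR-122"

# '-' as minus sign: values are -12, -1, -122
x[order(first_number(x, keepNegative = TRUE))]
# "miR-122" "miR-12" "miR-1"
```

The second result reproduces the reversed order that motivates treating '-' as a delimiter by default when sorting miRNA names.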
This function is a wrapper for mixedOrder(),
which does the work of defining the appropriate order.
The sort logic is roughly as follows:
1. Split each term into alternating chunks containing character
   or numeric substrings, split across columns in a matrix.
2. Apply appropriate ignore.case logic to the character substrings,
   effectively applying toupper() on substrings.
3. Define rank order of character substrings in each matrix column,
   maintaining ties to be resolved in subsequent columns.
4. Convert character to numeric ranks via factor intermediate,
   defined higher than the highest numeric substring value.
5. When ignore.case=TRUE and useCaseTiebreak=TRUE, an additional
   tiebreaker column is defined using the character substring values
   without applying toupper().
6. A final tiebreaker column is the input string itself, with toupper()
   applied when ignore.case=TRUE.
7. Apply order across all substring columns.
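The steps above can be sketched in base R. This is a minimal illustration, not the jamba implementation: the function name sketch_mixed_order() is hypothetical, it ignores keepNegative, keepDecimal, NA and factor handling, it omits the per-substring case tiebreak, and it assumes that terms with fewer chunks sort first. It uses method = "radix" so the result does not depend on the session locale.

```r
# Hypothetical sketch of the chunk-and-rank idea; not the jamba implementation.
sketch_mixed_order <- function(x, ignore.case = TRUE) {
  x <- as.character(x)
  if (length(x) == 0) return(integer(0))
  # Split each term into alternating numeric and non-numeric chunks.
  chunks <- regmatches(x, gregexpr("[0-9]+|[^0-9]+", x))
  n_col <- max(lengths(chunks))
  keys <- lapply(seq_len(n_col), function(j) {
    # Column j of the chunk matrix; terms with fewer chunks get NA.
    col <- vapply(chunks, function(ch) ch[j], character(1))
    is_num <- !is.na(col) & grepl("^[0-9]+$", col)
    is_chr <- !is.na(col) & !is_num
    key <- rep(-1, length(col))  # assumption: missing chunks sort first
    key[is_num] <- as.numeric(col[is_num])
    # Apply ignore.case logic to character chunks only.
    chr <- if (ignore.case) toupper(col[is_chr]) else col[is_chr]
    # Rank character chunks via a factor, offset above the largest number.
    lev <- sort(unique(chr), method = "radix")
    key[is_chr] <- max(c(0, key[is_num])) + as.integer(factor(chr, levels = lev))
    key
  })
  # Final tiebreaker: the input string itself, upper-cased when ignore.case.
  tie <- if (ignore.case) toupper(x) else x
  # Apply order across all chunk columns, then the tiebreaker.
  do.call(order, c(keys, list(tie, method = "radix")))
}

x <- c("miR-12", "miR-1", "miR-122", "miR-1b", "miR-1a", "miR-2")
x[sketch_mixed_order(x)]
# "miR-1" "miR-1a" "miR-1b" "miR-2" "miR-12" "miR-122"
```

Because character chunks are ranked above every numeric value in the same column, a term such as "chrX" falls after "chr10" in this sketch.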
Therefore, some expected behaviors:
When ignore.case=TRUE and useCaseTiebreak=TRUE (default for both)
the input data is ordered without regard to case, then the tiebreaker
applies case-specific sort criteria to the final product. This logic
is very close to default sort() except for the handling of internal
numeric values inside each string.
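The two-stage idea (order without regard to case, then break ties using case) can be shown with plain base R order(). This sketch uses whole strings rather than substrings and does not call mixedSort(). Which case wins a tie depends on the collation in use; here method = "radix" is an assumption made for reproducibility, and it places uppercase before lowercase, which may differ from what mixedSort() returns.

```r
x <- c("Cnot9", "Cnot8", "CNOT9")

# Case-insensitive key only: "Cnot9" and "CNOT9" tie, so input order is kept.
x[order(toupper(x), method = "radix")]
# "Cnot8" "Cnot9" "CNOT9"

# Case-insensitive key first, then the original string as a tiebreaker.
# With radix (C-locale) ordering, uppercase sorts before lowercase.
x[order(toupper(x), x, method = "radix")]
# "Cnot8" "CNOT9" "Cnot9"
```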
vector of values from argument x, ordered by
mixedOrder(). The output class should match class(x).
Other jam sort functions:
mixedOrder(),
mixedSortDF(),
mixedSorts(),
mmixedOrder()
# compare base sort() with mixedSort()
x <- c("miR-12", "miR-1", "miR-122", "miR-1b", "miR-1a", "miR-2")
sort(x)
mixedSort(x)
# test honorFactor
mixedSort(factor(c("Cnot9", "Cnot8", "Cnot10")))
mixedSort(factor(c("Cnot9", "Cnot8", "Cnot10")), honorFactor=TRUE)
# test ignore.case
mixedSort(factor(c("Cnot9", "Cnot8", "CNOT9", "Cnot10")))
mixedSort(factor(c("CNOT9", "Cnot8", "Cnot9", "Cnot10")))
mixedSort(factor(c("Cnot9", "Cnot8", "CNOT9", "Cnot10")), ignore.case=FALSE)
mixedSort(factor(c("Cnot9", "Cnot8", "CNOT9", "Cnot10")), ignore.case=TRUE)
# test useCaseTiebreak
mixedSort(factor(c("Cnot9", "Cnot8", "CNOT9", "Cnot10")), useCaseTiebreak=TRUE)
mixedSort(factor(c("CNOT9", "Cnot8", "Cnot9", "Cnot10")), useCaseTiebreak=FALSE)