mixedSorts: sort alphanumeric values within a list format

mixedSorts R Documentation

Description

Sort alphanumeric values within each vector of a list.

Usage

mixedSorts(
  x,
  blanksFirst = TRUE,
  na.last = NAlast,
  keepNegative = FALSE,
  keepInfinite = TRUE,
  keepDecimal = FALSE,
  ignore.case = TRUE,
  useCaseTiebreak = TRUE,
  sortByName = FALSE,
  na.rm = FALSE,
  verbose = FALSE,
  NAlast = TRUE,
  honorFactor = TRUE,
  xclass = NULL,
  indent = 0,
  debug = FALSE,
  ...
)

Arguments

x

list of vectors, possibly nested; each vector is sorted.

blanksFirst

logical whether to order blank entries before entries containing a value.

na.last

logical indicating whether to move NA entries to the end of the sort.

keepNegative

logical whether to keep '-' associated with adjacent numeric values, in order to sort them as negative values.

keepInfinite

logical whether to allow "Inf" to be considered a numeric infinite value.

keepDecimal

logical whether to keep the decimal in numbers, sorting as a true number and not as a version number. By default keepDecimal=FALSE, which treats "." as a delimiter between version fields, so "v1.30" is ordered before "v1.200" because 30 is less than 200. When keepDecimal=TRUE, the numeric sort considers the values "1.2" and "1.3", so "v1.200" is ordered before "v1.30".

ignore.case

logical whether to ignore uppercase and lowercase characters when defining the sort order. Note that when x is a factor, the factor levels are converted using unique(toupper(levels(x))), so the values in x are sorted by factor level.

useCaseTiebreak

logical indicating whether to break ties when ignore.case=TRUE, using mixed case as a tiebreaker.

sortByName

logical whether to sort the vector x by names(x) instead of sorting by x itself.

verbose

logical whether to print verbose output.

NAlast

logical, deprecated in favor of argument na.last for consistency with base::sort().

xclass

character vector of classes in x, used for slight optimization to re-use this vector if it has already been defined for x. When NULL it is created within this function.

indent

numeric used only when verbose=TRUE to determine the number of spaces indented for verbose output, passed to printDebug().

...

additional parameters are passed to mixedOrder().
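The two readings that keepDecimal distinguishes can be illustrated with base R alone. This is a minimal sketch, not jamba's implementation: it uses numeric_version() for the version-style reading and as.numeric() for the decimal-style reading, and the variable names are chosen only for this example.

```r
# Two labels whose numeric part can be read either way.
x <- c("v1.200", "v1.30")
num <- sub("^v", "", x)   # strip the leading "v": "1.200" "1.30"

# Version-style reading: "." separates integer fields, so 30 < 200.
version_sorted <- x[order(numeric_version(num))]

# Decimal-style reading: the whole string is one number, so 1.2 < 1.3.
decimal_sorted <- x[order(as.numeric(num))]

print(version_sorted)
print(decimal_sorted)
```

The two orderings are the reverse of each other, which is why the choice matters for labels such as software versions.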

Details

This function is an extension to mixedSort() to sort each vector in a list. It applies the sort to the whole unlisted vector then splits back into list form.

In the event the input is a nested list of lists, only the first level of list structure is maintained in the output data. For more information, see rlengths() which calculates the recursive nested list sizes. An exception is when the data contained in x represents multiple classes, see below.

When data in x represents multiple classes, for example character and factor, the mechanism is slightly different and not as well-optimized for large x. The method uses rapply(x, how="replace", mixedSort), which recursively calls mixedSort() on each vector and therefore returns data in the same nested list structure as provided in x.
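The structure-preserving behavior of rapply(how="replace") can be seen with base R alone. The sketch below substitutes base::sort() for mixedSort() so that it runs without jamba; the list named nested is a made-up example.

```r
# A nested list holding vectors of different classes.
nested <- list(
   A = c(3L, 1L, 2L),
   B = list(
      C = c("b", "a"),
      D = c(8L, 4L)
   )
)

# how="replace" applies the function to every leaf vector and
# puts each result back in place, keeping the nesting intact.
# base::sort() stands in here for mixedSort().
sorted <- rapply(nested, f = sort, how = "replace")
str(sorted)
```

Because each leaf is sorted by a separate function call, the cost grows with the number of vectors, which is consistent with the note above that this path is less optimized for large inputs.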

When data in x represents only one class, the data is flattened with unlist() into one large vector, which is sorted with mixedSort(), then split back into a list structure matching the input x.
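The flatten, sort once, then split approach can be sketched in base R as follows. This is an illustration under stated assumptions rather than the package's code: the ordering step is a simplified stand-in for mixedSort() that handles only values made of letters followed by digits, and the names flat, group, prefix and number are hypothetical.

```r
# Example list of alphanumeric strings.
xL <- list(a = c("e30", "e6", "b2"), b = c("a10", "a9"))

# Flatten to one vector and remember which list element each value came from.
flat <- unlist(xL, use.names = FALSE)
group <- factor(rep(names(xL), lengths(xL)), levels = names(xL))

# Simplified stand-in for mixedSort(): order by the letter prefix,
# then by the trailing integer, across the whole flattened vector.
prefix <- sub("[0-9]+$", "", flat)
number <- as.integer(sub("^[^0-9]+", "", flat))
ord <- order(prefix, number)

# split() keeps the order of values within each group, so a single
# global sort leaves every list element sorted.
sorted <- split(flat[ord], group[ord])
print(sorted)
```

Sorting once over the flattened vector avoids one function call per list element, which is where the speed advantage for long lists comes from.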

See Also

Other jam sort functions: mixedOrder(), mixedSortDF(), mixedSort(), mmixedOrder()

Other jam string functions: asSize(), breaksByVector(), cPasteSU(), cPasteS(), cPasteUnique(), cPasteU(), cPaste(), fillBlanks(), formatInt(), gsubOrdered(), gsubs(), makeNames(), mixedOrder(), mixedSortDF(), mixedSort(), mmixedOrder(), nameVectorN(), nameVector(), padInteger(), padString(), pasteByRowOrdered(), pasteByRow(), sizeAsNum(), tcount(), ucfirst(), uniques()

Other jam list functions: cPasteSU(), cPasteS(), cPasteUnique(), cPasteU(), cPaste(), heads(), jam_rapply(), list2df(), mergeAllXY(), rbindList(), relist_named(), rlengths(), sclass(), sdim(), uniques(), unnestList()

Examples

# set up an example list of mixed alpha-numeric strings
set.seed(12);
x <- paste0(sample(letters, size=52, replace=TRUE), rep(1:30, length.out=52));
x;
# split into a list as an example
xL <- split(x, rep(letters[1:5], c(6,7,5,4,4)));
xL;

# now run mixedSorts(xL)
# Notice "e6" is sorted before "e30"
mixedSorts(xL)

# for fun, compare to lapply(xL, sort)
# Notice "e6" is sorted after "e30"
lapply(xL, sort)

# test super-long list
xL10k <- rep(xL, length.out=10000);
names(xL10k) <- as.character(seq_along(xL10k));
print(head(mixedSorts(xL10k), 10))

# Now make some list vectors into factors
xF <- xL;
xF$c <- factor(xL$c)
# for fun, reverse the levels
xF$c <- factor(xF$c,
   levels=rev(levels(xF$c)))
xF
mixedSorts(xF)

# test super-long list that includes factor vectors
xF10k <- rep(xF, length.out=10000);
names(xF10k) <- as.character(seq_along(xF10k));
print(head(mixedSorts(xF10k), 10))

# Make a nested list
set.seed(1);
l1 <- list(
   A=sample(nameVector(11:13, rev(letters[11:13]))),
   B=list(
      C=sample(nameVector(4:8, rev(LETTERS[4:8]))),
      D=sample(nameVector(LETTERS[2:5], rev(LETTERS[2:5])))
   )
)
l1;
# The output is a nested list with the same structure
mixedSorts(l1, verbose=TRUE);
mixedSorts(l1, sortByName=TRUE, verbose=TRUE);

# Make a nested list with two sub-lists
set.seed(1);
l2 <- list(
   A=list(
      E=sample(nameVector(11:13, rev(letters[11:13])))
   ),
   B=list(
      C=sample(nameVector(4:8, rev(LETTERS[4:8]))),
      D=sample(nameVector(LETTERS[2:5], rev(LETTERS[2:5])))
   )
)
l2;
# The output is a nested list with the same structure
mixedSorts(l2);
mixedSorts(l2, sortByName=TRUE);

# when one entry is missing
L0 <- list(A=3:1,
   B=list(C=c(1:3,NA,0),
      D=LETTERS[c(4,5,2)],
      E=NULL));
L0
mixedSorts(L0)
mixedSorts(L0, na.rm=TRUE)


jmw86069/jamba documentation built on Oct. 9, 2024, 10:52 a.m.