sortSamples: Sort biological sample labels for experimental design

sortSamples R Documentation

Sort biological sample labels for experimental design

Description

Sort biological sample labels for experimental design

Usage

sortSamples(
  x,
  controlTerms = c("WT|wildtype", "(^|[-_ ])(NT|NTC)($|[-_ ]|[0-9])", "ETOH",
    "control|ctrl|ctl", "Vehicle|veh", "none|empty|blank", "scramble", "ttx", "PBS",
    "knockout", "mutant"),
  sortFunc = jamba::mixedSort,
  preControlTerms = NULL,
  postControlTerms = NULL,
  ignore.case = TRUE,
  boundary = TRUE,
  perl = boundary,
  keepFactorsAsIs = TRUE,
  ...
)

Arguments

x

character vector or factor

controlTerms

vector of regular expression patterns used to determine control terms, where the patterns are matched and returned in order.

preControlTerms

vector or NULL, optional control terms or regular expressions to use before the controlTerms above. This argument is used as a convenient prefix to the default terms.

postControlTerms

vector or NULL, optional control terms or regular expressions to use after the controlTerms above. This argument is used as a convenient suffix to the default terms.

ignore.case

logical passed to jamba::provigrep() indicating whether to ignore case when matching patterns.

boundary

logical indicating whether to require a word boundary at either the start or end of each control term. When TRUE, perl=TRUE is used by default, and either a Perl word boundary or an underscore "_" is accepted as the boundary.

perl

logical indicating whether to use Perl regular expression pattern matching.

keepFactorsAsIs

logical indicating whether to maintain factor level order when x is supplied as a factor. When x is a factor and keepFactorsAsIs=TRUE, only sort(x) is returned, which follows the existing factor level order.

...

additional arguments are ignored.
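
The boundary argument described above accepts an underscore as an alternative to a Perl word boundary. The following base R sketch illustrates why that matters: "_" counts as a word character, so a plain \b does not match next to it. The patterns and labels here are assumptions chosen for illustration, not the patterns the function builds internally.

```r
# Illustrative only: not the package's internal pattern construction.
labels <- c("KO_WT_1", "WT", "WTX_2", "NWT")

# A plain Perl word boundary misses "KO_WT_1", because "_" is a word character
plainBoundary <- grepl("\\bWT\\b", labels, perl=TRUE)

# Allowing an underscore as an alternative delimiter recovers that label,
# while still rejecting "WTX_2" and "NWT" where WT is part of a longer word
boundaryOrUnderscore <- grepl("(\\b|_)WT(\\b|_)", labels, perl=TRUE)

print(data.frame(labels, plainBoundary, boundaryOrUnderscore))
```

This sketch requires a delimiter on both sides of the term, which is stricter than the "either the start or end" behavior described for boundary.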

Details

This function sorts a vector of sample labels using common heuristics that place typical control group terms before test groups. For example, "Vehicle" is returned before "Treatment" because "Vehicle" matches a recognized control term.
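
The control-first heuristic can be sketched in base R as follows. The helper name controlFirstSketch() and its shortened term list are hypothetical, and this is not the splicejam implementation, which relies on jamba::provigrep() and jamba::mixedSort(). It only shows the idea of ranking labels by the first control pattern they match.

```r
# Hypothetical helper, base R only; not the splicejam implementation.
controlFirstSketch <- function(x,
   controlTerms=c("WT|wildtype", "control|ctrl", "Vehicle|veh")) {
   # labels matching no pattern receive the last rank
   termRank <- rep(length(controlTerms) + 1, length(x))
   # loop in reverse so that earlier patterns overwrite later ones
   for (i in rev(seq_along(controlTerms))) {
      termRank[grepl(controlTerms[i], x, ignore.case=TRUE)] <- i
   }
   # order by pattern rank, then alphabetically within each rank
   x[order(termRank, x)]
}

sketchResult <- controlFirstSketch(c("Treatment", "Vehicle", "Drug", "Control"))
print(sketchResult)
```

Here "Control" precedes "Vehicle" because its pattern appears earlier in the assumed controlTerms, and the two non-control labels follow.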

It also uses jamba::mixedSort() for proper alphanumeric sorting, so that, for example, "Time_5hr" is sorted before "Time_12hr".
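
To see why alphanumeric sorting matters, the base R sketch below contrasts plain character sorting with ordering by the embedded number. It is a simplified illustration under the assumption that each label contains a single run of digits. jamba::mixedSort() handles far more general cases and is not reproduced here.

```r
x <- c("Time_12hr", "Time_5hr", "Time_1hr")

# Plain character sorting compares character by character, so "12" lands
# before "5"; method="radix" uses C-locale ordering for reproducibility
plainSorted <- sort(x, method="radix")

# Sketch: extract the first run of digits and order by its numeric value
numericPart <- as.numeric(sub("^[^0-9]*([0-9]+).*$", "\\1", x))
numericSorted <- x[order(numericPart)]

print(plainSorted)
print(numericSorted)
```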

Value

character vector ordered such that labels matching control terms appear before non-control labels.

See Also

Other jam string functions: escapeWhitespaceRegexp(), strsplitOrdered()

Other jam RNA-seq functions: assignGRLexonNames(), closestExonToJunctions(), combineGRcoverage(), defineDetectedTx(), detectedTxInfo(), exoncov2polygon(), flattenExonsBy(), getGRcoverageFromBw(), groups2contrasts(), internal_junc_score(), makeTx2geneFromGtf(), make_ref2compressed(), prepareSashimi(), runDiffSplice(), spliceGR2junctionDF()

Examples

# the defaults perform well for clear descriptors
sortSamples(c("Trt_12h", "Trt_9h", "Trt_1h", "Trt_9h", "Vehicle"));

# custom terms can be added before the usual control terms
sortSamples(c("Trt_12h", "Trt_9h", "Trt_1h", "Trt_9h", "Fixated", "Vehicle"),
   preControlTerms="fixate");

# custom terms can be added after the usual control terms
sortSamples(c("Trt_12h", "Trt_9h", "Trt_1h", "Trt_9h", "Fixated", "Vehicle"),
   postControlTerms="fixate");


jmw86069/splicejam documentation built on April 21, 2024, 4:57 p.m.