Order, sort, and find duplicates in XStringSet objects


Description

These generics order, rank, sort, and find duplicates in short read objects, including fastq-encoded qualities. srorder, srrank and srsort differ from the default functions rank, order and sort in that sorting is based on an internally-defined order rather than, e.g., the order implied by LC_COLLATE.

Usage

srorder(x, ...)
srrank(x, ...)
srsort(x, ...)
srduplicated(x, ...)

Arguments

x

The object to be sorted, ranked, ordered, or to have duplicates identified; see the examples below for objects for which methods are defined.

...

Additional arguments available for use by methods; usually ignored.

Details

Unlike sort and related functions, the implementation does not preserve the original relative order of duplicated elements (that is, the sort is not stable). Like duplicated, srduplicated marks one element in each set of duplicates as FALSE.

srrank settles ties using the “min” criterion described in rank, i.e., identical elements are ranked equal to the rank of the first occurrence of the sorted element.
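The “min” tie rule can be illustrated with base R alone. The sketch below calls rank on a plain character vector, not srrank on an XStringSet; the vector x is made up for illustration, and the internally-defined order used by srrank need not match the locale-based order that rank uses here.

```r
# Base R illustration of the "min" tie rule; srrank itself is not called.
# x is a hypothetical character vector standing in for a set of reads.
x <- c("ACGT", "AACC", "ACGT", "TTGA", "AACC")

# With ties.method = "min", every copy of a value receives the smallest
# rank that any of its copies would occupy in the sorted vector.
r <- rank(x, ties.method = "min")
r
# "AACC" occupies sorted positions 1 and 2, so both copies get rank 1;
# "ACGT" occupies positions 3 and 4, so both copies get rank 3;
# "TTGA" is alone at position 5.
```

Ranks 2 and 4 are skipped because ties take the minimum position rather than an average.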

The following methods are defined, in addition to methods described in class-specific documentation:

srsort

signature(x = "XStringSet"):

srorder

signature(x = "XStringSet"):

srduplicated

signature(x = "XStringSet"):

Apply srorder, srrank, srsort, srduplicated to XStringSet objects such as those returned by sread.

srsort

signature(x = "ShortRead"):

srorder

signature(x = "ShortRead"):

srduplicated

signature(x = "ShortRead"):

Apply srorder, srrank, srsort, srduplicated to the sread component of ShortRead and derived objects.

Value

The functions return the following values:

srorder

An integer vector the same length as x, containing the indices that will bring x into sorted order.

srrank

An integer vector the same length as x, containing the rank of each sequence when sorted.

srsort

An instance of x in sorted order.

srduplicated

A logical vector the same length as x indicating whether the indexed element is already present. Note that, like duplicated, subsetting x using the result returned by !srduplicated(x) includes one representative from each set of duplicates.
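How these return values relate to one another can be sketched with the base R counterparts order, sort and duplicated. This is an analogy only: the vector x is hypothetical, the sr* functions use their own internal order, and srduplicated may keep a different representative of each duplicate set than duplicated does, which always keeps the first occurrence.

```r
# Base R analogues of the return values described above (not the sr* functions).
x <- c("ACGT", "AACC", "ACGT", "TTGA", "AACC")

# order() returns indices; subsetting by them yields the sorted vector.
ord <- order(x)
sorted <- x[ord]

# duplicated() returns one logical per element; negating it keeps exactly
# one representative from each set of duplicates.
dup <- duplicated(x)
uniq <- x[!dup]

sum(dup)   # number of elements flagged as duplicates
uniq
```

Summing the logical vector, as in the package example sum(srduplicated(sread(rfq))), counts the redundant copies rather than the number of distinct sequences.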

Author(s)

Martin Morgan <mtmorgan@fhcrc.org>

Examples

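library(ShortRead)

## list the classes for which methods are defined
showMethods("srsort")
showMethods("srorder")
showMethods("srduplicated")

## read the example fastq file shipped with the package
sp <- SolexaPath(system.file('extdata', package='ShortRead'))
rfq <- readFastq(analysisPath(sp), pattern="s_1_sequence.txt")

## count duplicated reads, then sort the reads and their qualities
sum(srduplicated(sread(rfq)))
srsort(sread(rfq))
srsort(quality(rfq))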
