# Vector-comparison: Compare, order, tabulate vector-like objects

## Description

Generic functions and methods for comparing, ordering, and tabulating vector-like objects.

## Usage

```
## Element-wise (aka "parallel") comparison of 2 Vector objects
## ------------------------------------------------------------

pcompare(x, y)

## S4 method for signature 'Vector,Vector'
e1 == e2
## S4 method for signature 'Vector,ANY'
e1 == e2
## S4 method for signature 'ANY,Vector'
e1 == e2

## S4 method for signature 'Vector,Vector'
e1 <= e2
## S4 method for signature 'Vector,ANY'
e1 <= e2
## S4 method for signature 'ANY,Vector'
e1 <= e2

## S4 method for signature 'Vector,Vector'
e1 != e2
## S4 method for signature 'Vector,ANY'
e1 != e2
## S4 method for signature 'ANY,Vector'
e1 != e2

## S4 method for signature 'Vector,Vector'
e1 >= e2
## S4 method for signature 'Vector,ANY'
e1 >= e2
## S4 method for signature 'ANY,Vector'
e1 >= e2

## S4 method for signature 'Vector,Vector'
e1 < e2
## S4 method for signature 'Vector,ANY'
e1 < e2
## S4 method for signature 'ANY,Vector'
e1 < e2

## S4 method for signature 'Vector,Vector'
e1 > e2
## S4 method for signature 'Vector,ANY'
e1 > e2
## S4 method for signature 'ANY,Vector'
e1 > e2

## selfmatch()
## -----------

selfmatch(x, ...)

## duplicated() & unique()
## -----------------------

## S4 method for signature 'Vector'
duplicated(x, incomparables=FALSE, ...)

## S4 method for signature 'Vector'
unique(x, incomparables=FALSE, ...)

## %in%
## ----

## S4 method for signature 'Vector,Vector'
x %in% table
## S4 method for signature 'Vector,ANY'
x %in% table
## S4 method for signature 'ANY,Vector'
x %in% table

## findMatches() & countMatches()
## ------------------------------

findMatches(x, table, select=c("all", "first", "last"), ...)
countMatches(x, table, ...)

## sort()
## ------

## S4 method for signature 'Vector'
sort(x, decreasing=FALSE, na.last = NA, by)

## table()
## -------

## S4 method for signature 'Vector'
table(...)
```

## Arguments

`x, y, e1, e2, table`: Vector-like objects.

`incomparables`: The `duplicated` method for Vector objects does NOT support this argument. The `unique` method for Vector objects, which is implemented on top of `duplicated`, propagates this argument to its call to `duplicated`. See `?base::duplicated` and `?base::unique` for more information about this argument.

`select`: Only `select="all"` is supported at the moment. Note that you can use `match` if you want to do `select="first"`. Otherwise you're welcome to request this on the Bioconductor mailing list.

`decreasing, na.last`: See `?base::sort`.

`by`: A formula referencing the metadata columns by which to sort, e.g., `~ x + y` sorts by column “x”, breaking ties with column “y”.

`...`: A Vector object for `table` (the `table` method for Vector objects can only take one input object). Otherwise, extra arguments supported by specific methods. In particular:

• The default `selfmatch` method, which is implemented on top of `match`, propagates the extra arguments to its call to `match`.

• The `duplicated` method for Vector objects, which is implemented on top of `selfmatch`, accepts extra argument `fromLast` and propagates the other extra arguments to its call to `selfmatch`. See `?base::duplicated` for more information about this argument.

• The `unique` method for Vector objects, which is implemented on top of `duplicated`, propagates the extra arguments to its call to `duplicated`.

• The default `findMatches` and `countMatches` methods, which are implemented on top of `match` and `selfmatch`, propagate the extra arguments to their calls to `match` and `selfmatch`.

• The `sort` method for Vector objects, which is implemented on top of `order`, only accepts extra argument `na.last` and propagates it to its call to `order`.

## Details

Doing `pcompare(x, y)` on 2 vector-like objects `x` and `y` of length 1 must return an integer less than, equal to, or greater than zero if the single element in `x` is considered to be respectively less than, equal to, or greater than the single element in `y`. If `x` or `y` has a length other than 1, then they are typically expected to have the same length so that `pcompare(x, y)` can operate element-wise. In that case it returns an integer vector of the same length as `x` and `y` where the i-th element is the result of comparing `x[i]` and `y[i]`. If `x` and `y` don't have the same length and are not zero-length vectors, then the shorter one is first recycled to the length of the longer one. If one of them is a zero-length vector then `pcompare(x, y)` returns a zero-length integer vector.
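The rules above can be illustrated with a minimal base R sketch. The helper `pcompare_sketch` is hypothetical and not part of S4Vectors; it only mimics the documented contract for plain integer vectors and makes no claim about how any real `pcompare` method is implemented.

```r
## Hypothetical helper (not part of S4Vectors): mimics the pcompare()
## contract described above for plain integer vectors, using base R only.
pcompare_sketch <- function(x, y) {
    ## A zero-length input yields a zero-length integer vector.
    if (length(x) == 0L || length(y) == 0L)
        return(integer(0))
    ## Recycle the shorter vector to the length of the longer one.
    n <- max(length(x), length(y))
    x <- rep_len(x, n)
    y <- rep_len(y, n)
    ## Negative, zero, or positive depending on how x[i] compares to y[i].
    as.integer(sign(x - y))
}

pcompare_sketch(5L, 8L)                        # -1
pcompare_sketch(c(1L, 7L, 3L), c(4L, 7L, 2L))  # -1  0  1
pcompare_sketch(c(1L, 7L, 3L, 9L), 5L)         # -1  1 -1  1
pcompare_sketch(integer(0), 1:3)               # integer(0)
```

Only the sign of each result matters for the comparison operators, so a real method is free to return any negative or positive integer rather than -1 or 1.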

`selfmatch(x, ...)` is equivalent to `match(x, x, ...)`. This is actually how the default method is implemented. However note that `selfmatch(x, ...)` will typically be more efficient than `match(x, x, ...)` on vector-like objects for which a specific `selfmatch` method is implemented.
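As a small illustration of the stated equivalence, the following uses only base R `match` on an ordinary integer vector (the same `y` as in the Examples section). The variable names `sm` and `first_pos` are introduced here for illustration.

```r
y <- c(16L, -3L, -2L, 15L, 15L, 0L, 8L, 15L, -2L)

## match(y, y) maps each element to the position of its first occurrence,
## which is what selfmatch(y) is documented to be equivalent to.
sm <- match(y, y)
sm                                    # 1 2 3 4 4 6 7 4 3

## Positions whose element is its own first occurrence.
first_pos <- which(sm == seq_along(y))
first_pos                             # 1 2 3 4 6 7
```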

`findMatches` is an enhanced version of `match` which, by default (i.e. if `select="all"`), returns all the matches in a Hits object.

`countMatches` returns an integer vector of the length of `x` containing the number of matches in `table` for each element in `x`.
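To make the difference between "first match" and "all matches" concrete, here is a base R sketch of what `findMatches` and `countMatches` report, without using S4Vectors. The matrix `hits` is only a stand-in for a Hits object, and its column names are assumptions chosen for this illustration.

```r
y <- c(16L, -3L, -2L, 15L, 15L, 0L, 8L, 15L, -2L)
x <- c(unique(y), 999L)

## eq[i, j] is TRUE when x[i] equals y[j].
eq <- outer(x, y, "==")

## Stand-in for findMatches(x, y): one row per matching (i, j) pair,
## sorted by index in x, then by index in y.
hits <- which(eq, arr.ind = TRUE)
hits <- hits[order(hits[, 1], hits[, 2]), , drop = FALSE]
colnames(hits) <- c("x_index", "table_index")

## Stand-in for countMatches(x, y): number of matches per element of x.
counts <- as.integer(rowSums(eq))
counts                                # 1 1 2 3 1 1 0
```

The `outer` call compares every pair and is therefore quadratic in the input sizes; it is meant to show the semantics, not to be an efficient implementation.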

## Value

For `pcompare`: see Details section above.

For `selfmatch`: an integer vector of the same length as `x`.

For `duplicated`, `unique`, and `%in%`: see `?BiocGenerics::duplicated`, `?BiocGenerics::unique`, and `` ?`%in%` ``.

For `findMatches`: a Hits object by default (i.e. if `select="all"`).

For `countMatches`: an integer vector of the length of `x` containing the number of matches in `table` for each element in `x`.

For `sort`: see `?BiocGenerics::sort`.

For `table`: a 1D array of integer values promoted to the `"table"` class. See `?BiocGenerics::table` for more information.

## Note

The following notes are for developers who want to implement comparing, ordering, and tabulating methods for their own Vector subclass:

1. The 6 traditional binary comparison operators are: `==`, `!=`, `<=`, `>=`, `<`, and `>`. The S4Vectors package provides the following methods for these operators:

```
setMethod("==", c("Vector", "Vector"),
    function(e1, e2) { pcompare(e1, e2) == 0L }
)
setMethod("<=", c("Vector", "Vector"),
    function(e1, e2) { pcompare(e1, e2) <= 0L }
)
setMethod("!=", c("Vector", "Vector"),
    function(e1, e2) { !(e1 == e2) }
)
setMethod(">=", c("Vector", "Vector"),
    function(e1, e2) { e2 <= e1 }
)
setMethod("<", c("Vector", "Vector"),
    function(e1, e2) { !(e2 <= e1) }
)
setMethod(">", c("Vector", "Vector"),
    function(e1, e2) { !(e1 <= e2) }
)
```

With these definitions, the 6 binary operators work out-of-the-box on Vector objects for which `pcompare` works the expected way. If `pcompare` is not implemented, then it's enough to implement `==` and `<=` methods to have the 4 remaining operators (`!=`, `>=`, `<`, and `>`) work out-of-the-box.

2. The S4Vectors package provides no `pcompare` method for Vector objects. Specific `pcompare` methods need to be implemented for specific Vector subclasses (e.g. for Hits and Ranges objects). These specific methods must obey the rules described in the Details section above.

3. The `duplicated`, `unique`, and `%in%` methods for Vector objects are implemented on top of `selfmatch`, `duplicated`, and `match`, respectively, so they work out-of-the-box on Vector objects for which `selfmatch`, `duplicated`, and `match` work the expected way.

4. The default `findMatches` and `countMatches` methods are likewise implemented on top of `match` and `selfmatch`, so they work out-of-the-box on Vector objects for which `match` and `selfmatch` work the expected way.

5. However, since the default `selfmatch` method is itself implemented on top of `match`, having `match` work the expected way is actually enough to get `selfmatch`, `duplicated`, `unique`, `%in%`, `findMatches`, and `countMatches` to work out-of-the-box on Vector objects.

6. The `sort` method for Vector objects is implemented on top of `order`, so it works out-of-the-box on Vector objects for which `order` works the expected way.

7. The `table` method for Vector objects is implemented on top of `selfmatch`, `order`, and `as.character`, so it works out-of-the-box on a Vector object for which these three functions work the expected way.

8. The S4Vectors package provides no `match` or `order` methods for Vector objects. Specific methods need to be implemented for specific Vector subclasses (e.g. for Hits and Ranges objects).
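The layering described in notes 3 to 7 can be sketched in base R on an ordinary character vector. This is an illustration of the idea only, under the assumption that `match`, `order`, and `as.character` behave as expected; it is not the actual S4Vectors code, and all variable names are introduced here for illustration.

```r
x <- c("b", "a", "b", "c", "a", "b")

## selfmatch-like step: index of the first occurrence of each element.
sm <- match(x, x)

## duplicated: an element is a duplicate if its first occurrence is earlier.
dup <- sm != seq_along(x)

## unique: drop the duplicates.
ux <- x[!dup]

## %in%: an element is in the table if match() finds it.
is_in <- !is.na(match(c("a", "z"), x))

## table: count the elements per first occurrence, then order the unique
## values and label the counts with their character representation.
counts <- tabulate(sm, nbins = length(x))[!dup]
oo <- order(ux)
tab <- setNames(counts[oo], as.character(ux[oo]))
tab                                   # a = 2, b = 3, c = 1
```

Everything except the ordering step is derived from `match` alone, which is why a working `match` method (plus `order` for `sort` and `table`) covers most of the functions documented on this page.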

## Author(s)

Hervé Pagès

## See Also

• The Vector class.

• Hits-comparison for comparing and ordering hits.

• Vector-setops for set operations on vector-like objects.

• Vector-merge for merging vector-like objects.

• Ranges-comparison in the IRanges package for comparing and ordering ranges.

• `==` and `%in%` in the base package, and `BiocGenerics::match`, `BiocGenerics::duplicated`, `BiocGenerics::unique`, `BiocGenerics::order`, `BiocGenerics::sort`, `BiocGenerics::rank` in the BiocGenerics package for general information about the comparison/ordering operators and functions.

• The Hits class.

• `BiocGenerics::table` in the BiocGenerics package.

## Examples

```
## ---------------------------------------------------------------------
## A. SIMPLE EXAMPLES
## ---------------------------------------------------------------------

y <- c(16L, -3L, -2L, 15L, 15L, 0L, 8L, 15L, -2L)
selfmatch(y)

x <- c(unique(y), 999L)
findMatches(x, y)
countMatches(x, y)

## See ?`Ranges-comparison` for more examples (on Ranges objects). You
## might need to load the IRanges package first.

## ---------------------------------------------------------------------
## B. FOR DEVELOPERS: HOW TO IMPLEMENT THE BINARY COMPARISON OPERATORS
##    FOR YOUR Vector SUBCLASS
## ---------------------------------------------------------------------

## The answer is: don't implement them. Just implement pcompare() and the
## binary comparison operators will work out-of-the-box. Here is an
## example:

## (1) Implement a simple Vector subclass.

setClass("Raw", contains="Vector", representation(data="raw"))

setMethod("length", "Raw", function(x) length(x@data))

setMethod("[", "Raw",
    function(x, i, j, ..., drop) { x@data <- x@data[i]; x }
)

x <- new("Raw", data=charToRaw("AB.x0a-BAA+C"))
stopifnot(identical(length(x), 12L))
stopifnot(identical(x[7:3], new("Raw", data=charToRaw("-a0x."))))

## (2) Implement a "pcompare" method for Raw objects.

setMethod("pcompare", c("Raw", "Raw"),
    function(x, y) {as.integer(x@data) - as.integer(y@data)}
)

stopifnot(identical(which(x == x[1]), c(1L, 9L, 10L)))
stopifnot(identical(x[x < x[5]], new("Raw", data=charToRaw(".-+"))))
```

