Finding the nearest genomic range neighbor
Description
The nearest, precede, follow, distance and distanceToNearest methods for GenomicRanges objects and subclasses.
Usage
## S4 method for signature 'GenomicRanges,GenomicRanges'
precede(x, subject, select=c("arbitrary", "all"), ignore.strand=FALSE)
## S4 method for signature 'GenomicRanges,missing'
precede(x, subject, select=c("arbitrary", "all"), ignore.strand=FALSE)
## S4 method for signature 'GenomicRanges,GenomicRanges'
follow(x, subject, select=c("arbitrary", "all"), ignore.strand=FALSE)
## S4 method for signature 'GenomicRanges,missing'
follow(x, subject, select=c("arbitrary", "all"), ignore.strand=FALSE)
## S4 method for signature 'GenomicRanges,GenomicRanges'
nearest(x, subject, select=c("arbitrary", "all"), ignore.strand=FALSE)
## S4 method for signature 'GenomicRanges,missing'
nearest(x, subject, select=c("arbitrary", "all"), ignore.strand=FALSE)
## S4 method for signature 'GenomicRanges,GenomicRanges'
distanceToNearest(x, subject, ignore.strand=FALSE, ...)
## S4 method for signature 'GenomicRanges,missing'
distanceToNearest(x, subject, ignore.strand=FALSE, ...)
## S4 method for signature 'GenomicRanges,GenomicRanges'
distance(x, y, ignore.strand=FALSE, ...)

Arguments
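x
The query GenomicRanges instance.
subject
The subject GenomicRanges instance within which the nearest neighbors are found. Can be missing, in which case x is also used as the subject; ranges in x then do not select themselves.
y
For the distance method, the GenomicRanges instance to which distances from x are computed.
select
Logic for handling ties. By default, all methods select a single interval (select="arbitrary"). When select="all", a Hits object containing all qualifying ranges is returned.
ignore.strand
A logical indicating whether strand should be ignored. The default (FALSE) takes strand into account; when TRUE, all ranges are treated as if on the + strand.
...
Additional arguments for methods.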
Details
nearest: Performs conventional nearest neighbor finding. Returns an integer vector containing the index of the nearest neighbor range in subject for each range in x. If there is no nearest neighbor, NA is returned. For details of the algorithm see the man page in IRanges, ?nearest.
precede: For each range in x, precede returns the index of the range in subject that is directly preceded by the range in x. Overlapping ranges are excluded. NA is returned when there are no qualifying ranges in subject.
follow: The opposite of precede, follow returns the index of the range in subject that is directly followed by the range in x. Overlapping ranges are excluded. NA is returned when there are no qualifying ranges in subject.
Orientation and Strand: The relevant orientation for precede and follow is 5' to 3', consistent with the direction of translation. Because positional numbering along a chromosome is from left to right and transcription takes place from 5' to 3', precede and follow can appear to have 'opposite' behavior on the + and - strand. Using positions 5 and 6 as an example, 5 precedes 6 on the + strand but follows 6 on the - strand.
A range with strand * can be compared to ranges on either the + or - strand. Below we outline the priority when ranges on multiple strands are compared. When ignore.strand=TRUE all ranges are treated as if on the + strand.
x on + strand can match to ranges on both + and * strands. In the case of a tie the first range by order is chosen.
x on - strand can match to ranges on both - and * strands. In the case of a tie the first range by order is chosen.
x on * strand can match to ranges on any of +, - or * strands. In the case of a tie the first range by order is chosen.
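The orientation rules above can be illustrated with a small base R sketch. It is not the GenomicRanges implementation: precede_sketch is a hypothetical helper written for this illustration, it handles a single query range on the + or - strand only (a * query is left out), and it assumes all ranges lie on the same sequence.

```r
# Hypothetical helper (not part of GenomicRanges): index of the subject range
# that one stranded query range precedes, or NA if no range qualifies.
precede_sketch <- function(q_start, q_end, q_strand, s_start, s_end, s_strand) {
  stopifnot(q_strand %in% c("+", "-"))
  # A stranded query may match subjects on its own strand or on "*".
  compatible <- s_strand %in% c(q_strand, "*")
  if (q_strand == "+") {
    # 5' to 3' runs left to right: the subject must start after the query ends.
    gap <- s_start - q_end
  } else {
    # 5' to 3' runs right to left: the subject must end before the query starts.
    gap <- q_start - s_end
  }
  # Exclude incompatible strands, overlapping ranges and ranges on the wrong side.
  gap[!compatible | gap <= 0] <- NA
  if (all(is.na(gap))) return(NA_integer_)
  which.min(gap)  # on ties, the first range by order is chosen
}

# Positions 5 and 6: 5 precedes 6 on "+", while on "-" it is 6 that precedes 5.
precede_sketch(5, 5, "+", 6, 6, "+")
precede_sketch(5, 5, "-", 6, 6, "-")
precede_sketch(6, 6, "-", 5, 5, "-")
```

The first and third calls return index 1 and the second returns NA, which mirrors the 'opposite' behavior described for the two strands.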
distanceToNearest: Returns the distance for each range in x to its nearest neighbor in the subject.
distance: Returns the distance for each range in x to the range in y. The behavior of distance has changed in Bioconductor 2.12. See the man page ?distance in IRanges for details.
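As a rough illustration of the distance described above, the sketch below computes the gap between paired ranges in base R. It assumes the gap-based definition (adjacent and overlapping ranges are at distance 0), and gap_distance_sketch is a hypothetical name; the sketch ignores sequence names, strand and the special handling of zero-width ranges, so ?distance in IRanges remains the authoritative description.

```r
# Hypothetical helper (not part of IRanges or GenomicRanges): number of
# positions separating paired ranges, with 0 for adjacent or overlapping pairs.
gap_distance_sketch <- function(x_start, x_end, y_start, y_end) {
  gap <- pmax(x_start, y_start) - pmin(x_end, y_end) - 1L
  pmax(gap, 0L)
}

# Adjacent, overlapping, and separated by one position.
query_start   <- c(1L, 2L, 10L)
query_end     <- c(5L, 8L, 11L)
subject_start <- c(6L, 5L, 13L)
subject_end   <- c(10L, 10L, 15L)

gap_distance_sketch(query_start, query_end, subject_start, subject_end)
# 0 0 1

# A single query range is recycled against every subject range.
gap_distance_sketch(1L, 5L, subject_start, subject_end)
```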
Value
For nearest, precede and follow, an integer vector of indices in subject, or a Hits if select="all".
For distanceToNearest, a Hits object with a column for the query index (queryHits), subject index (subjectHits) and the distance between the pair.
For distance, an integer vector of distances between the ranges in x and y.
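To make the shape of the distanceToNearest result concrete, here is a base R sketch that returns one row per query range with queryHits, subjectHits and distance columns. It is only an approximation under stated assumptions: distance_to_nearest_sketch is a hypothetical helper, it returns a plain data.frame and not a Hits object, it assumes all ranges share one sequence and strand, and it uses the gap-based distance (0 for adjacent or overlapping ranges).

```r
# Hypothetical helper (not part of GenomicRanges): for each query range, the
# index of the closest subject range and the gap between the two.
distance_to_nearest_sketch <- function(q_start, q_end, s_start, s_end) {
  rows <- lapply(seq_along(q_start), function(i) {
    # Gap from query i to every subject range; overlaps and adjacency give 0.
    d <- pmax(pmax(q_start[i], s_start) - pmin(q_end[i], s_end) - 1L, 0L)
    j <- which.min(d)  # first subject by order on ties
    data.frame(queryHits = i, subjectHits = j, distance = d[j])
  })
  do.call(rbind, rows)
}

nearest_pairs <- distance_to_nearest_sketch(
  q_start = c(1L, 20L), q_end = c(5L, 22L),
  s_start = c(6L, 13L), s_end = c(10L, 15L)
)
nearest_pairs
```

In this example the first query is adjacent to the first subject range (distance 0) and the second query lies four positions beyond the second subject range.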
Author(s)
P. Aboyoun and V. Obenchain <vobencha@fhcrc.org>
See Also
The GenomicRanges and GRanges classes.
The Ranges class in the IRanges package.
The Hits class in the S4Vectors package.
The nearest-methods man page in the IRanges package.
findOverlaps-methods for finding just the overlapping ranges.
The nearest-methods man page in the GenomicFeatures package.
Examples
library(GenomicRanges)

## -----------------------------------------------------------
## precede() and follow()
## -----------------------------------------------------------
query <- GRanges("A", IRanges(c(5, 20), width=1), strand="+")
subject <- GRanges("A", IRanges(rep(c(10, 15), 2), width=1),
                   strand=c("+", "+", "-", "-"))
precede(query, subject)
follow(query, subject)

strand(query) <- "-"
precede(query, subject)
follow(query, subject)

## ties choose first in order
query <- GRanges("A", IRanges(10, width=1), c("+", "-", "*"))
subject <- GRanges("A", IRanges(c(5, 5, 5, 15, 15, 15), width=1),
                   rep(c("+", "-", "*"), 2))
precede(query, subject)
precede(query, rev(subject))

## ignore.strand=TRUE treats all ranges as '+'
precede(query[1], subject[4:6], select="all", ignore.strand=FALSE)
precede(query[1], subject[4:6], select="all", ignore.strand=TRUE)

## -----------------------------------------------------------
## nearest()
## -----------------------------------------------------------
## When multiple ranges overlap an "arbitrary" range is chosen
query <- GRanges("A", IRanges(5, 15))
subject <- GRanges("A", IRanges(c(1, 15), c(5, 19)))
nearest(query, subject)

## select="all" returns all hits
nearest(query, subject, select="all")

## Ranges in 'x' will self-select when 'subject' is present
query <- GRanges("A", IRanges(c(1, 10), width=5))
nearest(query, query)

## Ranges in 'x' will not self-select when 'subject' is missing
nearest(query)

## -----------------------------------------------------------
## distance(), distanceToNearest()
## -----------------------------------------------------------
## Adjacent, overlap, separated by 1
query <- GRanges("A", IRanges(c(1, 2, 10), c(5, 8, 11)))
subject <- GRanges("A", IRanges(c(6, 5, 13), c(10, 10, 15)))
distance(query, subject)

## recycling
distance(query[1], subject)

## zero-width ranges
zw <- GRanges("A", IRanges(4,3))
stopifnot(distance(zw, GRanges("A", IRanges(3,4))) == 0L)
sapply(-3:3, function(i)
    distance(shift(zw, i), GRanges("A", IRanges(4,3))))

query <- GRanges(c("A", "B"), IRanges(c(1, 5), width=1))
distanceToNearest(query, subject)

## distance() with GRanges and TxDb see the
## ?'distance,GenomicRanges,TxDb-method' man
## page in the GenomicFeatures package.
