nearest-methods: Finding the nearest genomic tuple/range neighbour

Description

The nearest, precede, follow, distance and distanceToNearest methods for GTuples objects and subclasses.

NOTE: These methods treat the tuples as if they were ranges, with each range given by [pos_{1}, pos_{m}], where m is the size of the tuples. This is done via inheritance: a GTuples object is treated as a GRanges object and the appropriate GRanges method is dispatched.
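The tuple-to-range reduction described in the note can be sketched in base R. This is only an illustration and not the package's implementation: tuples_to_ranges is a hypothetical helper, and the tuples are assumed to be stored as an integer matrix with one row per tuple and columns pos_1, ..., pos_m.

```r
# Hypothetical helper (not part of GenomicTuples): collapse a matrix of
# tuple positions to the ranges [pos_1, pos_m] used by the nearest methods.
tuples_to_ranges <- function(pos) {
  stopifnot(is.matrix(pos), ncol(pos) >= 1L)
  # Only the first and last positions matter; internal positions are ignored.
  data.frame(start = pos[, 1L], end = pos[, ncol(pos)])
}

# Two 3-tuples: (1, 3, 6) and (10, 12, 15)
pos <- matrix(c(1L, 10L, 3L, 12L, 6L, 15L), ncol = 3)
ranges <- tuples_to_ranges(pos)
ranges
```

Under this view the internal positions (3 and 12 above) play no part in any of the nearest-neighbour computations.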

Usage

## S4 method for signature 'GTuples,GTuples'
precede(x, subject, select = c("arbitrary", "all"), 
        ignore.strand = FALSE, ...)
## S4 method for signature 'GTuples,missing'
precede(x, subject, select = c("arbitrary", "all"), 
        ignore.strand = FALSE, ...)

## S4 method for signature 'GTuples,GTuples'
follow(x, subject, select = c("arbitrary", "all"), 
       ignore.strand = FALSE, ...)
## S4 method for signature 'GTuples,missing'
follow(x, subject, select = c("arbitrary", "all"), 
       ignore.strand = FALSE, ...)

## S4 method for signature 'GTuples,GTuples'
nearest(x, subject, select = c("arbitrary", "all"), 
        ignore.strand = FALSE, ...)
## S4 method for signature 'GTuples,missing'
nearest(x, subject, select = c("arbitrary", "all"), 
        ignore.strand = FALSE, ...)

## S4 method for signature 'GTuples,GTuples'
distanceToNearest(x, subject, ignore.strand = FALSE, 
                  ...)
## S4 method for signature 'GTuples,missing'
distanceToNearest(x, subject, ignore.strand = FALSE, 
                  ...)

## S4 method for signature 'GTuples,GTuples'
distance(x, y, ignore.strand = FALSE, ...)

Arguments

x

The query GTuples instance.

subject

The subject GTuples instance within which the nearest neighbours are found. Can be missing, in which case x is also the subject.

y

For the distance method, a GTuples or GRanges instance. Cannot be missing. If x and y are not the same length, the shortest will be recycled to match the length of the longest.

select

Logic for handling ties. By default, all methods select a single tuple/range (arbitrary for nearest, the first by order in subject for precede, and the last for follow).

When select = "all", a Hits object is returned with all matches for x. If an element of x does not have a match in subject, that element is not included in the Hits object.

ignore.strand

A logical indicating if the strand of the input tuples/ranges should be ignored. When TRUE, strand is set to '+'.

...

Additional arguments for methods.
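The tie handling described for select can be sketched in base R for the simplest case. This is an assumption-laden illustration, not the GenomicRanges algorithm: precede_plus is a hypothetical helper that assumes every range is on the same sequence and on the "+" strand (as with ignore.strand = TRUE), and it returns a data.frame rather than a Hits object when select = "all".

```r
# Hypothetical helper: precede() logic when all ranges are on the "+" strand.
# q_end:   end positions of the query ranges
# s_start: start positions of the subject ranges
precede_plus <- function(q_end, s_start, select = c("arbitrary", "all")) {
  select <- match.arg(select)
  hits <- lapply(seq_along(q_end), function(i) {
    gap <- s_start - q_end[i]
    gap[gap <= 0L] <- NA            # the subject must start after the query ends
    if (all(is.na(gap))) return(integer(0))
    which(gap == min(gap, na.rm = TRUE))  # every subject tied for closest
  })
  if (select == "all") {
    # Queries without a match contribute no rows.
    data.frame(queryHits = rep(seq_along(hits), lengths(hits)),
               subjectHits = unlist(hits))
  } else {
    # Ties resolve to the first subject by order; no match gives NA.
    vapply(hits, function(h) if (length(h)) h[1L] else NA_integer_, integer(1))
  }
}

single <- precede_plus(q_end = c(6L, 21L), s_start = c(10L, 15L))
tied <- precede_plus(q_end = c(11L, 21L), s_start = c(15L, 15L, 15L),
                     select = "all")
```

Here single is c(1L, NA) because nothing starts after position 21, and tied has three rows, all for the first query; the second query is absent because it has no match.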

Details

Value

For nearest, precede and follow, an integer vector of indices in subject, or a Hits object if select = "all".

For distanceToNearest, a Hits object with a column for the query index (from), subject index (to) and the distance between the pair.

For distance, an integer vector of distances between the tuples/ranges in x and y.
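The distances reported in the examples are consistent with counting the positions that lie strictly between two ranges, so that adjacent and overlapping ranges are both at distance 0. A base-R sketch under that assumption follows; range_distance is a hypothetical helper that ignores sequence names and strand, which the real method takes into account.

```r
# Hypothetical helper: gap-based distance between ranges [x_start, x_end]
# and [y_start, y_end], recycling the shorter input to the longer length.
range_distance <- function(x_start, x_end, y_start, y_end) {
  n <- max(length(x_start), length(y_start))
  x_start <- rep_len(x_start, n)
  x_end   <- rep_len(x_end, n)
  y_start <- rep_len(y_start, n)
  y_end   <- rep_len(y_end, n)
  # Number of positions strictly between the two ranges; negative when they
  # overlap, so clamp at zero.
  gap <- pmax(x_start, y_start) - pmin(x_end, y_end) - 1L
  pmax(gap, 0L)
}

# Same coordinates as the distance() example: adjacent, overlap, separated by 1
d_pairwise <- range_distance(c(1L, 2L, 10L), c(5L, 8L, 11L),
                             c(6L, 5L, 13L), c(10L, 10L, 15L))
# Recycling a single query against all subjects
d_recycled <- range_distance(1L, 5L, c(6L, 5L, 13L), c(10L, 10L, 15L))
```

With these inputs d_pairwise is 0 0 1 and d_recycled is 0 0 7, matching the distance() results shown in the example output.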

Author(s)

Peter Hickey for methods involving GTuples. P. Aboyoun and V. Obenchain <vobencha@fhcrc.org> for all the real work underlying the powerful nearest methods.

See Also

Examples

  ## Load the package that provides GTuples() and the methods below
  library(GenomicTuples)

  ## -----------------------------------------------------------
  ## precede() and follow()
  ## -----------------------------------------------------------
  query <- GTuples("A", matrix(c(5L, 20L, 6L, 21L), ncol = 2), strand = "+")
  subject <- GTuples("A", matrix(c(rep(c(10L, 15L), 2), rep(c(11L, 16L), 2)), 
                                 ncol = 2),
                          strand = c("+", "+", "-", "-"))
  precede(query, subject)
  follow(query, subject)
 
  strand(query) <- "-"
  precede(query, subject)
  follow(query, subject)
 
  ## ties choose first in order
  query <- GTuples("A", matrix(c(10L, 11L), ncol = 2), c("+", "-", "*"))
  subject <- GTuples("A", matrix(c(rep(c(5L, 15L), each = 3), 
                                   rep(c(6L, 16L), each = 3)), ncol = 2),
                          rep(c("+", "-", "*"), 2))
  precede(query, subject)
  precede(query, rev(subject))
 
  ## ignore.strand = TRUE treats all ranges as '+'
  precede(query[1], subject[4:6], select = "all", ignore.strand = FALSE)
  precede(query[1], subject[4:6], select = "all", ignore.strand = TRUE)
  
  ## -----------------------------------------------------------
  ## nearest()
  ## -----------------------------------------------------------
  ## When multiple tuples overlap an "arbitrary" tuple is chosen
  query <- GTuples("A", matrix(c(5L, 15L), ncol = 2))
  subject <- GTuples("A", matrix(c(1L, 15L, 5L, 19L), ncol = 2))
  nearest(query, subject)
 
  ## select = "all" returns all hits
  nearest(query, subject, select = "all")
 
  ## Tuples in 'x' will self-select when 'subject' is present
  query <- GTuples("A", matrix(c(1L, 10L, 6L, 15L), ncol = 2))
  nearest(query, query)
 
  ## Tuples in 'x' will not self-select when 'subject' is missing
  nearest(query)
  
  ## -----------------------------------------------------------
  ## distance(), distanceToNearest()
  ## -----------------------------------------------------------
  ## Adjacent, overlap, separated by 1
  query <- GTuples("A", matrix(c(1L, 2L, 10L, 5L, 8L, 11L), ncol = 2))
  subject <- GTuples("A", matrix(c(6L, 5L, 13L, 10L, 10L, 15L), ncol = 2))
  distance(query, subject)

  ## recycling
  distance(query[1], subject)

  query <- GTuples(c("A", "B"), matrix(c(1L, 5L, 2L, 6L), ncol = 2))
  distanceToNearest(query, subject)

Example output

Loading required package: GenomicRanges
Loading required package: stats4
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from 'package:stats':

    IQR, mad, sd, var, xtabs

The following objects are masked from 'package:base':

    anyDuplicated, append, as.data.frame, basename, cbind, colnames,
    dirname, do.call, duplicated, eval, evalq, Filter, Find, get, grep,
    grepl, intersect, is.unsorted, lapply, Map, mapply, match, mget,
    order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank,
    rbind, Reduce, rownames, sapply, setdiff, sort, table, tapply,
    union, unique, unsplit, which.max, which.min

Loading required package: S4Vectors

Attaching package: 'S4Vectors'

The following object is masked from 'package:base':

    expand.grid

Loading required package: IRanges
Loading required package: GenomeInfoDb
[1]  1 NA
[1] NA  2
[1] NA  4
[1]  3 NA
[1] 4 2 2
[1] 1 4 1
Hits object with 2 hits and 0 metadata columns:
      queryHits subjectHits
      <integer>   <integer>
  [1]         1           1
  [2]         1           3
  -------
  queryLength: 1 / subjectLength: 3
Hits object with 3 hits and 0 metadata columns:
      queryHits subjectHits
      <integer>   <integer>
  [1]         1           1
  [2]         1           2
  [3]         1           3
  -------
  queryLength: 1 / subjectLength: 3
[1] 2
Hits object with 2 hits and 0 metadata columns:
      queryHits subjectHits
      <integer>   <integer>
  [1]         1           1
  [2]         1           2
  -------
  queryLength: 1 / subjectLength: 2
[1] 1 2
[1] 2 1
[1] 0 0 1
[1] 0 0 7
Hits object with 1 hit and 1 metadata column:
      queryHits subjectHits |  distance
      <integer>   <integer> | <integer>
  [1]         1           2 |         2
  -------
  queryLength: 2 / subjectLength: 3

GenomicTuples documentation built on Nov. 8, 2020, 6:43 p.m.