Set operations on genomic ranges

Description

Performs set operations on GRanges and GRangesList objects.

NOTE: The punion, pintersect, psetdiff, and pgap generic functions and methods for Ranges objects are defined and documented in the IRanges package.

Usage

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
## Vector-wise set operations
## --------------------------

## S4 method for signature 'GenomicRanges,GenomicRanges'
union(x, y, ignore.strand=FALSE)

## S4 method for signature 'GenomicRanges,GenomicRanges'
intersect(x, y, ignore.strand=FALSE)

## S4 method for signature 'GenomicRanges,GenomicRanges'
setdiff(x, y, ignore.strand=FALSE)

## Element-wise (aka "parallel") set operations
## --------------------------------------------

## S4 method for signature 'GRanges,GRanges'
punion(x, y, fill.gap=FALSE, ignore.strand=FALSE)

## S4 method for signature 'GRanges,GRanges'
pintersect(x, y, drop.nohit.ranges=FALSE,
           ignore.strand=FALSE, strict.strand=FALSE)

## S4 method for signature 'GRanges,GRanges'
psetdiff(x, y, ignore.strand=FALSE)

Arguments

x, y

For union, intersect, and setdiff: 2 GenomicRanges objects or 2 GRangesList objects.

For punion and pintersect: 2 GRanges objects, or 1 GRanges object and 1 GRangesList object.

For psetdiff: x must be a GRanges object and y can be a GRanges or GRangesList object.

For pgap: 2 GRanges objects.

In addition, for the parallel operations, x and y must be of equal length (i.e. length(x) == length(y)).

fill.gap

Logical indicating whether or not to force a union by using the rule start = min(start(x), start(y)), end = max(end(x), end(y)).

ignore.strand

For set operations: If set to TRUE, then the strand of x and y is set to "*" prior to any computation.

For parallel set operations: If set to TRUE, the strand information is ignored in the computation and the result has the strand information of x.

drop.nohit.ranges

If TRUE then elements in x that don't intersect with their corresponding element in y are removed from the result (so the returned object is no more parallel to the input).

If FALSE (the default) then nothing is removed and a hit metadata column is added to the returned object to indicate elements in x that intersect with the corresponding element in y. For those that don't, the reported intersection is a zero-width range that has the same start as x.

strict.strand

If set to FALSE (the default), features on the "*" strand are treated as occurring on both the "+" and "-" strand. If set to TRUE, the strand of intersecting elements must be strictly the same.

Details

The pintersect methods involving GRanges and/or GRangesList objects use the triplet (sequence name, range, strand) to determine the element by element intersection of features, where a strand value of "*" is treated as occurring on both the "+" and "-" strand (unless strict.strand is set to TRUE, in which case the strand of intersecting elements must be strictly the same).

The psetdiff methods involving GRanges and/or GRangesList objects use the triplet (sequence name, range, strand) to determine the element by element set difference of features, where a strand value of "*" is treated as occurring on both the "+" and "-" strand.

Value

For union, intersect, and setdiff: a GRanges object if x and y are GenomicRanges objects, and a GRangesList object if they are GRangesList objects.

For punion and pintersect: when x or y is not a GRanges object, an object of the same class as this non-GRanges object. Otherwise, a GRanges object.

For psetdiff: either a GRanges object when both x and y are GRanges objects, or a GRangesList object when y is a GRangesList object.

For pgap: a GRanges object.

Author(s)

P. Aboyoun and H. Pag├Ęs

See Also

  • setops-methods in the IRanges package for set operations on Ranges and RangesList objects.

  • findOverlaps-methods for finding/counting overlapping genomic ranges.

  • intra-range-methods and inter-range-methods for intra range and inter range transformations of a GRanges object.

  • GRanges and GRangesList objects.

  • mendoapply in the S4Vectors package.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
## ---------------------------------------------------------------------
## A. SET OPERATIONS
## ---------------------------------------------------------------------

x <- GRanges("chr1", IRanges(c(2, 9) , c(7, 19)), strand=c("+", "-"))
y <- GRanges("chr1", IRanges(5, 10), strand="-") 

union(x, y)
union(x, y, ignore.strand=TRUE)

intersect(x, y)
intersect(x, y, ignore.strand=TRUE)

setdiff(x, y)
setdiff(x, y, ignore.strand=TRUE)

## With 2 GRangesList objects:
gr1 <- GRanges(seqnames="chr2",
               ranges=IRanges(3, 6))
gr2 <- GRanges(seqnames=c("chr1", "chr1"),
               ranges=IRanges(c(7,13), width = 3),
               strand=c("+", "-"))
gr3 <- GRanges(seqnames=c("chr1", "chr2"),
               ranges=IRanges(c(1, 4), c(3, 9)),
               strand=c("-", "-"))
grlist <- GRangesList(gr1=gr1, gr2=gr2, gr3=gr3)

union(grlist, shift(grlist, 3))
intersect(grlist, shift(grlist, 3))
setdiff(grlist, shift(grlist, 3))

## Sanity checks:
grlist2 <- shift(grlist, 3)
stopifnot(identical(
    union(grlist, grlist2),
    mendoapply(union, grlist, grlist2)
))
stopifnot(identical(
    intersect(grlist, grlist2),
    mendoapply(intersect, grlist, grlist2)
))
stopifnot(identical(
    setdiff(grlist, grlist2),
    mendoapply(setdiff, grlist, grlist2)
))

## ---------------------------------------------------------------------
## B. PARALLEL SET OPERATIONS
## ---------------------------------------------------------------------

punion(x, shift(x, 6))
## Not run: 
punion(x, shift(x, 7))  # will fail

## End(Not run)
punion(x, shift(x, 7), fill.gap=TRUE)

pintersect(x, shift(x, 6))
pintersect(x, shift(x, 7))

psetdiff(x, shift(x, 7))

## ---------------------------------------------------------------------
## C. MORE EXAMPLES
## ---------------------------------------------------------------------

## GRanges object:
gr <- GRanges(seqnames=c("chr2", "chr1", "chr1"),
              ranges=IRanges(1:3, width = 12),
              strand=Rle(strand(c("-", "*", "-"))))

## Parallel intersection of a GRanges and a GRangesList object
pintersect(gr, grlist)
pintersect(grlist, gr)

## For a fast 'mendoapply(intersect, grlist, as(gr, "GRangesList"))'
## call pintersect() with 'strict.strand=TRUE' and call reduce() on
## the result with 'drop.empty.ranges=TRUE':
reduce(pintersect(grlist, gr, strict.strand=TRUE),
       drop.empty.ranges=TRUE)

## Parallel set difference of a GRanges and a GRangesList object
psetdiff(gr, grlist)