linkOverlaps: Link overlapping regions

linkOverlapsR Documentation

Link overlapping regions

Description

Identify interactions that link two sets of regions by having anchor regions overlapping one entry in each set.

Usage

## S4 method for signature 'GInteractions,Vector,Vector'
linkOverlaps(query, subject1, subject2, ..., 
    ignore.strand=TRUE, use.region="both")

Arguments

query

A GInteractions or InteractionSet object.

subject1, subject2

A Vector object defining a set of genomic intervals, such as a GRanges or GRangesList. subject2 may be missing. Alternatively, both subject1 and subject2 may be Hits objects, see below.

..., ignore.strand

Additional arguments to be passed to findOverlaps. Note that ignore.strand=TRUE by default, as genomic interactions are usually unstranded.

use.region

A string specifying which query regions should be used to overlap which subject. Ignored if subject2 is missing.

Details

This function identifies all interactions in query where one anchor overlaps an entry in subject1 and the other anchor overlaps an entry in subject2. It is designed to be used to identify regions that are linked by interactions in query. For example, one might specify genes as subject1 and enhancers as subject2, to identify all gene-enhancer contacts present in query. This is useful when the exact pairings between subject1 and subject2 are undefined.

The function returns a DataFrame specifying the index of the interaction in query; the index of the overlapped region in subject1; and the index of the overlapped region in subject2. If multiple regions in subject1 and/or subject2 are overlapping the anchor regions of a particular interaction, all combinations of two overlapping regions (one from each subject* set) are reported for that interaction.

By default, use.region="both" such that overlaps will be considered between any first/second interacting region in query and either subject1 or subject2. If use.region="same", overlaps will only be considered between the first interacting region in query and entries in subject1, and between the second interacting region and subject2. The opposite applies with use.region="reverse", where the first and second interacting regions are overlapped with subject2 and subject1 respectively.

If subject2 is not specified, links within subject1 are identified instead, i.e., subject2 is set to subject1. In such cases, the returned DataFrame is such that the first subject index is always greater than the second subject index, to avoid redundant permutations.

Value

A DataFrame of integer indices indicating which elements of query link which elements of subject1 and subject2.

Using Hits as input

Hits objects can be used for the subject1 and subject2 arguments. These should be constructed using findOverlaps with regions(query) as the query and the genomic regions of interest as the subject. For example, the calls below:

> linkOverlaps(query, subject1) # 1
> linkOverlaps(query, subject1, subject2) # 2

will yield exactly the same output as:

> olap1 <- findOverlaps(regions(query), subject1)
> linkOverlaps(query, olap1) # 1
> olap2 <- findOverlaps(regions(query), subject2)
> linkOverlaps(query, olap1, olap2) # 2

This is useful in situations where regions(query) and the genomic regions in subject1 and subject2 are constant across multiple calls to linkOverlaps. In such cases, the overlaps only need to be calculated once, avoiding redundant work within linkOverlaps.

Author(s)

Aaron Lun

See Also

findOverlaps,GInteractions,Vector-method

Examples

example(GInteractions, echo=FALSE)

all.genes <- GRanges("chrA", IRanges(0:9*10, 1:10*10))
all.enhancers <- GRanges("chrB", IRanges(0:9*10, 1:10*10))

out <- linkOverlaps(gi, all.genes, all.enhancers)
head(out)

out <- linkOverlaps(gi, all.genes)
head(out)

# Same methods apply for InteractionSet objects.

example(InteractionSet, echo=FALSE)
out <- linkOverlaps(iset, all.genes, all.enhancers)
out <- linkOverlaps(iset, all.genes)

LTLA/InteractionSet documentation built on July 3, 2023, 8:44 a.m.