findSpliceOverlaps-methods: Classify ranges (reads) as compatible with existing genomic...
In Bioconductor/GenomicAlignments: Representation and manipulation of short genomic alignments

findSpliceOverlaps-methods

R Documentation

Classify ranges (reads) as compatible with existing genomic annotations or as having novel splice events

Description

The findSpliceOverlaps function identifies ranges (reads) that are compatible with a specific transcript isoform. The non-compatible ranges are analyzed for the presence of novel splice events.

Usage

findSpliceOverlaps(query, subject, ignore.strand=FALSE, ...)

## S4 method for signature 'GRangesList,GRangesList'
findSpliceOverlaps(query, subject, ignore.strand=FALSE, ..., cds=NULL)

## S4 method for signature 'GAlignments,GRangesList'
findSpliceOverlaps(query, subject, ignore.strand=FALSE, ..., cds=NULL)

## S4 method for signature 'GAlignmentPairs,GRangesList'
findSpliceOverlaps(query, subject, ignore.strand=FALSE, ..., cds=NULL)

## S4 method for signature 'BamFile,ANY'
findSpliceOverlaps(query, subject, ignore.strand=FALSE, ...,
                   param=ScanBamParam(), singleEnd=TRUE)

Arguments

`query`	A GRangesList, GAlignments, GAlignmentPairs, or BamFile object containing the reads. Can also be a single string containing the path to a BAM file. Single or paired-end reads are specified with the `singleEnd` argument (default FALSE). Paired-end reads can be supplied in a BAM file or GAlignmentPairs object. Single-end are expected to be in a BAM file, GAlignments or GRanges object.
`subject`	A GRangesList containing the annotations. This list is expected to contain exons grouped by transcripts.
`ignore.strand`	When set to `TRUE`, strand information is ignored in the overlap calculations.
`...`	Additional arguments such as `param` and `singleEnd` used in the method for BamFile objects. See below.
`cds`	Optional GRangesList of coding regions for each transcript in the `subject`. If provided, the "coding" output column will be a `logical` vector indicating if the read falls in a coding region. When not provided, the "coding" output is `NA`.
`param`	An optional `ScanBamParam` instance to further influence scanning, counting, or filtering.
`singleEnd`	A logical value indicating if reads are single or paired-end. See `summarizeOverlaps` for more information.

Details

When a read maps compatibly and uniquely to a transcript isoform we can quantify the expression and look for shifts in the balance of isoform expression. If a read does not map in compatible way, novel splice events such as splice junctions, novel exons or retentions can be quantified and compared across samples.

findSpliceOverlaps detects which reads (query) match to transcripts (subject) in a compatible fashion. Compatibility is based on both the transcript bounds and splicing pattern. Assessing the splicing pattern involves comparision of the read splices (i.e., the N operations in the CIGAR) with the transcript introns. For paired-end reads, the inter-read gap is not considered a splice junction. The analysis of non-compatible reads for novel splice events is under construction.

Value

The output is a Hits object with the metadata columns defined below. Each column is a logical indicating if the read (query) met the criteria.

compatible: Every splice (N) in a read alignment matches an intron in an annotated transcript. The read does not extend into an intron or outside the transcript bounds.
unique: The read is compatible with only one annotated transcript.
strandSpecific: The query (read) was stranded.

Note

WARNING: The current implementation of findSpliceOverlaps doesn't work properly on paired-end reads where the 2 ends overlap!

Author(s)

Michael Lawrence and Valerie Obenchain

Examples

## ----------------------------------------------------------------------- 
## Isoform expression :
## ----------------------------------------------------------------------- 
## findSpliceOverlaps() can assist in quantifying isoform expression
## by identifying reads that map compatibly and uniquely to a 
## transcript isoform.
library(TxDb.Dmelanogaster.UCSC.dm3.ensGene)
library(pasillaBamSubset)
se <- untreated1_chr4() ## single-end reads
txdb <- TxDb.Dmelanogaster.UCSC.dm3.ensGene
exbytx <- exonsBy(txdb, "tx")
cdsbytx <- cdsBy(txdb, "tx")
param <- ScanBamParam(which=GRanges("chr4", IRanges(1e5,3e5)))
sehits <- findSpliceOverlaps(se, exbytx, cds=cdsbytx, param=param)

## Tally the reads by category to get an idea of read distribution.
lst <- lapply(mcols(sehits), table)
nms <- names(lst) 
tbl <- do.call(rbind, lst[nms]) 
tbl

## Reads compatible with one or more transcript isoforms.
rnms <- rownames(tbl)
tbl[rnms == "compatible","TRUE"]/sum(tbl[rnms == "compatible",])

## Reads compatible with a single isoform. 
tbl[rnms == "unique","TRUE"]/sum(tbl[rnms == "unique",])

## All reads fall in a coding region as defined by 
## the txdb annotation.
lst[["coding"]] 

## Check : Total number of reads should be the same across categories.
lapply(lst, sum)
 
## ----------------------------------------------------------------------- 
## Paired-end reads :
## ----------------------------------------------------------------------- 
## 'singleEnd' is set to FALSE for a BAM file with paired-end reads.
pe <- untreated3_chr4()
hits2 <- findSpliceOverlaps(pe, exbytx, singleEnd=FALSE, param=param)
 
## In addition to BAM files, paired-end reads can be supplied in a 
## GAlignmentPairs object. 
genes <- GRangesList(
    GRanges("chr1", IRanges(c(5, 20), c(10, 25)), "+"),
    GRanges("chr1", IRanges(c(5, 22), c(15, 25)), "+"))
galp <- GAlignmentPairs(
    GAlignments("chr1", 5, "11M4N6M", strand("+")),
    GAlignments("chr1", 50, "6M", strand("-")))
findSpliceOverlaps(galp, genes)

Bioconductor/GenomicAlignments documentation built on June 14, 2025, 5:42 p.m.

Bioconductor/GenomicAlignments index

README.md

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

Bioconductor/GenomicAlignments
Representation and manipulation of short genomic alignments

findSpliceOverlaps-methods: Classify ranges (reads) as compatible with existing genomic...
In Bioconductor/GenomicAlignments: Representation and manipulation of short genomic alignments

Classify ranges (reads) as compatible with existing genomic annotations or as having novel splice events

Description

Usage

Arguments

Details

Value

Note

Author(s)

See Also

Examples

Related to findSpliceOverlaps-methods in Bioconductor/GenomicAlignments...

R Package Documentation

Browse R Packages

We want your feedback!

Bioconductor/GenomicAlignments Representation and manipulation of short genomic alignments

findSpliceOverlaps-methods: Classify ranges (reads) as compatible with existing genomic... In Bioconductor/GenomicAlignments: Representation and manipulation of short genomic alignments

Classify ranges (reads) as compatible with existing genomic annotations or as having novel splice events

Description

Usage

Arguments

Details

Value

Note

Author(s)

See Also

Examples

Related to findSpliceOverlaps-methods in Bioconductor/GenomicAlignments...

R Package Documentation

Browse R Packages

We want your feedback!

Bioconductor/GenomicAlignments
Representation and manipulation of short genomic alignments

findSpliceOverlaps-methods: Classify ranges (reads) as compatible with existing genomic...
In Bioconductor/GenomicAlignments: Representation and manipulation of short genomic alignments