Split very long genes

Description

This function splits genes that span a very long range (e.g. 1 Mb). Each isoform of such a gene becomes its own "gene", named with the suffix "_ls" and a number. Turning each isoform into its own gene only makes sense if this function is followed by mergeGenes.
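The renaming idea can be sketched in base R on a plain data.frame. This is only an illustration of the description above, not the function's actual implementation: the columns start, end and NEW_GENEID, and the exact ID format "<gene>_ls<k>", are assumptions made for the sketch.

```r
# Toy transcript table: one row per transcript with its overall span.
# (start/end are hypothetical columns, not part of the real txdf.)
tx <- data.frame(
  GENEID = c("101", "101", "102"),
  TXID   = c("201", "202", "203"),
  start  = c(100, 2000100, 3000100),
  end    = c(249, 2000249, 3000249),
  stringsAsFactors = FALSE
)
long <- 1e6  # same threshold as the default in Usage

# Width of each gene's range: smallest start to largest end over its transcripts.
gene_start <- tapply(tx$start, tx$GENEID, min)
gene_end   <- tapply(tx$end,   tx$GENEID, max)
gene_width <- gene_end - gene_start + 1
long_genes <- names(gene_width)[gene_width > long]

# Give each isoform of a long gene its own gene ID (assumed format "<gene>_ls<k>").
is_long <- tx$GENEID %in% long_genes
k <- ave(seq_len(nrow(tx)), tx$GENEID, FUN = seq_along)  # isoform index within gene
tx$NEW_GENEID <- ifelse(is_long, paste0(tx$GENEID, "_ls", k), tx$GENEID)

tx[, c("GENEID", "TXID", "NEW_GENEID")]
```

In this toy table only gene 101 exceeds the threshold, so its two isoforms receive separate IDs while gene 102 keeps its original ID.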

Usage

splitLongGenes(ebg, ebt, txdf, long = 1e+06)

Arguments

ebg

an exons-by-genes GRangesList, created with exonsBy (by = "gene")

ebt

an exons-by-transcript GRangesList, created with exonsBy (by = "tx")

txdf

a data.frame created by running select on a TxDb object. Must have columns GENEID and TXID, where TXID corresponds to the names of ebt.

long

a numeric value such that genes whose range is longer than this are considered "long" (default 1e+06, i.e. 1 Mb)

Value

a list containing the modified ebg and txdf

Examples

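library(GenomicRanges)
# transcript-to-gene table: gene 101 has two isoforms, gene 102 has one
txdf <- data.frame(GENEID=c("101","101","102"),
                   TXID=c("201","202","203"))
# exons by transcript; transcript 202 lies about 2 Mb away from transcript 201
ebt <- GRangesList(GRanges("1",IRanges(c(100,200),width=50)),
                   GRanges("1",IRanges(2e6 + c(100,200),width=50)),
                   GRanges("1",IRanges(3e6 + c(100,200),width=50)))
names(ebt) <- c("201","202","203")
# exons by gene; gene 101 spans about 2 Mb, more than the default long = 1e+06
ebg <- GRangesList(reduce(unlist(ebt[1:2])),ebt[[3]])
names(ebg) <- c("101","102")
splitLongGenes(ebg, ebt, txdf)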