Annotate clusters with respect to transcript features

Description

Carries out strand-specific annotation of clusters with respect to distinct transcript features, in particular introns, coding sequences, 3'-UTRs and 5'-UTRs. Clusters mapping to more than one feature, and clusters mapping outside all of these features, are reported as such. Clusters that remain unmapped are then further analyzed and annotated with respect to features located on the antisense strand. Results can be plotted as a dotchart, and annotations are returned as cluster metadata.
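
The two-pass labelling described above can be illustrated with a small base R sketch. It is not the package's implementation: the function `labelClusters` and the logical overlap matrices are hypothetical, and labelling a cluster with several antisense hits as "Multiple" is an assumption made here for simplicity.

```r
# Hypothetical sketch of the two-pass labelling; not the wavClusteR source.
features <- c("CDS", "Introns", "3' UTR", "5' UTR")

# One row per cluster, one column per feature; TRUE marks an overlap.
# `sense` holds same-strand overlaps, `antisense` opposite-strand overlaps.
labelClusters <- function(sense, antisense) {
  stopifnot(identical(dim(sense), dim(antisense)),
            ncol(sense) == length(features))
  labelOne <- function(hits, suffix) {
    n <- sum(hits)
    if (n == 0) return(NA_character_)   # no overlap on this strand
    if (n > 1) return("Multiple")       # assumption: also used for antisense
    paste(features[hits], suffix)
  }
  vapply(seq_len(nrow(sense)), function(i) {
    lab <- labelOne(sense[i, ], "ss")                      # first pass: sense
    if (is.na(lab)) lab <- labelOne(antisense[i, ], "as")  # second pass
    if (is.na(lab)) "Other" else lab
  }, character(1))
}

# Four toy clusters: CDS only, CDS + 3' UTR, antisense intron, no overlap
sense <- rbind(c(TRUE,  FALSE, FALSE, FALSE),
               c(TRUE,  FALSE, TRUE,  FALSE),
               c(FALSE, FALSE, FALSE, FALSE),
               c(FALSE, FALSE, FALSE, FALSE))
antisense <- rbind(c(FALSE, FALSE, FALSE, FALSE),
                   c(FALSE, FALSE, FALSE, FALSE),
                   c(FALSE, TRUE,  FALSE, FALSE),
                   c(FALSE, FALSE, FALSE, FALSE))
labels <- labelClusters(sense, antisense)
print(labels)
```

The antisense pass is only consulted for clusters without any sense-strand hit, which mirrors the order given in the description.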

Usage

annotateClusters(clusters, txDB = NULL, genome = "hg19", tablename = "ensGene",
                 plot = TRUE, verbose = TRUE)

Arguments

clusters

GRanges object containing individual clusters as identified by the getClusters function

txDB

TranscriptDb object obtained through a call to the makeTxDbFromUCSC function in the GenomicFeatures package. Default is NULL, in which case the object is fetched internally

genome

A character string specifying the genome abbreviation used by UCSC. Available abbreviations are returned by a call to ucscGenomes()[ , "db"]. Default is "hg19" (human genome)

tablename

A character string specifying the name of the UCSC table containing the transcript annotations to retrieve. Available table names are returned by a call to supportedUCSCtables(). Default is "ensGene", namely Ensembl gene annotations

plot

Logical, if TRUE a dotchart with cluster annotations is produced

verbose

Logical, if TRUE processing steps are printed
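
Because txDB = NULL makes the function fetch the annotation internally, it may be convenient to build the annotation object once and pass it in when annotating several cluster sets. The wrapper below is a minimal sketch under stated assumptions: `annotateWithCachedTxDb` is a hypothetical helper, not part of wavClusteR, and it needs network access to UCSC when called. In recent Bioconductor releases makeTxDbFromUCSC and supportedUCSCtables may live in the txdbmaker package rather than GenomicFeatures.

```r
# Hypothetical helper: build the transcript annotation once, then reuse it.
# Valid inputs can be looked up beforehand with
#   rtracklayer::ucscGenomes()[, "db"]        # for `genome`
#   GenomicFeatures::supportedUCSCtables()    # for `tablename`
annotateWithCachedTxDb <- function(clusterList,
                                   genome = "hg19",
                                   tablename = "ensGene") {
  # Single download of the annotation (requires network access)
  txdb <- GenomicFeatures::makeTxDbFromUCSC(genome = genome,
                                            tablename = tablename)
  # Annotate every GRanges object in the list with the same txDB
  lapply(clusterList, function(cl) {
    wavClusteR::annotateClusters(clusters = cl, txDB = txdb,
                                 genome = genome, tablename = tablename,
                                 plot = FALSE, verbose = FALSE)
  })
}
```

A call such as annotateWithCachedTxDb(list(fclusters)) would then return a list of annotated GRanges objects, where fclusters is assumed to come from filterClusters as in the Examples.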

Value

Same as the input GRanges object, with an additional metadata column containing the following character encoding of the genomic feature each cluster maps to:

"CDS ss"

Coding Sequence Sense Strand

"Introns ss"

Intron Sense Strand

"3' UTR ss"

3' UTR Sense Strand

"5' UTR ss"

5' UTR Sense Strand

"Multiple"

More than one of the above

"CDS as"

Coding Sequence Antisense Strand

"Introns as"

Intron Antisense Strand

"3' UTR as"

3' UTR Antisense Strand

"5' UTR as"

5' UTR Antisense Strand

"Other"

None of the above

If plot=TRUE, a dotchart is produced in addition.
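
To summarize the returned annotation outside of the built-in plot, the labels can be tabulated with base R. The sketch below uses a made-up character vector in place of the metadata column (its actual name is not stated here, so none is assumed), and the percentage dotchart is only comparable in spirit to the one drawn when plot=TRUE.

```r
# Made-up labels standing in for the annotation metadata column
annotation <- c("CDS ss", "3' UTR ss", "3' UTR ss", "Introns ss",
                "Multiple", "Introns as", "Other", "3' UTR ss")

# Fix the category order so absent categories still appear with count 0
categories <- c("CDS ss", "Introns ss", "3' UTR ss", "5' UTR ss", "Multiple",
                "CDS as", "Introns as", "3' UTR as", "5' UTR as", "Other")
counts <- table(factor(annotation, levels = categories))
percent <- round(100 * counts / sum(counts), 1)

# One dot per category, showing the share of clusters it received
dotchart(as.numeric(percent), labels = names(percent),
         xlab = "Clusters (%)", pch = 19)
```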

Author(s)

Federico Comoglio

References

M. Carlson and H. Pages and P. Aboyoun and S. Falcon and M. Morgan and D. Sarkar and M. Lawrence, GenomicFeatures: Tools for making and manipulating transcript centric annotations, R package version 1.12.4

Comoglio F, Sievers C and Paro R (2015) Sensitive and highly resolved identification of RNA-protein interaction sites in PAR-CLIP data, BMC Bioinformatics 16, 32.

See Also

getClusters

Examples

require(BSgenome.Hsapiens.UCSC.hg19)

data( model, package = "wavClusteR" ) 

filename <- system.file( "extdata", "example.bam", package = "wavClusteR" )
example <- readSortedBam( filename = filename )
countTable <- getAllSub( example, minCov = 10, cores = 1 )
highConfSub <- getHighConfSub( countTable, supportStart = 0.2, supportEnd = 0.7, substitution = "TC" )
coverage <- coverage( example )
clusters <- getClusters( highConfSub = highConfSub,
                         coverage = coverage,
                         sortedBam = example,
                         method = 'mrn',
                         cores = 1,
                         threshold = 2 )

fclusters <- filterClusters( clusters = clusters,
                             highConfSub = highConfSub,
                             coverage = coverage,
                             model = model,
                             genome = Hsapiens,
                             refBase = 'T',
                             minWidth = 12 )
## Not run: fclusters <- annotateClusters( clusters = fclusters )
