gene: Annotate genomic positions with gene transcript information


Description

Annotates genomic positions (chromosome and position) with information about nearby transcripts from a UCSC genome browser table such as GENCODE.

Usage

gene.annotate(chr, pos, genetable, win.size = 10000)
gene.nearest(chr, pos, genetable)
gene.draw(chr, leftpos, rightpos,
          genetable, nodraw = NULL, 
          genesep = 10000, hlines.min = NULL, yhi = -1, ylo = -5,
          exony = 0.05, genecex = 1)

Arguments

chr

A vector of chromosome numbers

pos

A vector of chromosomal base pair positions (one-based)

genetable

A data frame of a UCSC genome browser table

win.size

Size of the window, in base pairs, around each position within which transcripts are included

leftpos

Leftmost position to draw transcripts for

rightpos

Rightmost position to draw transcripts for

nodraw

A list of gene names not to draw

genesep

Desired space between genes drawn on the same horizontal line

hlines.min

Minimum number of horizontal lines to use

yhi

Y coordinate of highest line

ylo

Y coordinate of lowest line

exony

Y coordinate height for exons

genecex

cex parameter for gene names

Details

genetable could be the refGene or GENCODE table from UCSC. It must have columns “chrom”, “txStart”, “txEnd” and “name2”, and for gene.draw it must also have columns “exonStarts” and “exonEnds”. Note that all positions in genetable are assumed to be zero-based, whereas the function argument pos is assumed to be one-based.
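The column requirements and the coordinate conventions can be illustrated with a small sketch. The toy table and the helper `inside()` below are hypothetical and are not part of the package; the sketch also assumes the usual UCSC convention that intervals are zero-based and half-open, which may differ in detail from how the package functions treat the interval ends.

```r
## Toy table with the required columns (values are made up for illustration)
toy <- data.frame(chrom   = c("chr2", "chr2"),
                  txStart = c(100L, 500L),
                  txEnd   = c(200L, 800L),
                  name2   = c("GENEA", "GENEB"),
                  stringsAsFactors = FALSE)

## Check that the columns needed by gene.annotate and gene.nearest are present
required <- c("chrom", "txStart", "txEnd", "name2")
stopifnot(all(required %in% names(toy)))

## Hypothetical helper: is the one-based position pos inside the
## zero-based, half-open interval [txStart, txEnd)?  (assumed convention)
inside <- function(pos, txStart, txEnd) {
  pos0 <- pos - 1L  # convert one-based to zero-based
  pos0 >= txStart & pos0 < txEnd
}

inside(101L, toy$txStart, toy$txEnd)  # first base of GENEA
inside(100L, toy$txStart, toy$txEnd)  # one base upstream of GENEA
```

Under these assumptions, a one-based position of 101 corresponds to the first base of a transcript whose txStart is 100.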

The algorithm for gene.nearest is peculiar and may change in future versions. If the queried position is inside one or more transcripts, their names are concatenated (comma separated). Otherwise, the names of the nearest transcripts to the left and right are concatenated (hyphen separated). This may produce curious output for transcript names that themselves contain a hyphen.
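The rule described above can be sketched for a single position on a single chromosome. The function `nearest_sketch()` is a hypothetical illustration written from the description, not the package's implementation; in particular, the half-open interval test and the use of an empty string when there is no neighbour on one side are assumptions.

```r
## Hypothetical sketch of the gene.nearest rule for one query position.
## txStart/txEnd are zero-based; pos is one-based.
nearest_sketch <- function(pos, txStart, txEnd, name2) {
  pos0 <- pos - 1L
  hit <- pos0 >= txStart & pos0 < txEnd
  if (any(hit)) {
    ## Inside one or more transcripts: comma-separated names
    return(paste(unique(name2[hit]), collapse = ","))
  }
  left  <- which(txEnd <= pos0)   # transcripts ending to the left
  right <- which(txStart > pos0)  # transcripts starting to the right
  ## Assumption: use "" when there is no transcript on that side
  leftname  <- if (length(left) > 0) name2[left[which.max(txEnd[left])]] else ""
  rightname <- if (length(right) > 0) name2[right[which.min(txStart[right])]] else ""
  ## Between transcripts: hyphen-separated left and right names
  paste(leftname, rightname, sep = "-")
}

txStart <- c(100, 500)
txEnd   <- c(200, 800)
name2   <- c("GENEA", "GENEB-AS1")  # made-up names

nearest_sketch(150, txStart, txEnd, name2)  # inside GENEA
nearest_sketch(300, txStart, txEnd, name2)  # between the two transcripts
```

The second call returns "GENEA-GENEB-AS1", which shows why a name that itself contains a hyphen makes the output ambiguous.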

gene.draw draws representations of all genes in the specified region. For each gene (determined by unique values of “name2”), all transcripts are drawn in an overlapping style, with yellow blocks for exons and black lines for introns, and “name2” written below. Different genes are distributed over a number of lines, chosen adaptively to avoid horizontal overlap. In large or gene-dense regions, many lines may be needed, and the gene names may then overlap vertically with the genes drawn on neighbouring lines.
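One way to distribute genes over lines without horizontal overlap is a greedy assignment, sketched below. This is an illustrative stand-in and not necessarily the method gene.draw uses; `assign_lines()` is a hypothetical function, and its start and end arguments are assumed to be per-gene extents (for example the smallest txStart and largest txEnd over the transcripts sharing a “name2”).

```r
## Hypothetical greedy sketch: place each gene on the first line where it
## starts at least genesep after the gene previously placed on that line.
assign_lines <- function(start, end, genesep = 10000) {
  ord <- order(start)
  line <- integer(length(start))
  lastend <- numeric(0)  # rightmost end already placed on each line
  for (i in ord) {
    ok <- which(start[i] - lastend >= genesep)
    if (length(ok) > 0) {
      k <- ok[1]                  # reuse the first line with enough room
    } else {
      k <- length(lastend) + 1L   # otherwise open a new line
    }
    lastend[k] <- end[i]
    line[i] <- k
  }
  line
}

## Three made-up genes: the second overlaps the first, the third does not
assign_lines(start = c(0, 5000, 30000),
             end   = c(10000, 20000, 40000),
             genesep = 10000)
```

In this sketch a larger genesep tends to push genes onto more lines, which is consistent with the vertical crowding described above for gene-dense regions.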

Value

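gene.annotate and gene.nearest return character strings. gene.draw is called for its side effect of drawing on the current plot.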

Author(s)

Toby Johnson Toby.x.Johnson@gsk.com

Examples

## Not run: 
## genome wide annotation 
genetable <- read.table(gzfile("path/to/UCSC-genome-tables/hg19.GENCODE14.gz"),
                          header = TRUE, comment.char = "", sep = "\t", as.is = TRUE)

## End(Not run)

data(gencode14.UGT1A1)
leftrightpos <- c(234568893, 234781945)
plot.new()
plot.window(leftrightpos, c(-5, 5))
abline(h = 0, col = "grey")
gene.draw(2, leftrightpos[1], leftrightpos[2], gencode14.UGT1A1, genesep = 5000)
axis(1, at = pretty(leftrightpos * 1e-6)*1e6, labels = pretty(leftrightpos * 1e-6))
title(xlab = "chr2 genomic position (Mb)")
axis(2, at = 0:5, las = 1)
box()

querypos <- c(234680000, 234685000)
points(querypos, rep(0, 2), pch = 23)
gene.nearest(c("chr2", "chr2"), querypos, gencode14.UGT1A1)

gene.annotate("chr2", querypos[1], gencode14.UGT1A1) # genes within 10kb
gene.annotate("chr2", querypos[1], gencode14.UGT1A1, win.size = 0) # genes within 0kb
gene.annotate(2, querypos[1], gencode14.UGT1A1, win.size = 0) # same

tobyjohnson/gtx documentation built on Aug. 30, 2019, 8:07 p.m.