topologyOnTranscripts: Extracting the topology on transcripts


View source: R/topologyOnTranscripts.R

Description

A function to extract meta-transcript topologies of the input GRanges object on a given transcript annotation.

Usage

topologyOnTranscripts(
  x,
  txdb,
  region_weights = c(1/3, 1/3, 1/3),
  ambiguityMethod = c("mean", "sum", "min", "max"),
  ignore.strand = FALSE
)

Arguments

x

A GRanges object for the genomic ranges to be annotated.

txdb

A TxDb or EnsDb object for the transcript annotation.

region_weights

A numeric vector of length 3 indicating the weights for 5'UTR, CDS, and 3'UTR; default is c(1/3,1/3,1/3).

ambiguityMethod

How to summarize an element of x that has more than one mapping. One of "mean" (default), "sum", "min", or "max"; the mean, sum, minimum, or maximum, respectively, of the values from the multiple mappings is returned for that element.

ignore.strand

When set to TRUE, the strand information is ignored in the overlap calculations.
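
The effect of ambiguityMethod can be illustrated with a minimal sketch in base R. This is not the package's implementation: the helper resolve_ambiguity(), the vectors topology_values and query_id, and the numbers in them are hypothetical and exist only to show how several values for one query range collapse into a single value.

```r
## Hypothetical topology values for three query ranges: the first maps once,
## the second twice, and the third three times. query_id records which query
## range each value belongs to.
topology_values <- c(0.20, 0.40, 0.60, 0.10, 0.50, 0.90)
query_id        <- c(1, 2, 2, 3, 3, 3)

## Hypothetical helper (not part of WhistleR): collapse the values of each
## query range with the chosen summary function.
resolve_ambiguity <- function(values, id,
                              method = c("mean", "sum", "min", "max")) {
  method <- match.arg(method)  # defaults to "mean", the first choice
  as.vector(tapply(values, id, match.fun(method)))
}

resolve_ambiguity(topology_values, query_id)          # one mean per query range
resolve_ambiguity(topology_values, query_id, "max")   # one maximum per query range
```

Whichever method is chosen, the result has one value per query range, which matches the documented return value of a numeric vector with the same length as x.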

Details

The meta-tx topology is the weighted sum of the relative positions of x on the exonic 5'UTR, CDS, and 3'UTR: topology = sum(c(rel_pos_5UTR, rel_pos_CDS, rel_pos_3UTR) * region_weights). For a region that x does not map to, rel_pos is set to 1 if x overlaps a region downstream of it and to 0 if x overlaps a region upstream of it. The topology value is NA for any x that maps to none of the 5'UTR, CDS, or 3'UTR.
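
The weighted sum can be worked through by hand with a small sketch. The helper meta_topology() and the relative positions below are hypothetical illustrations of the formula, not the package's internal code.

```r
## Hypothetical helper (not part of WhistleR): weighted sum of the relative
## positions on the 5'UTR, CDS, and 3'UTR, in that order.
meta_topology <- function(rel_pos, region_weights = c(1/3, 1/3, 1/3)) {
  stopifnot(length(rel_pos) == 3, length(region_weights) == 3)
  sum(rel_pos * region_weights)
}

## A site 40% of the way along the CDS. It lies downstream of the 5'UTR
## (rel_pos_5UTR = 1) and upstream of the 3'UTR (rel_pos_3UTR = 0).
cds_site <- meta_topology(c(1, 0.4, 0))

## A site at the midpoint of the 3'UTR. It lies downstream of both the
## 5'UTR and the CDS, so both of those relative positions are 1.
utr3_site <- meta_topology(c(1, 1, 0.5))

cds_site   # (1 + 0.4 + 0) / 3
utr3_site  # (1 + 1 + 0.5) / 3
```

With the default equal weights, sites in the 5'UTR fall between 0 and 1/3, sites in the CDS between 1/3 and 2/3, and sites in the 3'UTR between 2/3 and 1.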

Value

A numeric vector with the same length as x.

Examples

## Load the package and the TxDb object
library(WhistleR)
library(TxDb.Hsapiens.UCSC.hg19.knownGene)
txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene

## Define the query GRanges
set.seed(01)

query_gr <- GRanges(rep(c("chr1", "chr2"), c(5, 15)),
                IRanges(c(sample(11874:12127, 5), sample(38814:41527, 15)), width=1),
                strand=Rle(c("+", "-"), c(5, 15)))
                
## Extract the meta-tx topology values
topologyOnTranscripts(query_gr, txdb)

## Visualize the logistic regression curve for binary classification of an m6A miCLIP dataset

GSE63753_sysy <- readRDS(system.file("extdata", "GSE63753_sysy.rds", package = "WhistleR"))

GSE63753_sysy$topology <- topologyOnTranscripts(GSE63753_sysy, txdb)

library(ggplot2)

ggplot(na.omit(as.data.frame(mcols(GSE63753_sysy))), aes(topology, target)) +
  geom_smooth(formula = y ~ splines::ns(x, 8), method = "glm", method.args = list(family = "binomial")) +
  geom_vline(xintercept = c(0.33, 0.66), linetype = 2) +
  geom_hline(yintercept = 0.5, linetype = 3) +
  geom_text(aes(x=x,y=y,label=text),
            data = data.frame(
               x = c(0.165, 0.495, 0.825),
               y = c(0.1, 0.1, 0.1),
               text = c("5'UTR","CDS", "3'UTR")
  )) +
  scale_x_continuous(breaks = c(0, 0.33, 0.66, 0.9)) + scale_y_continuous(breaks = c(0, 0.25, 0.5, 0.75, 1)) +
  theme_classic() + labs(x = "meta-tx topology", y = "prob of m6A = 1", title = "LR fit with cubic splines")

ZW-xjtlu/WhistleR documentation built on March 13, 2021, 10:50 a.m.