View source: R/Refine_by_annotation.R
refine_transcripts_by_annotation | R Documentation |
Refine called transcripts by an existing transcript model
refine_transcripts_by_annotation(
  hml_tx,
  annot_exons,
  tss,
  pas,
  fusion_tx = GenomicRanges::GRangesList(),
  max_exon_diff = 10,
  tx_flanks_up = c(-100, 100),
  tx_flanks_down = c(-100, 100),
  min_score_2 = 5,
  min_tx_cov = 0.95,
  clust_threshold = 0.8,
  min_overlap_fusion = 0.5
)
hml_tx, annot_exons, fusion_tx |
tss, pas |
max_exon_diff | Positive integer.
tx_flanks_up, tx_flanks_down | Integer vectors of length 2.
min_score_2 | Non-negative numeric.
min_tx_cov | Numeric in the range (0, 1].
clust_threshold | Numeric in the range (0, 1].
min_overlap_fusion | Numeric in the range (0, 1].
List of length 4:
GRanges object (updated HC, MC and LC genes);
GRangesList object (refined HC, MC and LC transcripts);
GRanges object (updated fusion genes);
GRangesList object (updated fusion transcripts).
hml_tx is the called transcript model (the second element of the list returned by the call_transcripts_and_genes() function).
fusion_tx is the called set of fusion transcripts (the fourth element of the list returned by the call_transcripts_and_genes() function).
annot_exons is a known transcript model (returned by e.g. exonsBy(txdb, by = "tx"), where txdb is a TxDb object from the GenomicFeatures package).
The function adjusts the called transcripts using the annotated transcripts:
5'- and 3'-borders of called exons are adjusted to the most similar border of an annotated exon (by no more than max_exon_diff bp);
Annotated transcripts are classified as valid or non-valid. A valid annotated transcript must overlap a called TSS and a called PAS (both with scores above min_score_2) within the tx_flanks_up and tx_flanks_down bp windows around its start and end, respectively;
5'- and/or 3'-borders of called MC and LC transcripts that lack overlap with a TSS and/or PAS are adjusted to the borders of the most similar mate among the valid annotated transcripts (provided that at least a min_tx_cov fraction of the called transcript is covered by the annotated mate);
Valid annotated transcripts that do not overlap any called transcript are copied from the annotation to the called HC transcript set.
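The first two steps above can be illustrated with a minimal base R sketch. It is not the package's implementation: the helper names snap_border() and is_valid_tx() are hypothetical, exon borders and TSS/PAS are reduced to single coordinates on the plus strand, and "above min_score_2" is read as a strict inequality.

```r
# Hypothetical helper (not part of the package): move each called exon border
# to the nearest annotated border, but only if the shift is within max_exon_diff bp.
snap_border <- function(called, annotated, max_exon_diff = 10) {
  vapply(called, function(pos) {
    d <- abs(annotated - pos)
    i <- which.min(d)  # integer(0) when there are no annotated borders
    if (length(i) == 1 && d[i] <= max_exon_diff) as.numeric(annotated[i]) else as.numeric(pos)
  }, numeric(1))
}

# Hypothetical helper (not part of the package): an annotated transcript is
# treated as valid when a sufficiently strong TSS lies in the window around its
# start and a sufficiently strong PAS lies in the window around its end.
# Assumption: plus strand, so the start is the 5' end.
is_valid_tx <- function(tx_start, tx_end, tss_pos, tss_score, pas_pos, pas_score,
                        tx_flanks_up = c(-100, 100),
                        tx_flanks_down = c(-100, 100),
                        min_score_2 = 5) {
  in_window <- function(pos, anchor, flanks) {
    pos >= anchor + flanks[1] & pos <= anchor + flanks[2]
  }
  has_tss <- any(in_window(tss_pos, tx_start, tx_flanks_up) & tss_score > min_score_2)
  has_pas <- any(in_window(pas_pos, tx_end, tx_flanks_down) & pas_score > min_score_2)
  has_tss && has_pas
}

# The third border is 500 bp away from any annotated border and stays unchanged.
snapped <- snap_border(c(1003, 2050, 3500), c(1000, 2045, 4000))

# TSS at 950 (score 8) and PAS at 5060 (score 6) both fall inside the windows.
valid <- is_valid_tx(1000, 5000,
                     tss_pos = c(950, 3000), tss_score = c(8, 20),
                     pas_pos = 5060, pas_score = 6)
```

With the default windows of c(-100, 100), a transcript spanning 1000 to 5000 accepts a TSS anywhere in 900..1100 and a PAS anywhere in 4900..5100.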
In addition, the set of fusion transcripts is updated by finding called transcripts that overlap at least two valid annotated transcripts (or one annotated and one called transcript) by at least a min_overlap_fusion fraction of their lengths.
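The overlap criterion can be sketched in base R as follows. This is only one possible reading of the rule, not the package's code: the helper names overlap_fraction() and is_fusion_candidate() are hypothetical, each transcript is reduced to a single start..end interval (introns ignored), and the fraction is measured relative to the length of each annotated transcript.

```r
# Hypothetical helper: fraction of each subject interval (s_start..s_end)
# that is covered by the query interval (q_start..q_end). Coordinates are
# 1-based and inclusive.
overlap_fraction <- function(q_start, q_end, s_start, s_end) {
  ov <- pmax(0, pmin(q_end, s_end) - pmax(q_start, s_start) + 1)
  ov / (s_end - s_start + 1)
}

# Hypothetical helper: a called transcript is a fusion candidate when it covers
# at least min_overlap_fusion of two or more annotated transcripts.
is_fusion_candidate <- function(q_start, q_end, s_start, s_end,
                                min_overlap_fusion = 0.5) {
  frac <- overlap_fraction(q_start, q_end, s_start, s_end)
  sum(frac >= min_overlap_fusion) >= 2
}

# Three annotated transcripts; the called transcript 1000..9000 fully covers
# the first two and only a small part of the third.
annot_start <- c(1000, 5000, 8500)
annot_end   <- c(4000, 8000, 12000)

long_tx  <- is_fusion_candidate(1000, 9000, annot_start, annot_end)
short_tx <- is_fusion_candidate(1000, 4500, annot_start, annot_end)
```

Here long_tx is TRUE because two annotated transcripts pass the 0.5 threshold, while short_tx is FALSE because only the first one does.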