call_transcripts_and_genes: Call transcript and gene models from corrected full-length...


View source: R/Call_transcript.R

Description

Call transcript and gene models from corrected full-length RNA-seq reads

Usage

call_transcripts_and_genes(
  long_reads,
  skip_minor_tx = 0.01,
  max_overlap_called = 0.1,
  min_read_width = 1000,
  min_overlap_fusion = 0.5,
  clust_threshold = 0.8
)

Arguments

long_reads

GRangesList object.

skip_minor_tx

Numeric in the range (0, 1), or NULL.

max_overlap_called

Numeric in the range [0, 1).

min_read_width

Positive integer.

min_overlap_fusion

Numeric in the range (0, 1].

clust_threshold

Numeric in the range (0, 1].

Value

List of length 5:

  1. GRanges object (HC, MC and LC genes);

  2. GRangesList object (HC, MC and LC transcripts);

  3. GRanges object (fusion genes);

  4. GRangesList object (fusion transcripts);

  5. GRangesList object (unused long reads outside of the called genes and transcripts).
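
The five elements are returned by position. A minimal sketch of labelling them for readability, assuming the order listed above; the helper `label_call_results()` and the element names are hypothetical and not part of the package:

```r
# Hypothetical helper: attach descriptive names to the 5-element result.
# The names below are illustrative labels, not names used by the package.
label_call_results <- function(res) {
  stopifnot(is.list(res), length(res) == 5)
  names(res) <- c("genes", "transcripts", "fusion_genes",
                  "fusion_transcripts", "unused_reads")
  res
}

# Intended use (requires the package and a prepared long_reads object):
# res <- label_call_results(call_transcripts_and_genes(long_reads))
# res$fusion_transcripts
```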

Details

The input GRangesList object is returned by the detect_alignment_errors() function.
Long reads that were either marked as truncated by extend_long_reads_to_TSS_and_PAS() or found to contain a misaligned exon by detect_alignment_errors() are excluded from transcript calling. The remaining long reads are collapsed into transcripts. The transcripts are classified into high confidence (HC), medium confidence (MC) and low confidence (LC) groups:

This iterative procedure of transcript calling ensures that highly expressed HC loci are not contaminated with less reliable MC or LC transcripts. The MC and LC transcripts are not guaranteed to be full-length. To decrease the risk of picking up products of partial RNA degradation, MC and LC transcripts are called only from reads longer than min_read_width bp.
The called HC, MC and LC transcripts are clustered into HC, MC and LC genes, respectively. Two transcripts of the same type whose overlap (intersect/union) exceeds clust_threshold are considered to belong to the same gene.
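
The intersect/union ratio can be illustrated with plain genomic intervals. This is a sketch in base R, not the package's implementation: it assumes whole transcript spans are compared (introns ignored), and `interval_overlap()` is a hypothetical helper.

```r
# Overlap of two closed integer intervals as intersect width / union width.
# Assumption: whole transcript spans are compared; introns are ignored.
interval_overlap <- function(start1, end1, start2, end2) {
  inter <- max(0, min(end1, end2) - max(start1, start2) + 1)
  union <- (end1 - start1 + 1) + (end2 - start2 + 1) - inter
  inter / union
}

clust_threshold <- 0.8
a <- interval_overlap(1001, 2000, 1101, 2000)  # 900 / 1000 = 0.9
b <- interval_overlap(1001, 2000, 1501, 2500)  # 500 / 1500, about 0.33
a > clust_threshold  # TRUE: the pair would be assigned to the same gene
b > clust_threshold  # FALSE: the pair would stay in different genes
```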
Within each gene, the minor transcripts (collectively representing up to skip_minor_tx fraction of the reads) are excluded from further consideration. To suppress this behavior, set skip_minor_tx = NULL.
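
One way to read the skip_minor_tx rule is as a cumulative cutoff over the least supported transcripts of a gene. The sketch below assumes that interpretation; `drop_minor_tx()` and the per-transcript read counts are hypothetical and only illustrate the arithmetic.

```r
# Hypothetical helper: drop the least supported transcripts of one gene while
# their combined share of the reads stays at or below skip_minor_tx.
drop_minor_tx <- function(read_counts, skip_minor_tx = 0.01) {
  if (is.null(skip_minor_tx)) return(read_counts)  # NULL disables filtering
  ord <- order(read_counts)                        # least supported first
  cum_frac <- cumsum(read_counts[ord]) / sum(read_counts)
  minor <- ord[cum_frac <= skip_minor_tx]
  if (length(minor) == 0) read_counts else read_counts[-minor]
}

# Made-up read support for four transcripts of one gene (1000 reads in total)
counts <- c(tx1 = 950, tx2 = 45, tx3 = 3, tx4 = 2)
kept <- drop_minor_tx(counts)                      # tx3 + tx4 = 0.5% of reads
kept_all <- drop_minor_tx(counts, skip_minor_tx = NULL)
```

With these counts, tx3 and tx4 together account for 0.5% of the reads and are dropped, whereas adding tx2 would push the cumulative share to 5%, so tx2 is kept.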
Finally, transcripts that overlap at least two other disjoint transcripts by at least min_overlap_fusion fraction of their lengths are considered fusion transcripts.
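
The fusion criterion can be sketched on plain intervals as follows. This is an illustration under stated assumptions, not the package's code: "their lengths" is taken to mean the lengths of the overlapped transcripts, whole spans are compared, and `is_fusion()` is a hypothetical helper.

```r
# cand: c(start, end) of the candidate transcript.
# others: two-column matrix (start, end) of the other transcripts.
is_fusion <- function(cand, others, min_overlap_fusion = 0.5) {
  # Width of the intersection between the candidate and each other transcript
  inter <- pmax(0, pmin(cand[2], others[, 2]) - pmax(cand[1], others[, 1]) + 1)
  # Assumption: the fraction is relative to the overlapped transcript's length
  frac <- inter / (others[, 2] - others[, 1] + 1)
  hit <- others[frac >= min_overlap_fusion, , drop = FALSE]
  if (nrow(hit) < 2) return(FALSE)
  # A mutually disjoint pair exists iff the largest start exceeds the smallest end
  max(hit[, 1]) > min(hit[, 2])
}

others <- rbind(geneA = c(1001, 2000), geneB = c(3001, 4000))
spans_both <- is_fusion(c(1201, 3800), others)  # covers 80% of each gene
spans_one  <- is_fusion(c(1201, 3200), others)  # covers only 20% of geneB
```

Here the first candidate covers 80% of both disjoint genes and is flagged, while the second reaches only 20% of geneB and is not.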


Maxim-Ivanov/TranscriptomeReconstructoR documentation built on Jan. 28, 2021, 11:47 a.m.