truncateTxsPolyA | R Documentation |
Truncate transcripts at overlapping polyadenylation (polyA) sites to infer their likely 3' ends. This is crucial for designing TAP-seq primers that amplify fragments of specific lengths. Typically, the exons of all annotated transcripts of each target gene are provided as input. If a polyA site overlaps a single transcript of a gene, that transcript is truncated and returned. If a polyA site overlaps multiple transcripts of the same gene, a "meta transcript" consisting of all annotated exons of the overlapping transcripts is generated and truncated. If no polyA site overlaps any transcript of a gene, nothing can be said about which transcripts are expressed; in that case a "meta transcript" consisting of the merged exons of that gene is generated and returned.
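The truncation step described above can be pictured on a toy transcript. The following is only a minimal sketch built on GenomicRanges, not the package's implementation: the coordinates are made up and `clip_at_site()` is a hypothetical helper introduced purely for illustration.

```r
library(GenomicRanges)

# Toy plus-strand transcript with two exons (coordinates are made up).
exons <- GRanges(
  seqnames = "chr1",
  ranges = IRanges(start = c(100, 300), end = c(200, 400)),
  strand = "+",
  transcript_id = rep("tx1", 2),
  exon_number = 1:2
)

# Hypothetical polyA site located inside the second exon.
polyA_site <- GRanges("chr1", IRanges(start = 350, width = 1), strand = "+")

# Hypothetical helper (not part of any package): clip exons at the site and
# drop exons lying entirely beyond it. The 3' end is the right-most base on
# the "+" strand and the left-most base on the "-" strand.
clip_at_site <- function(exons, site) {
  if (all(as.character(strand(exons)) == "-")) {
    restrict(exons, start = start(site))
  } else {
    restrict(exons, end = end(site))
  }
}

truncated <- clip_at_site(exons, polyA_site)
end(truncated)
```

In this sketch the first exon is left untouched and the second exon is shortened so that it ends at the polyA site. The actual function additionally handles multiple transcripts per gene and the generation of meta transcripts.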
truncateTxsPolyA(
transcripts,
polyA_sites,
extend_3prime_end = 0,
polyA_select = c("downstream", "upstream", "score"),
transcript_id = "transcript_id",
gene_id = "gene_id",
exon_number = "exon_number",
ignore_strand = FALSE,
parallel = FALSE
)
## S4 method for signature 'GRanges'
truncateTxsPolyA(
transcripts,
polyA_sites,
extend_3prime_end = 0,
polyA_select = c("downstream", "upstream", "score"),
transcript_id = "transcript_id",
gene_id = "gene_id",
exon_number = "exon_number",
ignore_strand = FALSE,
parallel = FALSE
)
## S4 method for signature 'GRangesList'
truncateTxsPolyA(
transcripts,
polyA_sites,
extend_3prime_end = 0,
polyA_select = c("downstream", "upstream", "score"),
transcript_id = "transcript_id",
gene_id = "gene_id",
exon_number = "exon_number",
ignore_strand = FALSE,
parallel = FALSE
)
transcripts |
A |
polyA_sites |
A |
extend_3prime_end |
Specifies how far (bp) 3' ends of transcripts should be extended when looking for overlapping polyA sites (default = 0). This enables capturing of polyA sites that occur downstream of annotated 3' ends. |
polyA_select |
Specifies which heuristic should be used to select the polyA site used to truncate the transcripts if multiple overlapping polyA sites are found. By default, the most downstream polyA site ("downstream") is used. |
transcript_id |
(character) Name of the column in the metadata of |
gene_id, exon_number |
(character) Optional names of columns in metadata of
|
ignore_strand |
(logical) Specifies whether the strand of polyA sites should be ignored when looking for overlapping polyA sites. Default is FALSE. |
parallel |
(logical) Triggers parallel computing using the BiocParallel package. Default is FALSE. |
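The three polyA_select options can be illustrated with toy data. This is a hedged base R sketch, not the package's code: `select_site()` and the example values are hypothetical, and it assumes that "upstream" mirrors "downstream" and that "score" picks the site with the highest score.

```r
# Toy polyA sites overlapping one transcript (positions and scores are made up).
sites <- data.frame(pos = c(1200, 1500, 1350), score = c(12, 5, 40))

# Hypothetical helper mirroring the three polyA_select options.
select_site <- function(sites, how = c("downstream", "upstream", "score"),
                        strand = "+") {
  how <- match.arg(how)
  # Order positions in 5' to 3' direction: on the minus strand the most
  # downstream site is the one with the smallest coordinate.
  pos_5to3 <- if (strand == "-") -sites$pos else sites$pos
  idx <- switch(how,
    downstream = which.max(pos_5to3),
    upstream   = which.min(pos_5to3),
    score      = which.max(sites$score)  # assumption: highest score wins
  )
  sites[idx, ]
}

select_site(sites)$pos               # most downstream site on the plus strand
select_site(sites, "upstream")$pos   # most upstream site
select_site(sites, "score")$pos      # site with the highest score
```

Because "downstream" comes first in the default vector, `match.arg()` selects it when no choice is given, which matches the default shown in the usage section.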
Either a GRanges or GRangesList object containing the truncated transcripts.
truncateTxsPolyA(GRanges): Truncate transcripts of one gene provided as GRanges object
truncateTxsPolyA(GRangesList): Truncate transcripts of multiple genes provided as GRangesList
library(TAPseq)
library(GenomicRanges)
# protein-coding exons of genes within chr11 region
data("chr11_genes")
target_genes <- split(chr11_genes, f = chr11_genes$gene_name)
# only retain the first 2 target genes, because truncating transcripts is currently
# computationally quite costly. Try using BiocParallel for parallelization (see ?truncateTxsPolyA).
target_genes <- target_genes[1:2]
# example polyA sites for these genes
data("chr11_polyA_sites")
# truncate target genes at most downstream polyA site (default)
truncated_txs <- truncateTxsPolyA(target_genes, polyA_sites = chr11_polyA_sites)
# change polyA selection to "score" (read coverage of polyA sites) and extend 3' end of target
# genes by 50 bp (see ?truncateTxsPolyA).
truncated_txs <- truncateTxsPolyA(target_genes, polyA_sites = chr11_polyA_sites,
                                  polyA_select = "score", extend_3prime_end = 50)
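The comment in the examples above suggests BiocParallel for parallelization. A possible setup is sketched below; it assumes that a parallel back-end has to be registered before calling the function with parallel = TRUE, and the choice of `MulticoreParam` with 2 workers is arbitrary and only for illustration.

```r
library(TAPseq)
library(GenomicRanges)
library(BiocParallel)

# Same example data as above, restricted to the first 2 target genes.
data("chr11_genes")
data("chr11_polyA_sites")
target_genes <- split(chr11_genes, f = chr11_genes$gene_name)[1:2]

# Assumption: a back-end must be registered before using parallel = TRUE.
# MulticoreParam relies on forking; on Windows SnowParam may be the better choice.
register(MulticoreParam(workers = 2))

# Truncate transcripts with one gene per worker.
truncated_par <- truncateTxsPolyA(target_genes, polyA_sites = chr11_polyA_sites,
                                  parallel = TRUE)
```

With only two genes the overhead of starting workers probably outweighs any gain; parallelization is more likely to pay off for larger target gene panels.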