assign_splice_sites | R Documentation |
This function takes a data frame of intron coordinates and a genome sequence (ideally human or mouse) and returns the data frame with two additional columns holding the donor and acceptor splice site sequences. The sequences are extracted from the specified genome (e.g., human hg38) at positions derived from the intron coordinates, which makes the output useful for downstream analysis of splicing events.
assign_splice_sites(input, genome = BSgenome.Hsapiens.UCSC.hg38, verbose = TRUE)
input |
A data frame containing intron coordinates, such as the output of extract_introns.
|
genome |
The genome sequence (BSgenome object) for the species. Default is the human genome (hg38). This object is required for extracting the consensus sequences from the genome at the specified intron positions. |
verbose |
Logical. If TRUE, the function prints progress messages while preparing the splice site data. Default is TRUE. |
This function performs the following steps:
First, it prepares the splice site sequences for both donor and acceptor sites by calculating their positions based on the strand orientation and intron coordinates. The donor splice site is typically located at the 5' end of the intron, while the acceptor splice site is at the 3' end.
The function uses the getSeq function from the BSgenome package to extract the nucleotide sequences for both donor and acceptor sites from the specified genome (default is hg38 for humans).
The resulting sequences are added as new columns (donor_ss and acceptor_ss) to the original input data frame.
The final data frame includes the splice site sequences for each intron, allowing for analysis of splicing efficiency or identification of consensus motifs.
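The strand-dependent position logic described above can be illustrated with a minimal base-R sketch. This is not the package's implementation: the toy genome string, the column names (seqnames, start, end, strand), the helper revcomp, and the two-base window are all assumptions made for illustration, and plain substr stands in for getSeq so the example runs without Bioconductor.

```r
# Hypothetical helper: reverse-complement each string in a character vector.
revcomp <- function(x) {
  vapply(
    strsplit(chartr("ACGT", "TGCA", x), ""),
    function(ch) paste(rev(ch), collapse = ""),
    character(1)
  )
}

# Toy "genome" (assumption): one chromosome with a plus-strand intron at
# positions 6-20 and the same intron embedded on the minus strand at 29-43.
intron_seq <- "GTAAGTTTTTTTCAG"
toy_genome <- c(chrT = paste0("AAACC", intron_seq, "GGCCCTTT",
                              revcomp(intron_seq), "AAAAA"))

# Hypothetical intron table; real column names may differ.
introns <- data.frame(
  seqnames = c("chrT", "chrT"),
  start    = c(6L, 29L),
  end      = c(20L, 43L),
  strand   = c("+", "-"),
  stringsAsFactors = FALSE
)

plus <- introns$strand == "+"

# Donor sits at the 5' end of the intron: the lowest coordinates on "+",
# the highest coordinates on "-". The acceptor is the mirror image.
donor_start    <- ifelse(plus, introns$start, introns$end - 1L)
acceptor_start <- ifelse(plus, introns$end - 1L, introns$start)

chrom_seq <- unname(toy_genome[introns$seqnames])
donor     <- substr(chrom_seq, donor_start, donor_start + 1L)
acceptor  <- substr(chrom_seq, acceptor_start, acceptor_start + 1L)

# Minus-strand sequences are reverse-complemented to read 5' to 3'.
introns$donor_ss    <- ifelse(plus, donor, revcomp(donor))
introns$acceptor_ss <- ifelse(plus, acceptor, revcomp(acceptor))

print(introns)
```

With a real BSgenome object, the substr calls would be replaced by sequence extraction through getSeq, which can handle strand orientation itself when given stranded ranges.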
A data frame containing the original intron data, with two additional columns:
donor_ss: The donor splice site consensus sequence for each intron.
acceptor_ss: The acceptor splice site consensus sequence for each intron.
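As a rough illustration of how the two returned columns might be used downstream, the sketch below tabulates donor/acceptor pairs and flags canonical GT-AG introns. The data frame result_mock is hypothetical stand-in data, not real output of assign_splice_sites, and it assumes each column holds a two-base sequence, which the documentation above does not state.

```r
# Hypothetical stand-in for the returned data frame (values invented
# purely for illustration).
result_mock <- data.frame(
  donor_ss    = c("GT", "GT", "GC", "AT", "GT"),
  acceptor_ss = c("AG", "AG", "AG", "AC", "AG"),
  stringsAsFactors = FALSE
)

# Combine the two columns into one donor-acceptor label per intron.
pair <- paste(result_mock$donor_ss, result_mock$acceptor_ss, sep = "-")

# Count how often each combination occurs, most frequent first.
pair_counts <- sort(table(pair), decreasing = TRUE)
print(pair_counts)

# Flag introns that carry the canonical GT-AG pair.
result_mock$canonical <- pair == "GT-AG"
```

The same tabulation should work on sequences of any fixed length; only the "GT-AG" comparison depends on the two-base assumption.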
extract_introns, find_cryptic_splice_sites
suppressPackageStartupMessages(library(BSgenome.Hsapiens.UCSC.hg38))
file_v1 <- system.file("extdata", "gencode.v1.example.gtf.gz", package = "GencoDymo2")
gtf_v1 <- load_file(file_v1)
introns_df <- extract_introns(gtf_v1)
result <- assign_splice_sites(introns_df, genome = BSgenome.Hsapiens.UCSC.hg38)