extract_ss_motif | R Documentation |
This function extracts splice site motifs (5' splice site (5ss) or 3' splice site (3ss)) from a genomic dataset. It retrieves the donor or acceptor splice site motifs for each intron, based on the strand orientation, and compiles them into a FASTA file, which can be used for further analysis (e.g., MaxEntScan).
extract_ss_motif(input, genome, type, verbose, save_fasta, output_file)
input |
A data frame containing genomic information with the following required columns:
|
genome |
A genome object from the BSgenome package (default is |
type |
A string indicating which splice site motif to extract. One of |
verbose |
Logical; if |
save_fasta |
Logical; if |
output_file |
A string specifying the output file path and name for the FASTA file. If |
This function performs the following steps:
Based on the type argument, the function prepares coordinates for extracting either donor (5ss) or acceptor (3ss) splice site motifs, adjusting the motif start and end positions depending on the strand orientation.
The motif sequences are then extracted from the specified genome using the getSeq function from the BSgenome package.
If save_fasta is TRUE, a FASTA file is generated containing the extracted motifs, with transcript IDs and intron numbers used as FASTA headers.
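The strand-dependent coordinate step can be illustrated with a small base R sketch. It is not the package's implementation: the helper name ss_window, the column names intron_start and intron_end, and the 1-based inclusive coordinates are assumptions made for illustration. The window layout follows the usual MaxEntScan convention (3 exonic + 6 intronic bases for the 9 bp donor motif, 20 intronic + 3 exonic bases for the 23 bp acceptor motif), which is assumed here to match what the function extracts.

```r
# Hypothetical helper: compute strand-aware genomic windows for splice site motifs.
# Assumes 1-based inclusive intron coordinates (first and last intronic base).
ss_window <- function(intron_start, intron_end, strand, type = c("5ss", "3ss")) {
  type <- match.arg(type)
  plus <- strand == "+"
  if (type == "5ss") {
    # The donor site is at the intron's 5' end:
    # the genomic start on "+", the genomic end on "-".
    start <- ifelse(plus, intron_start - 3L, intron_end - 5L)
    end   <- ifelse(plus, intron_start + 5L, intron_end + 3L)
  } else {
    # The acceptor site is at the intron's 3' end:
    # the genomic end on "+", the genomic start on "-".
    start <- ifelse(plus, intron_end - 19L, intron_start - 3L)
    end   <- ifelse(plus, intron_end + 3L, intron_start + 19L)
  }
  data.frame(start = start, end = end, width = end - start + 1L)
}

# Toy introns, one per strand (coordinates are made up).
toy_introns <- data.frame(
  intron_start = c(101L, 501L),
  intron_end   = c(200L, 650L),
  strand       = c("+", "-"),
  stringsAsFactors = FALSE
)

donor    <- ss_window(toy_introns$intron_start, toy_introns$intron_end,
                      toy_introns$strand, "5ss")
acceptor <- ss_window(toy_introns$intron_start, toy_introns$intron_end,
                      toy_introns$strand, "3ss")
print(donor)
print(acceptor)
```

In a real workflow such windows would typically be wrapped in a stranded GRanges object before calling getSeq, so that minus-strand motifs come back reverse-complemented.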
A data frame with:
donor_ss_motif or acceptor_ss_motif: the 9 bp (5'ss) or 23 bp (3'ss) motif sequence.
Genomic coordinates and transcript metadata.
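As a rough illustration of how a result of this shape maps onto FASTA records, the following base R sketch builds header and sequence lines from a toy data frame. The helper to_fasta_lines, the columns transcript_id and intron_number, the header format, and the example sequences are all assumptions for illustration; the package's own FASTA writer may format headers differently.

```r
# Toy result table; only the donor_ss_motif column name comes from the documentation,
# the other column names and all values are placeholders.
motifs <- data.frame(
  transcript_id  = c("tx_A", "tx_B"),
  intron_number  = c(1L, 2L),
  donor_ss_motif = c("CAGGTAAGT", "AAGGTGAGC"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: one header line plus one sequence line per motif.
to_fasta_lines <- function(df, motif_col) {
  headers <- paste0(">", df$transcript_id, "_intron", df$intron_number)
  # rbind() stacks headers over sequences; as.vector() reads column-wise,
  # giving header1, seq1, header2, seq2, ...
  as.vector(rbind(headers, df[[motif_col]]))
}

fasta_lines <- to_fasta_lines(motifs, "donor_ss_motif")

# Write to a temporary file so the sketch leaves nothing behind in the working directory.
out <- tempfile(fileext = ".fa")
writeLines(fasta_lines, out)
cat(readLines(out), sep = "\n")
```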
assign_splice_sites, df_to_fasta
file_v1 <- system.file("extdata", "gencode.v1.example.gtf.gz", package = "GencoDymo2")
gtf_v1 <- load_file(file_v1)
introns <- extract_introns(gtf_v1)
suppressPackageStartupMessages(library(BSgenome.Hsapiens.UCSC.hg38))
# Extract donor splice site motifs
motifs_df <- extract_ss_motif(introns, BSgenome.Hsapiens.UCSC.hg38, "5ss", verbose = FALSE)
# Extract acceptor splice site motifs without saving the FASTA file
motifs_df <- extract_ss_motif(introns, BSgenome.Hsapiens.UCSC.hg38, "3ss", verbose = FALSE, save_fasta = FALSE)