View source: R/sequence_features.R
topMotif | R Documentation |
Per leader, detect whether the leader has a TOP motif at the TSS (5' end of the leader). A TOP motif is defined as a C followed by 4 pyrimidines.
topMotif(seqs, start = 1, stop = max(nchar(seqs)), return.sequence = TRUE)
seqs |
the sequences (character vector or DNAStringSet)
of the start regions of 5' UTRs (leaders).
seqs must have a minimum width of stop - start + 1 to be included.
|
start |
position in seqs to start at (first is 1), default 1. |
stop |
position in seqs to stop at (first is 1), default max(nchar(seqs)), that is the longest sequence length |
return.sequence |
logical, default TRUE: return a data.table with the motif bases as columns in addition to the TOP class. If FALSE, return a character vector. |
If return.sequence == FALSE, a character vector of either TOP, C or OTHER. C means the leader starts with C without being TOP; OTHER means the leader is not TOP and does not start with C. If return.sequence == TRUE (the default), a data.table is returned in which the base at each position of the motif is included as an additional column (called seq1, seq2, etc.), together with an id column called X.gene_id (holding the names of seqs).
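The three classes can be illustrated with a minimal base-R sketch. This is not ORFik's implementation: the helper name classify_top_sketch is hypothetical, pyrimidines are assumed to be C and T (with U treated as T), and the width filter is one reading of the rule that seqs must cover stop - start + 1 positions.

```r
# Hypothetical helper, not part of ORFik: classify leader start regions
# as "TOP", "C" or "OTHER" using base R only.
classify_top_sketch <- function(seqs, start = 1, stop = max(nchar(seqs))) {
  seqs <- toupper(as.character(seqs))
  # Cut out the window start..stop from every sequence
  window <- substr(seqs, start, stop)
  # Assumption: drop sequences too short to fill the whole window
  window <- window[nchar(window) == stop - start + 1]
  # Assumption: tolerate RNA input by treating U as T
  window <- chartr("U", "T", window)
  # TOP: a C followed by 4 pyrimidines (C or T)
  is_top <- grepl("^C[CT]{4}", window)
  # C: starts with C but is not TOP; OTHER: everything else
  ifelse(is_top, "TOP", ifelse(startsWith(window, "C"), "C", "OTHER"))
}

res <- classify_top_sketch(c("CTTCC", "CAGTT", "GCCCC", "CTT"))
print(res)
```

The last input, "CTT", is shorter than the 5-base window and is dropped, so the result has three elements rather than four. The sketch returns only the class labels, which corresponds to the return.sequence == FALSE case.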
## Not run:
if (requireNamespace("BSgenome.Hsapiens.UCSC.hg19")) {
  txdbFile <- system.file("extdata", "hg19_knownGene_sample.sqlite",
                          package = "GenomicFeatures")
  # Extract the leaders (5' UTRs)
  leaders <- loadRegion(txdbFile, "leaders")
  # Should update by CAGE if not already done
  cageData <- system.file("extdata", "cage-seq-heart.bed.bgz",
                          package = "ORFik")
  leadersCage <- reassignTSSbyCage(leaders, cageData)
  # Get region to check
  seqs <- startRegionString(leadersCage, NULL,
                            BSgenome.Hsapiens.UCSC.hg19::Hsapiens, 0, 4)
  topMotif(seqs)
}
## End(Not run)