| gseq.kmer.dist | R Documentation |
Counts the occurrence of all k-mers (of size k) within the specified genomic intervals, optionally excluding masked regions.
gseq.kmer.dist(intervals, k = 6L, mask = NULL)
| intervals | Genomic intervals to analyze. |
| k | Integer k-mer size (1-10). Default is 6. |
| mask | Optional intervals to exclude from counting. Positions within the mask will not contribute to k-mer counts. |
A data frame with columns:
Character string representing the k-mer sequence
Number of occurrences of this k-mer
Only k-mers with count > 0 are included. K-mers containing N bases are not counted.
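To make the counting rules above concrete, here is a minimal base-R sketch that counts k-mers in a plain character string. It is an illustration only, not the package's implementation: the helper name `count_kmers`, the output column names `kmer` and `count`, and the use of overlapping windows are all assumptions made for this example.

```r
# Illustrative only: a base-R k-mer counter for a single string.
# count_kmers() and its column names are hypothetical, not part of the package.
count_kmers <- function(dna, k = 6L) {
  dna <- toupper(dna)
  n <- nchar(dna)
  if (n < k) {
    return(data.frame(kmer = character(0), count = integer(0)))
  }
  # Every overlapping window of width k (assumed behaviour)
  starts <- seq_len(n - k + 1L)
  kmers <- substring(dna, starts, starts + k - 1L)
  # Drop any k-mer that contains an N base
  kmers <- kmers[!grepl("N", kmers, fixed = TRUE)]
  # table() only reports k-mers that occur, so zero counts never appear
  tab <- table(kmers)
  data.frame(
    kmer = as.character(names(tab)),
    count = as.integer(tab),
    stringsAsFactors = FALSE
  )
}

toy <- count_kmers("ACGTNACGT", k = 2L)
toy
```

In this toy input the two windows that overlap the N (`TN` and `NA`) are discarded, leaving `AC`, `CG` and `GT` with two occurrences each. Because only observed k-mers are reported, the result has at most 4^k rows.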
gseq.extract, gseq.kmer
gdb.init_examples()
# Count all 6-mers in the first 10 kb of chr1
intervals <- data.frame(chrom = "chr1", start = 0, end = 10000)
kmer_dist <- gseq.kmer.dist(intervals, k = 6)
head(kmer_dist)
# Count dinucleotides
dinucs <- gseq.kmer.dist(intervals, k = 2)
dinucs
# Count 6-mers again, excluding positions 5000-6000 of chr1 via a mask
mask <- data.frame(chrom = "chr1", start = 5000, end = 6000)
kmer_dist_masked <- gseq.kmer.dist(intervals, k = 6, mask = mask)