countGenomeKmers: Counting k-mers in DNA sequences.

View source: R/kMer.R

Description

Counts k-mers in selected regions of a vector of DNA sequences. The k-mers are searched in a set of search windows, each defined by a start and a width value. Each value in the start vector defines the left border of a search window, and the corresponding value in the width vector gives its size. From each position inside a search window, one DNA k-mer is read towards the right-hand side of the given DNA sequence. The function is intended to count DNA k-mers in selected regions (e.g. exons) on DNA chromosomes while respecting strand orientation.

Usage

countGenomeKmers(dna, seqid, start, width, strand, k)

Arguments

dna

character. Vector of DNA sequences. dna must not contain any characters other than "ATCGN". Capitalization does not matter. When an 'N' character is found, the current DNA k-mer is skipped.

seqid

numeric. Vector of (1-based) indices that select, for each search window, the sequence to analyze inside the given dna vector.

start

numeric. Vector of (1-based) start positions for reading windows.

width

numeric. Vector of window width values.

strand

factor or numeric. The first factor level (or the numeric value 1) is interpreted as (+)-strand. For any other value, the reverse complement sequence is counted (reading in the left direction from the start value).

k

numeric. Number of nucleotides in the tabled DNA motifs. Only a single value is allowed (length(k) must be 1).
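
Two of the argument rules above can be illustrated in base R: the character set allowed in dna, and the reverse complement that is counted for non-(+)-strand windows. This is only a sketch; is_valid_dna and reverse_complement are hypothetical helper names, not functions of seqTools, and the package may perform these steps differently internally.

```r
# Hypothetical helper (not part of seqTools): TRUE for sequences that
# consist only of A, C, G, T or N, in upper or lower case.
is_valid_dna <- function(dna) {
  grepl("^[ACGTN]+$", dna, ignore.case = TRUE)
}

# Hypothetical helper (not part of seqTools): reverse complement of each
# sequence in a character vector. N is left unchanged.
reverse_complement <- function(x) {
  # Replace every base by its complement, keeping the case.
  comp <- chartr("ACGTacgt", "TGCAtgca", x)
  # Reverse the character order of every sequence.
  vapply(strsplit(comp, ""),
         function(ch) paste(rev(ch), collapse = ""),
         character(1))
}

is_valid_dna(c("TTTTTCCCCGGGGAAAA", "acgtn", "ACGU"))  # TRUE TRUE FALSE
reverse_complement("GA")                               # "TC"
```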

Details

The function returns a matrix. Each column contains the motif counts for one frame. Each row represents one DNA motif, whose sequence is given as the row name.
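
The counting scheme can be sketched in base R as follows. This is not the seqTools implementation; count_kmers_sketch is a hypothetical name, and the treatment of non-(+)-strand windows (each k-mer ends at the window position and is reverse complemented) is an assumption inferred from the example output shown further down. K-mers that contain 'N' or that would run past either end of the sequence are silently skipped here, which may differ from the behaviour of the package.

```r
# Hypothetical base-R sketch (not the seqTools implementation).
count_kmers_sketch <- function(dna, seqid, start, width, strand, k) {
  stopifnot(length(k) == 1)
  # Reverse complement of a single upper-case sequence.
  revcomp <- function(x) {
    paste(rev(strsplit(chartr("ACGT", "TGCA", x), "")[[1]]), collapse = "")
  }
  # All 4^k motifs in alphabetical order (AA, AC, ..., TT for k = 2).
  grid <- expand.grid(rep(list(c("A", "C", "G", "T")), k),
                      stringsAsFactors = FALSE)
  motifs <- sort(do.call(paste0, unname(as.list(grid))))
  # One row per motif, one column per search window.
  res <- matrix(0L, nrow = length(motifs), ncol = length(start),
                dimnames = list(motifs, as.character(seq_along(start))))
  dna <- toupper(dna)
  strand <- as.integer(strand)  # first factor level or numeric 1 means (+)
  for (i in seq_along(start)) {
    s <- dna[seqid[i]]
    for (pos in seq(start[i], length.out = width[i])) {
      if (strand[i] == 1) {
        # (+)-strand: the k-mer starts at pos and extends to the right.
        kmer <- substr(s, pos, pos + k - 1)
      } else {
        # Assumption: the k-mer ends at pos and is reverse complemented.
        kmer <- revcomp(substr(s, pos - k + 1, pos))
      }
      # Truncated k-mers and k-mers containing N are not in motifs.
      if (kmer %in% motifs) res[kmer, i] <- res[kmer, i] + 1L
    }
  }
  res
}

res <- count_kmers_sketch("TTTTTCCCCGGGGAAAA", c(1, 1), c(6, 14),
                          c(4, 4), c(1, 0), 2)
res[c("CC", "CG", "TC", "TT"), ]
```

For the input of the Examples section, this sketch yields CC = 3 and CG = 1 in column 1 and TC = 1 and TT = 3 in column 2, in agreement with the documented output.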

Value

matrix.

Author(s)

Wolfgang Kaisers

Examples

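library(seqTools)

# One DNA sequence with two search windows of width 4.
sq <- "TTTTTCCCCGGGGAAAA"
seqid <- as.integer(c(1, 1))    # both windows refer to sq
start <- as.integer(c(6, 14))   # left borders of the windows
width <- as.integer(c(4, 4))
strand <- as.integer(c(1, 0))   # 1: (+)-strand, other: reverse complement
k <- 2
countGenomeKmers(sq, seqid, start, width, strand, k)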

Example output

Loading required package: zlibbioc
   1 2
AA 0 0
AC 0 0
AG 0 0
AT 0 0
CA 0 0
CC 3 0
CG 1 0
CT 0 0
GA 0 0
GC 0 0
GG 0 0
GT 0 0
TA 0 0
TC 0 1
TG 0 0
TT 0 3

seqTools documentation built on Nov. 8, 2020, 5:20 p.m.