fastqKmerSubsetLocs: fastqKmerSubsetLocs function: Counts for a given DNA k-mer...
In seqTools: Analysis of nucleotide, sequence and quality content on fastq files


Reads FASTQ files (optionally compressed) and counts the occurrences of a given subset of DNA k-mers at each position in the sequences. The k-mer subset is given as a vector of k-mer indices. These indices can be obtained from DNA k-mers with the function kMerIndex.

fastqKmerSubsetLocs(filenames, k=4, kIndex)

`filenames`	`character`. Vector of FASTQ file names. Files can be gz compressed.
`k`	`integer`. Length of counted DNA k-mers.
`kIndex`	`integer`. Numeric values which represent indices of DNA k-mers.

The maximum allowed value for k is 12.
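For orientation, there are 4^k distinct DNA k-mers, so the index space grows quickly with k. The following base R sketch illustrates the idea of a k-mer index. It assumes a base-4 encoding (A=0, C=1, G=2, T=3, most significant letter first) with 1-based indices; `toyKmerIndex` is a hypothetical helper, and the numbering actually used by kMerIndex may differ.

```r
# Hypothetical helper (not part of seqTools): maps DNA k-mers to 1-based
# indices, ASSUMING base-4 digits A=0, C=1, G=2, T=3.
toyKmerIndex <- function(kMers) {
  code <- c(A = 0, C = 1, G = 2, T = 3)
  vapply(kMers, function(kmer) {
    digits <- code[strsplit(kmer, "")[[1]]]   # one base-4 digit per letter
    k <- length(digits)
    sum(digits * 4^((k - 1):0)) + 1           # +1 makes the index 1-based
  }, numeric(1), USE.NAMES = FALSE)
}

4^4     # 256 distinct 4-mers
4^12    # 16777216 distinct 12-mers
toyKmerIndex(c("AAAA", "AACC", "TTTT"))   # 1, 6, 256 under the assumed encoding
```

Under this assumed encoding, every k-mer of length k maps to a unique value between 1 and 4^k, which is what makes an integer vector a compact way to name a k-mer subset.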

`list`. The length of the list equals the number of given filenames. The list contains one matrix for each given file. Each matrix has one row for each given kIndex and an additional row with the counts for all other DNA k-mers (labeled other). The number of columns equals the maximal sequence length in the FASTQ file.
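The layout of such a matrix can be illustrated with a small base R sketch on made-up reads. This is not the seqTools implementation: the reads are hypothetical, and the sketch assumes that each k-mer is counted at the position where it starts, which the description above does not state explicitly.

```r
# Toy reads of unequal length (hypothetical data, not read from a FASTQ file).
reads <- c("AAAAC", "AACCGT", "AAAA")
k <- 4
kmerSubset <- c("AAAA", "AACC")   # k-mers to track; all others go to "other"

maxLen <- max(nchar(reads))
counts <- matrix(0L, nrow = length(kmerSubset) + 1, ncol = maxLen,
                 dimnames = list(c(kmerSubset, "other"),
                                 as.character(seq_len(maxLen))))

for (read in reads) {
  # ASSUMPTION: a k-mer is attributed to the position where it starts.
  for (pos in seq_len(max(0, nchar(read) - k + 1))) {
    kmer <- substr(read, pos, pos + k - 1)
    row <- if (kmer %in% kmerSubset) kmer else "other"
    counts[row, pos] <- counts[row, pos] + 1L
  }
}
counts
```

In this sketch the matrix has three rows (two tracked k-mers plus other) and six columns, one per position of the longest read. The last k - 1 columns stay at zero because no complete k-mer can start there.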

Wolfgang Kaisers

Cock PJA, Fields CJ, Goto N, Heuer ML, Rice PM. The Sanger FASTQ file format for sequences with quality scores, and the Solexa/Illumina FASTQ variants. Nucleic Acids Research 2010, Vol. 38, No. 6, 1767-1771.

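library(seqTools)
# Locate the example FASTQ file shipped with the package
basedir <- system.file("extdata", package="seqTools")
fqFile <- file.path(basedir, "test_l6.fq")
k <- 4
# Convert the k-mers of interest into k-mer indices
kMers <- c("AAAA", "AACC", "AAGG")
kIdx <- kMerIndex(kMers)
# One matrix per file: rows are the given k-mers plus "other", columns are positions
res <- fastqKmerSubsetLocs(fqFile, k, kIdx)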