fastqKmerSubsetLocs: fastqKmerSubsetLocs function: Counts for a given DNA k-mer...


View source: R/seqTools.R

Description

Reads (compressed) FASTQ files and counts, for each position in the sequence, the occurrences of a given subset of DNA k-mers. The subset is specified as a vector of k-mer indices, which can be obtained from DNA k-mers with the function kMerIndex.

Usage

fastqKmerSubsetLocs(filenames, k=4, kIndex)

Arguments

filenames

character. Vector of FASTQ file names. Files can be gzip-compressed.

k

integer. Length of counted DNA k-mers.

kIndex

integer. Numeric values representing indices of DNA k-mers (for example, as returned by kMerIndex).

Details

The maximum allowed value for k is 12.
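For orientation, the DNA alphabet has four letters, so there are 4^k distinct k-mers of length k. The base R sketch below tabulates that count up to the documented maximum; it does not use seqTools, and the source does not state why the limit is 12.

```r
# Number of distinct DNA k-mers (alphabet A, C, G, T) for each allowed k.
k <- 1:12
nKmers <- 4^k
data.frame(k = k, nKmers = nKmers)
```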

Value

list. The length of the list equals the number of given file names; each element is a matrix for the corresponding file. Each matrix has one row for each given kIndex and one additional row (labeled other) with the counts for all other DNA k-mers. The number of columns equals the maximal sequence length in the FASTQ file.
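The layout of one such matrix can be illustrated with a minimal base R sketch. This is not the package's implementation: the function countSubsetLocs and the toy reads are hypothetical, the k-mers are passed as strings rather than as indices, and leaving the last k - 1 columns at zero is an assumption about how positions near the read end are handled.

```r
# Toy reads standing in for the sequences of one FASTQ file (hypothetical data).
reads <- c("AAAACC", "AACCGG", "TTAAAA")
k <- 4
kmers <- c("AAAA", "AACC")  # the k-mer subset of interest

# Count, per start position, how often each subset k-mer occurs.
# Every k-mer outside the subset is tallied in the final "other" row.
countSubsetLocs <- function(reads, k, kmers) {
  maxLen <- max(nchar(reads))
  counts <- matrix(0L, nrow = length(kmers) + 1L, ncol = maxLen,
                   dimnames = list(c(kmers, "other"), NULL))
  for (read in reads) {
    nStart <- nchar(read) - k + 1L
    if (nStart < 1L) next  # read is shorter than k
    for (pos in seq_len(nStart)) {
      kmer <- substr(read, pos, pos + k - 1L)
      row <- match(kmer, kmers, nomatch = length(kmers) + 1L)
      counts[row, pos] <- counts[row, pos] + 1L
    }
  }
  counts
}

toy <- countSubsetLocs(reads, k, kmers)
dim(toy)       # one row per subset k-mer plus "other"; one column per position
rownames(toy)
```

With real data, the analogous object would be one element of the returned list, for example res[[1]] for the first file.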

Author(s)

Wolfgang Kaisers

References

Cock PJA, Fields CJ, Goto N, Heuer ML, Rice PM. The Sanger FASTQ file format for sequences with quality scores, and the Solexa/Illumina FASTQ variants. Nucleic Acids Research 2010, Vol. 38, No. 6, 1767-1771.

Examples

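library(seqTools)
# Directory with the example FASTQ files shipped with the package
basedir <- system.file("extdata", package="seqTools")
k <- 4
kMers <- c("AAAA", "AACC", "AAGG")
# Convert the k-mers to the indices expected by the kIndex argument
kIdx <- kMerIndex(kMers)
# Use a full path instead of changing the working directory with setwd()
res <- fastqKmerSubsetLocs(file.path(basedir, "test_l6.fq"), k, kIdx)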

seqTools documentation built on Nov. 8, 2020, 5:20 p.m.