# calcKL: Calculate the Kullback-Leibler Divergence Between the k-mer... In qrqc: Quick Read Quality Control

## Description

`calcKL` takes an object of a class that inherits from `SequenceSummary` and has a `kmers` slot, and returns the individual terms of the K-L divergence sum. Each term corresponds to one item in the sample space, which in this case is a k-mer.

## Usage

```
calcKL(x)
```

## Arguments

 `x` an S4 object of a class that inherits from `SequenceSummary`.

## Value

`calcKL` returns a `data.frame` with columns:

 `kmer` the k-mer sequence.

 `position` the position in the read.

 `kl` the K-L term for this k-mer in the K-L sum, calculated as `p(i)*log2(p(i)/q(i))`.

 `p` the probability for this k-mer, at this position.

 `q` the probability for this k-mer across all positions.
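To make these column definitions concrete, the sketch below computes terms of the same form in base R from a small made-up count table. The counts, the k-mer names, and the `kl_terms` helper are hypothetical and not part of qrqc. In particular, `q` is estimated here as each k-mer's share of all counts across positions, which is an assumption about the estimator rather than a description of how `calcKL` works internally.

```r
# Hypothetical k-mer counts: rows are k-mers, columns are read positions.
counts <- matrix(c(8, 2, 2,
                   2, 8, 2,
                   2, 2, 8),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(kmer = c("AAA", "ACG", "TTT"),
                                 position = c("1", "2", "3")))

# Hypothetical helper (not part of qrqc): one K-L term per k-mer and position.
# Assumes every count is positive; a zero count would give NaN for its term.
kl_terms <- function(counts) {
  # p: probability of each k-mer within a position (each column sums to 1).
  p <- sweep(counts, 2, colSums(counts), "/")
  # q: probability of each k-mer across all positions (assumed estimator).
  q <- rowSums(counts) / sum(counts)
  # K-L term p * log2(p / q); q is recycled down the rows of p.
  kl <- p * log2(p / q)
  data.frame(kmer = rownames(counts)[row(counts)],
             position = as.integer(colnames(counts)[col(counts)]),
             kl = as.vector(kl),
             p = as.vector(p),
             q = unname(q[row(counts)]),
             stringsAsFactors = FALSE)
}

terms <- kl_terms(counts)

# Summing the terms within a position gives that position's divergence in bits.
kld <- aggregate(kl ~ position, data = terms, FUN = sum)
```

Under these assumptions, `terms` has the same five columns as the `data.frame` described above, with one row per k-mer and position, and `kld` holds the per-position sums.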

## Note

The K-L divergence calculation in `calcKL` uses base 2 in the log; the units are in bits.
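As a brief illustration of what the choice of base means, the sketch below computes a K-L divergence once with `log2` (bits) and once with the natural log (nats) for two made-up distributions. The vectors `p` and `q` are hypothetical and unrelated to any qrqc data.

```r
# Hypothetical distributions over the same three outcomes.
p <- c(0.5, 0.25, 0.25)
q <- c(1/3, 1/3, 1/3)

kl_bits <- sum(p * log2(p / q))  # base-2 log: result in bits
kl_nats <- sum(p * log(p / q))   # natural log: result in nats

# Changing the base only rescales the result by a constant factor.
bits_from_nats <- kl_nats / log(2)
```

The two results differ only by the factor `log(2)`, so values in bits can be compared with values from tools that report nats after this rescaling.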

## Author(s)

Vince Buffalo <vsbuffalo@ucdavis.edu>

## Examples

```
library(qrqc)     # provides readSeqFile() and calcKL()
library(ggplot2)  # provides ggplot(), geom_line(), scale_y_continuous()

## Load a somewhat contaminated FASTQ file
s.fastq <- readSeqFile(system.file('extdata', 'test.fastq', package='qrqc'),
                       hash.prop=1)

## As with getQual, this function is provided so custom graphics can
## be made easily. For example K-L divergence by position:
kld <- with(calcKL(s.fastq), aggregate(kl, list(position), sum))
colnames(kld) <- c("position", "KL")
p <- ggplot(kld) + geom_line(aes(x=position, y=KL), color="blue")
p + scale_y_continuous("K-L divergence")
```