kmer_freq: Measuring positional kmer frequencies


Description

Given a sample of sequences and their corresponding read counts, produce a table of positional kmer frequencies relative to read starts.

Usage

kmer.freq(seqs, counts, L = 50, R = 50, k = 1)

Arguments

seqs

a list of DNAString objects.

counts

a list of numeric vectors giving the read counts that correspond to each sequence in seqs.

L

how many positions to the left of the read start to consider

R

how many positions to the right of the read start to consider

k

the size of each kmer

Details

Sequences and read counts are used to produce a table of aggregate kmer frequencies for each position relative to the read start. The position on which the read starts is numbered 0, positions to the left of the read are negative, and those to the right are positive.
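
The position numbering described above can be made concrete with a small base R sketch. This is a toy illustration, not the seqbias implementation: the function name toy_kmer_freq, the choice to skip kmers that run past the ends of the sequence, and the normalization so that frequencies at each position sum to 1 are all assumptions made for the example. It takes a single plain character string and one numeric vector of read-start counts rather than lists of DNAString objects.

```r
# Toy sketch only: toy_kmer_freq is a hypothetical helper, not part of seqbias.
# s      : a single sequence as a plain character string
# counts : number of reads starting at each position of s
toy_kmer_freq <- function(s, counts, L = 2, R = 2, k = 1) {
  n <- nchar(s)
  stopifnot(length(counts) == n)

  # Every (read start, relative position) pair; position 0 is the read start.
  starts <- which(counts > 0)
  grid <- expand.grid(start = starts, pos = (-L):R)

  # Drop kmers that would fall off either end of the sequence (an assumption).
  from <- grid$start + grid$pos
  keep <- from >= 1 & from + k - 1 <= n
  grid <- grid[keep, ]
  from <- from[keep]

  # Extract the kmer at each position and weight it by the read count.
  grid$seq <- substring(s, from, from + k - 1)
  grid$weight <- counts[grid$start]

  # Aggregate the weights, then normalize within each position (an assumption).
  tab <- aggregate(weight ~ pos + seq, data = grid, FUN = sum)
  tab$freq <- tab$weight / ave(tab$weight, tab$pos, FUN = sum)
  tab[order(tab$pos, tab$seq), c("pos", "seq", "freq")]
}

# Three reads start at position 3 and one read starts at position 7.
res <- toy_kmer_freq("ACGTTCGA", c(0, 0, 3, 0, 0, 0, 1, 0), L = 1, R = 1, k = 1)
print(res)
```

In this toy input both read starts fall on a G preceded by a C, so positions -1 and 0 each contain a single kmer, while position 1 is split between T and A in proportion to the read counts (3 to 1).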

The sequences and counts can be generated with scanFa (from the Rsamtools package) and count.reads, respectively. Sequences that scanFa returns for intervals on the negative strand should be reverse complemented before use. To visualize bias properly, a relatively large random sample of intervals should be generated.

Value

A data frame is returned with columns pos, seq, and freq, where pos gives the position relative to the read start, seq gives the kmer, and freq gives the frequency of that kmer at that position.
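
As a rough sketch of how a table with this layout might be inspected, the base R code below reshapes a long pos/seq/freq data frame into a position-by-kmer matrix and picks the most frequent kmer at each position. The data frame here is a hand-made stand-in with invented values, used only so the example runs without any Bioconductor packages; it is not output from kmer.freq.

```r
# Hypothetical stand-in for a kmer.freq result; the values are made up.
freqs <- data.frame(
  pos  = rep(c(-1, 0, 1), each = 2),
  seq  = rep(c("A", "C"), times = 3),
  freq = c(0.5, 0.5, 0.8, 0.2, 0.3, 0.7),
  stringsAsFactors = FALSE
)

# Wide view: one row per position, one column per kmer.
m <- xtabs(freq ~ pos + seq, data = freqs)
print(m)

# Most frequent kmer at each position (ties resolve to the first row).
top <- do.call(rbind, lapply(split(freqs, freqs$pos), function(d) {
  d[which.max(d$freq), ]
}))
print(top)
```

The wide matrix is convenient for plotting one line per kmer across positions, while the long format returned by the function suits grouped summaries such as the one above.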

Author(s)

Daniel Jones dcjones@cs.washington.edu

See Also

count.reads

Examples

  library(seqbias)    # provides count.reads and kmer.freq
  library(Rsamtools)  # provides FaFile and scanFa
  reads_fn <- system.file( "extra/example.bam", package = "seqbias" )
  ref_fn <- system.file( "extra/example.fa", package = "seqbias" )

  I <- GRanges( c('seq1'), IRanges( c(1), c(5000) ), strand = c('-') )

  ref_f <- FaFile( ref_fn )
  open.FaFile( ref_f )

  seqs <- scanFa( ref_f, I )

  neg_idx <- as.logical( strand(I) == '-' )
  seqs[neg_idx] <- reverseComplement( seqs[neg_idx] )

  counts <- count.reads( reads_fn, I )

  freqs <- kmer.freq(seqs, counts, L = 30, R = 30, k = 2)

seqbias documentation built on Nov. 8, 2020, 5:55 p.m.