PlotKmerFrequency: Plot the kmer frequency distribution


View source: R/PlotKmerFrequency.R

Description

This function uses part of the output from the functions GenomeEstimate and FindTrustKmer to plot the distribution of counted kmers at various frequencies.

Additionally, it indicates the mean kmer coverage and shows the theoretical Poisson distribution of kmer frequency whose mean is the mean kmer coverage.

Usage

PlotKmerFrequency(file, kmer_len, start_point, peak, end_point)

Arguments

file

Counted kmer frequency file from the jellyfish count function. The first and second columns must be named "frequency" and "counts", respectively.

kmer_len

The length of kmers.

start_point

The starting point of trusted kmers. Get this value through the function FindTrustKmer(file,kmer_len).

peak

The mean coverage of kmers. Get this value through function GenomeEstimate(file,kmer_len).

end_point

The end point of the single copy region. Get this value through the function GenomeEstimate(file,kmer_len).

Details

This function takes the output of the jellyfish count function; the columns of the data must be renamed to "frequency" and "counts", respectively.
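As a minimal sketch of the renaming step, assuming the kmer histogram is available as two whitespace-separated columns without a header, and assuming the function accepts a data frame (as the Examples section suggests by passing the dataset b), base R can read the table and name the columns. The values and the object name kmer_table are made up for illustration.

```r
# Hypothetical histogram text: column 1 is the kmer frequency, column 2 is
# the number of distinct kmers observed at that frequency.
histo_text <- "1 5000
2 1200
3 300
4 150
5 220"

# Read the two columns and give them the names the documentation requires.
kmer_table <- read.table(text = histo_text, header = FALSE,
                         col.names = c("frequency", "counts"))

# Inspect the result: a data frame with columns "frequency" and "counts".
str(kmer_table)
```

With a real file on disk, the same call would presumably take the file path in place of the text argument.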

The function uses the output of the functions FindTrustKmer and GenomeEstimate (the starting point of trusted kmers, the mean kmer coverage, and the end point of the single copy region). The theoretical Poisson distribution of counted kmer frequency is based on the calculated single copy region and the mean kmer coverage.
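The following sketch illustrates one way such a theoretical curve could be computed; it is not taken from the package source. It assumes the curve is a Poisson density with mean equal to the peak, scaled by the total number of kmers counted between start_point and end_point. The synthetic table, the scaling choice, and the names in_region, n_single and poisson are all assumptions for illustration.

```r
# Hypothetical inputs; in practice these would come from FindTrustKmer and
# GenomeEstimate.
start_point <- 13
peak <- 36
end_point <- 80

# Synthetic kmer table standing in for real data: a Poisson-shaped main peak
# plus a decaying low-frequency component that mimics error kmers.
frequency <- 1:150
counts <- round(1e6 * dpois(frequency, lambda = peak)) +
  round(5e5 * exp(-frequency / 2))
kmer_table <- data.frame(frequency = frequency, counts = counts)

# Total kmers inside the assumed single copy region [start_point, end_point].
in_region <- kmer_table$frequency >= start_point &
  kmer_table$frequency <= end_point
n_single <- sum(kmer_table$counts[in_region])

# Expected counts under a Poisson distribution whose mean is the peak.
kmer_table$poisson <- n_single * dpois(kmer_table$frequency, lambda = peak)

# Compare observed and expected counts around the peak.
print(kmer_table[kmer_table$frequency %in% 34:38, ])
```

The poisson column could then be drawn over the observed counts with base graphics, for example plot() followed by lines().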

Value

This function returns a plot of the counted kmer frequency distribution. The plot also shows a theoretical Poisson distribution whose mean is the mean kmer coverage.

Note

N/A

Author(s)

Qiong Liu

References

More details about the calculation can be found at:

http://koke.asrc.kanazawa-u.ac.jp/HOWTO/kmer-genomesize.html

https://bioinformatics.uconn.edu/genome-size-estimation-tutorial/

See Also

Functions FindTrustKmer and GenomeEstimate

Examples

# Load the example dataset b
data(b)

# Get the starting point of trusted kmers
FindTrustKmer(b, 30)

# Get the mean kmer coverage and the end point of the single copy region
GenomeEstimate(b, 30)

# Plot the figure
PlotKmerFrequency(file = b, kmer_len = 30, start_point = 13, peak = 36, end_point = 80)

qiongliu1023/GenomeSizeEstimate documentation built on May 14, 2019, 3 a.m.