View source: R/FindTrustKmer.R
The counted kmer frequency file calculated from whole genome sequencing data using the jellyfish count function contains a large number of low-frequency kmers. These kmers are considered untrusted kmers that arise from sequencing errors. FindTrustKmer detects the starting point of trusted kmers and calculates the total number of untrusted and trusted kmers in the dataset.
FindTrustKmer(file, kmer_len)
file | Counted kmer frequency file from
kmer_len | The length of kmers.
This function takes the output of the jellyfish count function. The columns of the data must be renamed to "frequency" and "counts", respectively.
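The column renaming described here can be sketched as follows. This is a minimal illustration, not part of the package: the inline text stands in for a real file, its values are made up, and it assumes the input is a two-column, whitespace-separated table in which the first column is the kmer frequency and the second is the number of distinct kmers observed at that frequency.

```r
# Made-up two-column kmer histogram standing in for a real file on disk.
# Assumption: column 1 is the kmer frequency, column 2 is the number of
# distinct kmers seen at that frequency.
histo_text <- "1 1000
2 200
3 50
4 80
5 120"

# Read the table without a header, then rename the columns to the names
# the documentation says FindTrustKmer expects.
kmer_freq <- read.table(text = histo_text, header = FALSE)
colnames(kmer_freq) <- c("frequency", "counts")

print(kmer_freq)
```

For a real file, `read.table(text = histo_text, ...)` would be replaced by `read.table("path/to/file", ...)`, where the path is whatever your own pipeline produced.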
The function first detects the starting point of trusted kmers, then calculates the total number of trusted and untrusted kmers across the frequencies, and from these totals derives the percentage of trusted and untrusted kmers in the data set.
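The two steps above can be illustrated with a rough sketch. It is not the package's implementation: the documentation does not state how the starting point is detected, so the sketch assumes a common heuristic (the first local minimum of the histogram), assumes that the total number of kmers at a frequency is frequency times counts, and assumes that the valley itself counts as trusted. The helper `find_first_valley` and all data values are hypothetical.

```r
# Toy histogram; the values are made up for illustration.
kmer_freq <- data.frame(
  frequency = 1:8,
  counts    = c(900, 300, 100, 40, 60, 150, 200, 120)
)

# Hypothetical helper. Assumed heuristic: the trusted region starts at the
# first local minimum, i.e. the last point before counts begin to rise.
find_first_valley <- function(counts) {
  rising <- which(diff(counts) > 0)
  if (length(rising) == 0) length(counts) else rising[1]
}

start <- kmer_freq$frequency[find_first_valley(kmer_freq$counts)]

# Assumption: total kmers at each frequency = frequency * counts.
kmers_at_freq <- kmer_freq$frequency * kmer_freq$counts
trusted   <- sum(kmers_at_freq[kmer_freq$frequency >= start])
untrusted <- sum(kmers_at_freq[kmer_freq$frequency <  start])
total     <- trusted + untrusted

# Named vector with totals and percentages, mirroring the quantities the
# documentation lists (names here are illustrative only).
summary_vec <- c(
  total         = total,
  trusted       = trusted,
  untrusted     = untrusted,
  pct_trusted   = 100 * trusted / total,
  pct_untrusted = 100 * untrusted / total
)

print(start)
print(summary_vec)
```

Real histograms are noisier than this toy example, so a production version would likely need smoothing or a minimum-frequency guard before picking the valley.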
This function will return two values.
The first value is the starting point of trusted kmers. This value is useful as input to the PlotKmerFrequency function.
The second value is a vector which contains the total number of all kmers, trusted kmers, and untrusted kmers, as well as the percentage of trusted and untrusted kmers.
N/A
Qiong Liu
More details about the calculation can be found at:
http://koke.asrc.kanazawa-u.ac.jp/HOWTO/kmer-genomesize.html
https://bioinformatics.uconn.edu/genome-size-estimation-tutorial/
Functions GenomeEstimate and PlotKmerFrequency.
# Load the example data. This provides an example data set called a, with a kmer length of 19 bp.
data(a)
FindTrustKmer(a, 19)
# Load the example data. This provides an example data set called b, with a kmer length of 30 bp.
data(b)
FindTrustKmer(b, 30)