poyuliu/KTU: “KTU” (K-mer Taxonomic Unit) algorithm improves biological relevance in microbiome associated study

The 16S rDNA amplicon sequencing is widely implemented for microbiome associated studies. Microbiota feature picking algorithms for taxonomic identification and quantification are greatly renewing in recent years. The amplicon sequence variant (ASV) denoising algorithm of unbiased sequence picking replaces the OTU clustering methods. The ASV features can detect and distinguish the biological variations under the species OTU level (≧97% similarity). However, the quantification of single ASV among sequencing samples are sparse and less prevalent in the same biological groups. Here, we introduce a k-mer based, sequence alignment-free algorithm – “KTU” (K-mer Taxonomic Unit) for re-clustering ASVs into taxonomic units with more biological relevance. The “KTU” algorithm was designed with four parts (k-mer frequency counting, k-mer present score or frequency similarity measurement, k-mer feature partitioning, and generating KTU table) and conducted in the R environment. The k-mer frequency counting was conducted by tetranucleotide frequency of amplicons; 256 tetranucleotide compositions were then converted to a 0-to-1 proportion. The similarities of k-mer frequency among amplicons were measured by cosine similarity. The similarity matrix then was converted to the distance matrix for the subsequent step. The KTUs were detected from the cosine distance matrix by using partition around medoids (PAM) clustering algorithm; the iterative PAM-KTU detecting process found the optimal cluster numbers of KTUs. The final step of the KTU algorithm was aggregating ASVs into the KTUs and generating the KTU table.

Getting started

Package details

AuthorPo-Yu Liu
MaintainerPo-Yu Liu <poyu.liu@gmail.com>
Licensenone
Version1.0.3
Package repositoryView on GitHub
Installation Install the latest version of this package by entering the following in R:
install.packages("remotes")
remotes::install_github("poyuliu/KTU")
poyuliu/KTU documentation built on May 23, 2022, 11:06 a.m.