classify_sequence | R Documentation |
The classify_sequence() function implements the Wang et al. naive Bayesian classification algorithm for 16S rRNA gene sequences.
classify_sequence(
  unknown_sequence,
  database,
  kmer_size = 8,
  num_bootstraps = 100
)
unknown_sequence |
A character object representing a DNA sequence that needs to be classified |
database |
A kmer database generated using |
kmer_size |
An integer value (default of 8) indicating the size of kmers to use for classifying sequences. Higher values use more RAM with potentially more specificity. Lower values use less RAM with potentially less specificity. Benchmarking has found that the default of 8 provides the best specificity with the lowest possible memory requirement and fastest execution time. |
num_bootstraps |
An integer value (default of 100). The value of
|
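The trade-off described for kmer_size can be illustrated with a little base R. The sketch below is not taken from the package: get_kmers() is a hypothetical helper written only for this illustration, and the package may represent kmers differently internally. It lists the overlapping kmers in a short sequence and shows that the number of distinct DNA kmers grows as 4^kmer_size, which is one reason larger values tend to need more memory.

```r
# Hypothetical helper (not part of the documented API): return the
# overlapping kmers of a given size found in a DNA sequence.
get_kmers <- function(sequence, kmer_size) {
  n_kmers <- nchar(sequence) - kmer_size + 1
  if (n_kmers < 1) return(character(0))
  starts <- seq_len(n_kmers)
  substring(sequence, starts, starts + kmer_size - 1)
}

# An 8-base sequence contains 6 overlapping 3-mers
kmers <- get_kmers("ATGCGCTA", kmer_size = 3)
print(kmers)

# Number of distinct DNA kmers possible at several kmer sizes (4^k)
kmer_sizes <- c(3, 6, 8, 10)
possible_kmers <- 4^kmer_sizes
print(data.frame(kmer_size = kmer_sizes, possible_kmers = possible_kmers))
```

At the default kmer_size of 8 there are 65,536 possible kmers. Each increase of one in kmer_size multiplies that count by four.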
A list object of two vectors. One vector (taxonomy) is the taxonomic assignment for each level. The second vector (confidence) is the percentage of the num_bootstraps bootstrap replicates in which the classifier gave the same classification at that level.
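To make the meaning of the confidence value concrete, here is a minimal base R sketch of turning a set of bootstrap classifications at one taxonomic level into a percentage. The bootstrap_calls vector is made-up data for illustration, and the code is an assumption about the general idea, not the package's internal implementation.

```r
# Made-up genus-level calls from 100 bootstrap replicates (illustrative only)
bootstrap_calls <- c(rep("B", 87), rep("A", 13))

# Count how often each taxon was called across the replicates
call_counts <- table(bootstrap_calls)

# The most frequent call and the percentage of replicates that agree with it
consensus <- names(call_counts)[which.max(call_counts)]
confidence <- 100 * max(call_counts) / length(bootstrap_calls)

print(consensus)
print(confidence)
```

With these made-up calls, 87 of the 100 replicates agree on genus "B", so the confidence at that level is 87.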
Wang Q, Garrity GM, Tiedje JM, Cole JR. Naive Bayesian classifier for rapid assignment of rRNA sequences into the new bacterial taxonomy. Appl Environ Microbiol. 2007 Aug;73(16):5261-7. doi:10.1128/AEM.00062-07 PMID: 17586664; PMCID: PMC1950982.
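# Build a small kmer database from three reference sequences and their genera
kmer_size <- 3
sequences <- c("ATGCGCTA", "ATGCGCTC", "ATGCGCTC")
genera <- c("A", "B", "B")
db <- build_kmer_database(sequences, genera, kmer_size)

# Classify an unknown sequence, passing the same kmer_size used for the database
unknown_sequence <- "ATGCGCTC"
classify_sequence(
  unknown_sequence = unknown_sequence,
  database = db,
  kmer_size = kmer_size
)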