View source: R/rdpClassifier.R
Classifying sequences by a trained presence/absence K-mer model.
rdpClassify(sequence, trained.model, post.prob = FALSE, prior = FALSE)
sequence | Character vector of sequences to classify.
trained.model | A list with a trained model, see
post.prob | Logical indicating if posterior log-probabilities should be returned.
prior | Logical indicating if classification should be done by flat priors (default) or with empirical priors (prior=TRUE).
The classification step of the presence/absence method known as the RDP classifier
(Wang et al., 2007) looks for K-mers in every sequence and computes the posterior
probability of each taxon, using a trained model and a naive Bayes assumption. For each
sequence, the predicted taxon is the one with the maximum posterior probability.
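The idea described above can be illustrated with a small base-R sketch. This is not the package's implementation: the helper names `kmers`, `toy_train` and `toy_classify` are hypothetical, the toy sequences are made up, and the word-probability smoothing follows the formula in Wang et al. (2007) as an assumption about what a trained model contains. As a further simplification, K-mers never seen in training are ignored.

```r
# Illustrative sketch only (hypothetical helpers, not the package's code).

# Unique K-mers present in one sequence (presence/absence, not counts).
kmers <- function(s, K = 3) {
  n <- nchar(s)
  if (n < K) return(character(0))
  unique(substring(s, 1:(n - K + 1), K:n))
}

# Train: log-probability of observing each K-mer in each taxon.
toy_train <- function(seqs, taxa, K = 3) {
  words <- lapply(seqs, kmers, K = K)
  vocab <- sort(unique(unlist(words)))
  # Presence matrix: one row per sequence, one column per K-mer.
  X <- matrix(unlist(lapply(words, function(w) vocab %in% w)),
              nrow = length(seqs), byrow = TRUE,
              dimnames = list(NULL, vocab))
  # Overall word prior, P_i = (n_i + 0.5) / (N + 1), as in Wang et al. (2007).
  word_prior <- (colSums(X) + 0.5) / (nrow(X) + 1)
  lev <- sort(unique(taxa))
  # Per-taxon word probability, (m_i + P_i) / (M + 1), on the log scale.
  logp <- do.call(rbind, lapply(lev, function(g) {
    idx <- taxa == g
    log((colSums(X[idx, , drop = FALSE]) + word_prior) / (sum(idx) + 1))
  }))
  dimnames(logp) <- list(lev, vocab)
  # Empirical taxon priors (relative frequencies in the training data).
  log_prior <- log(as.numeric(table(factor(taxa, levels = lev))) / length(taxa))
  list(K = K, logp = logp, log_prior = log_prior)
}

# Classify: sum the log-probabilities of the K-mers present in each query
# (naive Bayes assumption) and pick the taxon with the maximum score.
toy_classify <- function(seqs, model, prior = FALSE) {
  vocab <- colnames(model$logp)
  vapply(seqs, function(s) {
    w <- intersect(kmers(s, model$K), vocab)
    score <- rowSums(model$logp[, w, drop = FALSE])
    if (prior) score <- score + model$log_prior  # flat priors otherwise
    names(which.max(score))
  }, character(1), USE.NAMES = FALSE)
}

# Made-up toy data, for illustration only.
seqs <- c("AAAAAAAAAC", "AAAAAAACAA", "GGGGGGGGGT", "GGGGGGTGGG")
taxa <- c("TaxonA", "TaxonA", "TaxonG", "TaxonG")
model <- toy_train(seqs, taxa, K = 3)
pred <- toy_classify(c("AAAAAA", "GGGGGG"), model)
print(pred)
```

With flat priors the taxon prior is the same constant for every taxon, so it can be dropped from the score; setting `prior = TRUE` in the sketch adds the empirical log-priors instead, mirroring the role of the `prior` argument described above.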
The classification is parallelized through RcppParallel,
employing Intel TBB and TinyThread. By default all available
processing cores are used. This can be changed using the
function setParallel.
A character vector with the predicted taxa, one for each sequence.
Kristian Hovde Liland and Lars Snipen.
Wang, Q, Garrity, GM, Tiedje, JM, Cole, JR (2007). Naive Bayesian Classifier for Rapid Assignment of rRNA Sequences into the New Bacterial Taxonomy. Applied and Environmental Microbiology, 73: 5261-5267.
data("small.16S")
seq <- small.16S$Sequence
# The taxon label is the second space-separated token of each header
tax <- sapply(strsplit(small.16S$Header, split = " "), function(x) { x[2] })
## Not run:
trn <- rdpTrain(seq, tax)
primer.515f <- "GTGYCAGCMGCCGCGGTAA"
primer.806rB <- "GGACTACNVGGGTWTCTAAT"
reads <- amplicon(seq, primer.515f, primer.806rB)
# Classify only the non-empty amplicons
predicted <- rdpClassify(unlist(reads[nchar(reads) > 0]), trn)
print(predicted)
## End(Not run)