rdp | R Documentation |
Use the RDP classifier (Wang et al, 2007) to classify 16S rRNA sequences.
This package contains
currently RDP version 2.14 released in August 2023. The associated data
package rRDPData
contains models trained on
the bacterial and archaeal taxonomy training set No. 19 (see Wang and
Cole, 2024).
rdp(dir = NULL)
## S3 method for class 'RDPClassifier'
predict(object, newdata, confidence = 0.8, rdp_args = "", verbose = FALSE, ...)
trainRDP(x, dir = "classifier", rank = "genus", verbose = FALSE)
removeRDP(object)
dir |
directory where the classifier information is stored. |
object |
a RDPClassifier object. |
newdata |
new data to be classified as a Biostrings::DNAStringSet. |
confidence |
numeric; minimum confidence level for classification. Results with lower confidence are replaced by NAs. Set to 0 to disable. |
rdp_args |
additional RDP arguments for classification (e.g.,
|
verbose |
logical; print additional information. |
... |
additional arguments (currently unused). |
x |
an object of class Biostrings::DNAStringSet with the 16S rRNA sequences for training. |
rank |
Taxonomic rank at which the classification is learned. |
RDP is a naive Bayes classifier using 8-mers as features.
rdp()
creates a default classifier trained with the data shipped with
RDP. Alternatively, a directory with the data for an existing classifier
(created with trainRDP()
) can be supplied.
trainRDP()
creates a new classifier for the data in x
and
stores the classifier information in dir
. The data in x
needs
to have annotations in the following format:
"<ID> <Kingdom>;<Phylum>;<Class>;<Order>;<Family>;<Genus>"
A created classifier can be removed with removeRDP()
. This will
remove the directory which stores the classifier information.
The data for the default 16S rRNA classifier can be found in package
rRDPData
.
rdp()
and trainRDP()
return a RDPClassifier
object.
predict()
returns a data.frame containing the classification results
for each sequence (rows). The data.frame has an attribute called
"confidence"
with a matrix containing the confidence values.
Hahsler M, Nagar A (2020). "rRDP: Interface to the RDP Classifier." R Package, Bioconductor. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.18129/B9.bioc.rRDP")}.
RDP classifier software: https://sourceforge.net/projects/rdp-classifier/
Qiong Wang, George M. Garrity, James M. Tiedje and James R. Cole. Naive Bayesian Classifier for Rapid Assignment of rRNA Sequences into the New Bacterial Taxonomy, Appl. Environ. Microbiol. August 2007 vol. 73 no. 16 5261-5267. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1128/AEM.00062-07")}
Qiong W. and Cole J.R. Updated RDP taxonomy and RDP Classifier for more accurate taxonomic classification, Microbial Ecology, Announcement, 4 March 2024. \Sexpr[results=rd]{tools:::Rd_expr_doi("https://doi.org/10.1128/mra.01063-23")}
### Use the default classifier
seq <- readRNAStringSet(system.file("examples/RNA_example.fasta",
package = "rRDP"
))
## shorten names
names(seq) <- sapply(strsplit(names(seq), " "), "[", 1)
seq
## use rdp for classification (this needs package rRDPData installed)
## > BiocManager::install("rRDPData")
cl_16S <- rdp()
cl_16S
pred <- predict(cl_16S, seq)
pred
attr(pred, "confidence")
### Train a custom RDP classifier on new data
trainingSequences <- readDNAStringSet(
system.file("examples/trainingSequences.fasta", package = "rRDP")
)
customRDP <- trainRDP(trainingSequences)
customRDP
testSequences <- readDNAStringSet(
system.file("examples/testSequences.fasta", package = "rRDP")
)
predict(customRDP, testSequences)
## clean up
removeRDP(customRDP)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.