Taxa-class: Taxa training and testing objects
In DECIPHER: Tools for curating, analyzing, and manipulating biological sequences


Taxonomic classification is the process of assigning an organism a label that is part of a taxonomic hierarchy (e.g., Phylum, Class, Order, Family, Genus). Here, labels are assigned based on an organism's DNA or RNA sequence at a rank level determined by the classification's confidence. Class Taxa provides objects and functions for storing and viewing training and testing objects used in taxonomic classification.

## S3 method for class 'Taxa'
plot(x,
    y = NULL,
    showRanks = TRUE,
    n = NULL,
    ...)

## S3 method for class 'Taxa'
print(x,
     ...)

## S3 method for class 'Taxa'
x[i, j, threshold]

`x`	An object of class `Taxa` with subclass `Train` or `Test`.
`y`	An (optional) object of class `Taxa` with the opposite subclass as `x`.
`showRanks`	Logical specifying whether to show all rank levels when plotting an object of class `Taxa` and subclass `Test`. If `TRUE` (the default), then ranks are shown as (colored) concentric rings with radial lines delimiting taxa boundaries.
`n`	Numeric vector giving the frequency of each classification if `x` or `y` is an object of subclass `Test`, or the default (`NULL`) to treat all classifications as occurring once. Typically, specifying `n` is useful when the classifications represent varying numbers of observations, e.g., when only unique sequences were originally classified.
`...`	Other optional parameters.
`i`	Numeric or character vector of indices to extract from objects of class `Taxa` with subclass `Test`.
`j`	Numeric or character vector of rank levels to extract from objects of class `Taxa` with subclass `Test`.
`threshold`	Numeric specifying the confidence `threshold` at which to truncate the output taxonomic classifications. Note that `threshold` must be higher than the threshold originally used for classification in order for the classifications to change.

Objects of class Taxa are stored as lists and have either subclass Train or subclass Test. The function LearnTaxa returns an object of subclass Train, while the function IdTaxa can return an object of subclass Test.

Training objects are built from a set of reference sequences with known taxonomic classifications. List elements contain information required by IdTaxa for assigning a classification to test sequences.

Testing objects can be generated by IdTaxa from a training object (subclass Train) and a set of test sequences. Each list element contains the taxon, confidence, and (optionally) rank name of the taxonomic assignment.
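The following base R sketch illustrates the shape of a single testing element and the effect of raising the confidence threshold. The list `assignment` and the helper `truncate_at` are hypothetical stand-ins written for illustration; they are not part of DECIPHER, and the real `[` method may differ in detail (for example, in how truncated lineages are labeled).

```r
# Hypothetical stand-in for one element of a testing object:
# parallel vectors of taxon, confidence, and rank, ordered from root to tip.
assignment <- list(
  taxon = c("Root", "Bacteria", "Proteobacteria", "Gammaproteobacteria"),
  confidence = c(100, 98.2, 85.1, 62.4),
  rank = c("rootrank", "domain", "phylum", "class")
)

# Illustrative helper (an assumption, not DECIPHER's implementation):
# keep the leading rank levels whose confidence meets the threshold.
truncate_at <- function(a, threshold) {
  keep <- cumsum(a$confidence < threshold) == 0
  lapply(a, `[`, keep)
}

res <- truncate_at(assignment, 70)
res$taxon
```

In this sketch, a threshold of 70 drops the class-level assignment because its confidence (62.4) falls below the cutoff, while the higher rank levels are retained.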

The information stored in Taxa can be visualized with the plot function or displayed with print. Only objects of subclass Test can be subsetted without losing their class.

Erik Wright eswright@pitt.edu

LearnTaxa, IdTaxa

library(DECIPHER)

# load and plot the training set
data("TrainingSet_16S")
plot(TrainingSet_16S)

# import test sequences
fas <- system.file("extdata", "Bacteria_175seqs.fas", package="DECIPHER")
dna <- readDNAStringSet(fas)

# remove any gaps in the sequences
dna <- RemoveGaps(dna)

# classify the test sequences
ids <- IdTaxa(dna, TrainingSet_16S, strand="top")
ids

plot(ids) # plot all rank levels
plot(ids[, 1:4]) # plot the first four rank levels
plot(ids[j=c("rootrank", "class", "genus")]) # plot specific rank levels
plot(ids[threshold=70]) # plot high confidence classifications
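The `n` argument described above can be used when only unique sequences are classified. The sketch below is one possible way to do this, assuming DECIPHER is installed; the variable names `u`, `n`, and `ids_u` are illustrative, and counting copies through `as.character` is a convenience chosen here rather than a required approach.

```r
library(DECIPHER)
data("TrainingSet_16S")

# import the test sequences and remove any gaps
fas <- system.file("extdata", "Bacteria_175seqs.fas", package="DECIPHER")
dna <- RemoveGaps(readDNAStringSet(fas))

# keep one copy of each distinct sequence
u <- unique(dna)

# count how many original sequences correspond to each unique sequence
n <- tabulate(match(as.character(dna), as.character(u)), nbins=length(u))

# classify only the unique sequences
ids_u <- IdTaxa(u, TrainingSet_16S, strand="top")

# plot the classifications on the training set, weighted by frequency
plot(ids_u, TrainingSet_16S, n=n)
```

Supplying the training object as `y` draws the test classifications together with the training taxonomy, and `n` weights each classification by the number of sequences it represents.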