getDomain: Extractor functions for QIIME taxonomy


getDomain    R Documentation

Extractor functions for QIIME taxonomy

Description

Extract taxonomic information from FASTA headers.

Usage

getDomain(header)
getPhylum(header)
getClass(header)
getOrder(header)
getFamily(header)
getGenus(header)
getTag(header)
getTaxonomy(header)

Arguments

header

A character vector, typically the Header column obtained by reading a FASTA file, where each text contains taxonomy information in the format described under Details.

Details

The ConTax data sets (see package microcontax) are FASTA files where each Header text follows a strict format inherited from QIIME:

It always starts with a short text, a Tag, which is a unique identifier for every sequence. The function getTag will extract this from the header.
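As a rough illustration of the Tag idea, the sketch below takes the first whitespace-delimited token of each header. It is not the package's getTag. The helper name first_token, the assumption that tokens are separated by whitespace, and the two example headers are all made up for illustration.

```r
# Hypothetical helper, not the package's getTag():
# return the first whitespace-delimited token of each header text.
first_token <- function(header) {
  # Skip leading whitespace, capture the first run of non-whitespace
  sub("^\\s*(\\S+).*$", "\\1", header, perl = TRUE)
}

# Made-up headers, assuming whitespace separates the Tag from later tokens
hdr <- c("Seq_1 k__Bacteria;p__Firmicutes;", "Seq_2 some other token")
first_token(hdr)
```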

After the Tag follow one or more tokens. One of these tokens is a taxonomy string with the following format:

"k__<...>;p__<...>;c__<...>;o__<...>;f__<...>;g__<...>;"

where <...> is some proper text. Here is an example of a proper string:

"k__Bacteria;p__Firmicutes;c__Bacilli;o__Bacillales;f__Staphylococcaceae;g__Staphylococcus;"
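To make the format concrete, here is a minimal base R sketch of how one rank could be pulled out of such a string with a regular expression. The helper extract_rank is hypothetical and is not the implementation used by getDomain, ..., getGenus. It assumes each rank is written as a two-character prefix such as "g__" followed by text that ends at the next semicolon.

```r
# Hypothetical helper, not the package's implementation:
# return the text between a rank prefix (e.g. "g__") and the next semicolon.
extract_rank <- function(header, prefix) {
  pattern <- paste0(prefix, "([^;]*);")
  hits <- regmatches(header, regexec(pattern, header))
  # Each hit is c(full match, captured group), or empty when nothing matched
  vapply(hits,
         function(m) if (length(m) == 2) m[2] else NA_character_,
         character(1))
}

tax <- "k__Bacteria;p__Firmicutes;c__Bacilli;o__Bacillales;f__Staphylococcaceae;g__Staphylococcus;"
extract_rank(tax, "g__")   # "Staphylococcus"
extract_rank(tax, "p__")   # "Firmicutes"
```

Headers without the requested prefix give NA in this sketch; how the package functions treat such headers is not stated here.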

The functions getDomain, ..., getGenus extract the corresponding information from the header. getTaxonomy runs all the taxonomy extractors, collects the results in a table and imputes missing taxa with parent taxa.
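One possible reading of "imputes missing taxa with parent taxa" is that an empty rank is filled with the nearest known rank above it. The sketch below shows that reading on a small made-up table. The helper impute_with_parent, the column names and the data are assumptions for illustration, and getTaxonomy may behave differently in detail.

```r
# Hypothetical helper: fill empty or NA ranks with the nearest known parent.
# Columns are assumed to be ordered from the highest rank to the lowest.
impute_with_parent <- function(tab) {
  for (j in seq_len(ncol(tab))[-1]) {
    missing <- is.na(tab[[j]]) | tab[[j]] == ""
    # Working left to right, the parent column has already been filled in
    tab[[j]][missing] <- tab[[j - 1]][missing]
  }
  tab
}

# Made-up table with gaps in the second row
tab <- data.frame(
  Domain = c("Bacteria", "Bacteria"),
  Phylum = c("Firmicutes", "Firmicutes"),
  Class  = c("Bacilli", ""),
  Genus  = c("Staphylococcus", NA),
  stringsAsFactors = FALSE
)
impute_with_parent(tab)
```

In this sketch the second row ends up with "Firmicutes" in both Class and Genus, since Phylum is the lowest rank that is known for it.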

Value

getDomain, ..., getGenus and getTag each return a vector containing the sub-text extracted from each header text. getTaxonomy returns a table with the full taxonomy, one row for each input header.

Author(s)

Lars Snipen.

See Also

contax.trim, medoids.

Examples

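# Load the contax.trim data set
data(contax.trim)
# Extract the unique Tag of each sequence
getTag(contax.trim$Header)
# Extract the genus and phylum of each sequence
getGenus(contax.trim$Header)
getPhylum(contax.trim$Header)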


larssnip/microclass documentation built on Nov. 1, 2023, 2:39 p.m.