taxonomy-methods: Functions for accessing taxonomic data stored in 'rowData'.

taxonomy-methodsR Documentation

Functions for accessing taxonomic data stored in rowData.

Description

These function work on data present in rowData and define a way to represent taxonomic data alongside the features of a SummarizedExperiment.

Usage

TAXONOMY_RANKS

taxonomyRanks(x)

## S4 method for signature 'SummarizedExperiment'
taxonomyRanks(x)

taxonomyRankEmpty(
  x,
  rank = taxonomyRanks(x)[1L],
  empty.fields = c(NA, "", " ", "\t", "-", "_")
)

## S4 method for signature 'SummarizedExperiment'
taxonomyRankEmpty(
  x,
  rank = taxonomyRanks(x)[1],
  empty.fields = c(NA, "", " ", "\t", "-", "_")
)

checkTaxonomy(x, ...)

## S4 method for signature 'SummarizedExperiment'
checkTaxonomy(x)

setTaxonomyRanks(ranks)

getTaxonomyRanks()

getTaxonomyLabels(x, ...)

## S4 method for signature 'SummarizedExperiment'
getTaxonomyLabels(
  x,
  empty.fields = c(NA, "", " ", "\t", "-", "_"),
  with_rank = FALSE,
  make_unique = TRUE,
  resolve_loops = FALSE,
  ...
)

mapTaxonomy(x, ...)

## S4 method for signature 'SummarizedExperiment'
mapTaxonomy(x, taxa = NULL, from = NULL, to = NULL, use_grepl = FALSE)

IdTaxaToDataFrame(from)

Arguments

x

a SummarizedExperiment object

rank

a single character defining a taxonomic rank. Must be a value of taxonomyRanks() function

empty.fields

a character value defining, which values should be regarded as empty. (Default: c(NA, "", " ", "\t")). They will be removed if na.rm = TRUE before agglomeration

...

optional arguments not used currently.

ranks

Avector of ranks to be set

with_rank

TRUE or FALSE: Should the level be add as a suffix? For example: "Phylum:Crenarchaeota" (default: with_rank = FALSE)

make_unique

TRUE or FALSE: Should the labels be made unique, if there are any duplicates? (default: make_unique = TRUE)

resolve_loops

TRUE or FALSE: Should resolveLoops be applied to the taxonomic data? Please note that has only an effect, if the data is unique. (default: resolve_loops = TRUE)

taxa

a character vector, which is used for subsetting the taxonomic information. If no information is found,NULL is returned for the individual element. (default: NULL)

from
  • For mapTaxonomy: a scalar character value, which must be a valid taxonomic rank. (default: NULL)

  • otherwise a Taxa object as returned by IdTaxa

to

a scalar character value, which must be a valid taxonomic rank. (default: NULL)

use_grepl

TRUE or FALSE: should pattern matching via grepl be used? Otherwise literal matching is used. (default: FALSE)

Format

a character vector of length 8 containing the taxonomy ranks recognized. In functions this is used as case insensitive.

Details

taxonomyRanks returns, which columns of rowData(x) are regarded as columns containing taxonomic information.

taxonomyRankEmpty checks, if a selected rank is empty of information.

checkTaxonomy checks, if taxonomy information is valid and whether it contains any problems. This is a soft test, which reports some diagnostic and might mature into a data validator used upon object creation.

getTaxonomyLabels generates a character vector per row consisting of the lowest taxonomic information possible. If data from different levels, is to be mixed, the taxonomic level is prepended by default.

IdTaxaToDataFrame extracts taxonomic results from results of IdTaxa.

mapTaxonomy maps the given features (taxonomic groups; taxa) to the specified taxonomic level (to argument) in rowData of the SummarizedExperiment data object (i.e. rowData(x)[,taxonomyRanks(x)]). If the argument to is not provided, then all matching taxonomy rows in rowData will be returned. This function allows handy conversions between different

Taxonomic information from the IdTaxa function of DECIPHER package are returned as a special class. With as(taxa,"DataFrame") the information can be easily converted to a DataFrame compatible with storing the taxonomic information a rowData. Please note that the assigned confidence information are returned as metatdata and can be accessed using metadata(df)$confidence.

Value

  • taxonomyRanks: a character vector with all the taxonomic ranks found in colnames(rowData(x))

  • taxonomyRankEmpty: a logical value

  • mapTaxonomy: a list per element of taxa. Each element is either a DataFrame, a character or NULL. If all character results have the length of one, a single character vector is returned.

See Also

agglomerateByRank, toTree, resolveLoop

Examples

data(GlobalPatterns)
GlobalPatterns
taxonomyRanks(GlobalPatterns)

checkTaxonomy(GlobalPatterns)

table(taxonomyRankEmpty(GlobalPatterns,"Kingdom"))
table(taxonomyRankEmpty(GlobalPatterns,"Species"))

getTaxonomyLabels(GlobalPatterns[1:20,])

# mapTaxonomy
## returns the unique taxonomic information
mapTaxonomy(GlobalPatterns)
# returns specific unique taxonomic information
mapTaxonomy(GlobalPatterns, taxa = "Escherichia")
# returns information on a single output
mapTaxonomy(GlobalPatterns, taxa = "Escherichia",to="Family")

# setTaxonomyRanks
tse <- GlobalPatterns
colnames(rowData(tse))[1] <- "TAXA1"

setTaxonomyRanks(colnames(rowData(tse)))
# Taxonomy ranks set to: taxa1 phylum class order family genus species 

# getTaxonomyRanks is to get/check if the taxonomic ranks is set to "TAXA1"
getTaxonomyRanks()

microbiome/mia documentation built on April 27, 2024, 4:04 a.m.