summary: Summarizing microbiome data

summarizeDominance    R Documentation

Summarizing microbiome data

Description

To query a SummarizedExperiment for interesting features, several functions are available.

Usage

summarizeDominance(x, group = NULL, name = "dominant_taxa", ...)

getUnique(x, ...)

getTop(
  x,
  top = 5L,
  method = c("mean", "sum", "median"),
  assay.type = assay_name,
  assay_name = "counts",
  na.rm = TRUE,
  ...
)

## S4 method for signature 'SummarizedExperiment'
getTop(
  x,
  top = 5L,
  method = c("mean", "sum", "median", "prevalence"),
  assay.type = assay_name,
  assay_name = "counts",
  na.rm = TRUE,
  ...
)

## S4 method for signature 'SummarizedExperiment'
getUnique(x, rank = NULL, ...)

## S4 method for signature 'SummarizedExperiment'
summarizeDominance(x, group = NULL, name = "dominant_taxa", ...)

## S4 method for signature 'SummarizedExperiment'
summary(object, assay.type = assay_name, assay_name = "counts")

Arguments

x

TreeSummarizedExperiment.

group

Character scalar. Groups the observations in the overview. Must be one of the column names of colData. (Default: NULL)

name

Character scalar. A name for the column of the colData where results will be stored. (Default: "dominant_taxa")

...

Additional arguments passed on to agglomerateByRank() when rank is specified for summarizeDominance.

top

Numeric scalar. Determines how many top taxa to return. (Default: 5)

method

Character scalar. Specifies the method used to determine the top taxa. Must be one of "mean", "sum", "median" or "prevalence". (Default: "mean")

assay.type

Character scalar. Specifies the name of the assay used in calculation. (Default: "counts")

assay_name

Deprecated. Use assay.type instead.

na.rm

Logical scalar. Should NA values be omitted? (Default: TRUE)

rank

Character scalar. Defines a taxonomic rank. Must be a value of the output of taxonomyRanks(). (Default: NULL)

object

A SummarizedExperiment object.

Details

getTop extracts the most abundant “FeatureID”s in a SummarizedExperiment object.
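As a rough illustration of the ranking idea, the base R sketch below scores each row of a toy count matrix with a row-wise summary and keeps the names of the highest-scoring rows. It is not mia's implementation: the helper top_features and the toy data are hypothetical, and the "prevalence" method is left out.

```r
# Toy count matrix: rows are features (taxa), columns are samples.
counts <- matrix(
  c( 10,  0, 50,
     30, 30, 30,
    200, 20,  0,
      1,  2,  3),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("taxonA", "taxonB", "taxonC", "taxonD"),
                  c("s1", "s2", "s3"))
)

# Hypothetical helper (not part of mia): rank features by a row-wise summary
# and return the names of the `top` highest-scoring ones.
top_features <- function(mat, top = 5L,
                         method = c("mean", "sum", "median"),
                         na.rm = TRUE) {
  method <- match.arg(method)
  score <- switch(method,
    mean   = rowMeans(mat, na.rm = na.rm),
    sum    = rowSums(mat, na.rm = na.rm),
    median = apply(mat, 1, median, na.rm = na.rm)
  )
  head(names(sort(score, decreasing = TRUE)), top)
}

top_mean   <- top_features(counts, top = 2, method = "mean")
top_median <- top_features(counts, top = 2, method = "median")
```

On this toy matrix the mean and the median rank the features differently, because taxonC has one very large count that raises its mean but not its median. This is why the choice of method can matter.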

getUnique is a basic function for accessing the different taxa present at a particular taxonomic rank.
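Conceptually, this amounts to taking the unique values of one taxonomy column. The following base R sketch shows the idea on a toy taxonomy table; the helper unique_taxa and the table are hypothetical stand-ins, not mia's code, and only the sort and na.rm options mentioned in the examples are mimicked.

```r
# Toy taxonomy table standing in for the taxonomic columns of rowData.
taxonomy <- data.frame(
  Phylum = c("Firmicutes", "Bacteroidetes", NA, "Firmicutes", "Actinobacteria"),
  Genus  = c("Lactobacillus", "Bacteroides", NA, "Clostridium", NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of mia): unique values at one rank.
unique_taxa <- function(tax, rank, sort = FALSE, na.rm = FALSE) {
  stopifnot(rank %in% colnames(tax))
  values <- unique(tax[[rank]])
  if (na.rm) values <- values[!is.na(values)]
  # na.last = TRUE keeps any remaining NA at the end instead of dropping it
  if (sort) values <- sort(values, na.last = TRUE)
  values
}

phyla <- unique_taxa(taxonomy, "Phylum", sort = TRUE, na.rm = TRUE)
```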

summarizeDominance returns information about the most dominant taxa in a tibble. The information includes their absolute and relative abundances in the whole data set.
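One way to picture such an overview is sketched below in base R. It assumes that the dominant taxon of a sample is the feature with the highest count in that sample, and that the overview tallies how many samples each taxon dominates. Both points, along with the column names n and rel_freq, are assumptions made for illustration rather than a description of mia's internals.

```r
# Toy count matrix: rows are taxa, columns are samples.
counts <- matrix(
  c(90,  5, 10, 40,
     5, 80, 20, 30,
     5, 15, 70, 30),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("taxonA", "taxonB", "taxonC"),
                  c("s1", "s2", "s3", "s4"))
)

# Dominant taxon per sample: the row with the largest count in each column.
dominant <- rownames(counts)[apply(counts, 2, which.max)]

# Tally the dominant taxa; a data.frame is used here instead of a tibble
# to keep the sketch free of dependencies.
overview <- as.data.frame(table(dominant_taxa = dominant),
                          responseName = "n",
                          stringsAsFactors = FALSE)
overview$rel_freq <- overview$n / sum(overview$n)
overview <- overview[order(overview$n, decreasing = TRUE), ]
```

Here taxonA dominates two of the four samples, so it appears first with a relative frequency of 0.5.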

summary returns a summary of the counts for all samples and features in a SummarizedExperiment object.
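The base R sketch below gives a feel for what a two-part count summary might contain: one table describing per-sample totals and one describing features. The element names (samples, features) and the particular statistics are illustrative assumptions, not the exact columns that mia returns.

```r
# Toy count matrix: rows are taxa, columns are samples.
counts <- matrix(
  c(10, 0, 1,
     3, 0, 0,
     7, 2, 0,
     0, 0, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("taxonA", "taxonB", "taxonC", "taxonD"),
                  c("s1", "s2", "s3"))
)

sample_totals  <- colSums(counts)   # total counts per sample
feature_totals <- rowSums(counts)   # total counts per feature

# Hypothetical summary layout (names chosen for illustration only).
count_summary <- list(
  samples = data.frame(
    total_counts  = sum(counts),
    min_counts    = min(sample_totals),
    max_counts    = max(sample_totals),
    median_counts = median(sample_totals),
    mean_counts   = mean(sample_totals)
  ),
  features = data.frame(
    total      = nrow(counts),
    singletons = sum(feature_totals == 1)  # features seen exactly once overall
  )
)
```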

Value

getTop returns a vector of the most abundant “FeatureID”s.

getUnique returns a vector of the unique taxa present at a particular rank.

summarizeDominance returns an overview in a tibble. It contains the dominant taxa in a column named by name, together with their abundance in the data set.

summary returns a list with two tibbles.

See Also

getPrevalent

perCellQCMetrics, perFeatureQCMetrics, addPerCellQC, addPerFeatureQC, quickPerCellQC

Examples

data(GlobalPatterns)
top_taxa <- getTop(GlobalPatterns,
                   method = "mean",
                   top = 5,
                   assay.type = "counts")
top_taxa

# Use 'detection' to select detection threshold when using prevalence method
top_taxa <- getTop(GlobalPatterns,
                   method = "prevalence",
                   top = 5,
                   assay.type = "counts",
                   detection = 100)
top_taxa

# Top taxa of a specific rank
getTop(agglomerateByRank(GlobalPatterns,
                         rank = "Genus",
                         na.rm = TRUE))

# Get an overview of the dominant taxa
dominant_taxa <- summarizeDominance(GlobalPatterns,
                                    rank = "Genus")
dominant_taxa

# Use group to group the observations in the overview by a colData column
dominant_taxa <- summarizeDominance(GlobalPatterns,
                                    rank = "Genus",
                                    group = "SampleType",
                                    na.rm = TRUE)
dominant_taxa

# Get an overview of sample and taxa counts
summary(GlobalPatterns, assay.type = "counts")

# Get unique taxa at a particular taxonomic rank
# sort = TRUE sorts the output alphabetically
# na.rm = TRUE removes NAs
# sort and na.rm can also be used with getTop
getUnique(GlobalPatterns, "Phylum", sort = TRUE)


FelixErnst/mia documentation built on Jan. 19, 2025, 2:30 a.m.