summaries: Summarizing microbiome data

summaries    R Documentation

Summarizing microbiome data

Description

Several functions are available for querying a SummarizedExperiment object for features of interest, such as the most abundant, unique, or dominant taxa.

Usage

getTopFeatures(
  x,
  top = 5L,
  method = c("mean", "sum", "median"),
  assay.type = assay_name,
  assay_name = "counts",
  na.rm = TRUE,
  ...
)

## S4 method for signature 'SummarizedExperiment'
getTopFeatures(
  x,
  top = 5L,
  method = c("mean", "sum", "median", "prevalence"),
  assay.type = assay_name,
  assay_name = "counts",
  na.rm = TRUE,
  ...
)

getTopTaxa(x, ...)

## S4 method for signature 'SummarizedExperiment'
getTopTaxa(x, ...)

getUniqueFeatures(x, ...)

## S4 method for signature 'SummarizedExperiment'
getUniqueFeatures(x, rank = NULL, ...)

getUniqueTaxa(x, ...)

## S4 method for signature 'SummarizedExperiment'
getUniqueTaxa(x, ...)

countDominantFeatures(x, group = NULL, name = "dominant_taxa", ...)

## S4 method for signature 'SummarizedExperiment'
countDominantFeatures(x, group = NULL, name = "dominant_taxa", ...)

countDominantTaxa(x, ...)

## S4 method for signature 'SummarizedExperiment'
countDominantTaxa(x, ...)

## S4 method for signature 'SummarizedExperiment'
summary(object, assay.type = assay_name, assay_name = "counts")

Arguments

x

A SummarizedExperiment object.

top

A numeric value specifying how many top taxa to return. By default, the top five taxa are returned.

method

The method used to determine the top taxa. One of 'sum', 'mean', 'median' or 'prevalence'. Default is 'mean'.

assay.type

A single character value selecting which assay to use, given as one of the assayNames. By default, count data is expected.

assay_name

A single character value specifying which assay to use for the calculation. (Please use assay.type instead. At some point assay_name will be disabled.)

na.rm

For getTopFeatures, a logical value passed on to the calculation selected with the method argument, controlling whether NA values are removed. Default is TRUE.

...

Additional arguments passed on to agglomerateByRank() when rank is specified for countDominantFeatures.

rank

A single character value defining a taxonomic rank. Must be one of the values returned by taxonomyRanks().

group

An optional grouping variable for the overview. Observations are grouped by this variable, which must be one of the column names of colData.

name

The column name for the features. The default is 'dominant_taxa'.

object

A SummarizedExperiment object.

Details

getTopFeatures extracts the most abundant “FeatureID”s in a SummarizedExperiment object.
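The ranking idea can be illustrated in base R on a toy count matrix. This is only a sketch of the general technique (score each feature with a row-wise statistic, then keep the highest-scoring names), not mia's implementation; the helper name top_features_sketch and the toy data are hypothetical.

```r
# Toy count matrix: rows are features (taxa), columns are samples.
counts <- matrix(
  c(10, 0, 5,
     3, 4, 2,
     0, 0, 9,
    20, 1, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("taxon", 1:4), paste0("sample", 1:3))
)

# Hypothetical helper (not part of mia): rank features by a row-wise statistic.
top_features_sketch <- function(mat, top = 5L,
                                method = c("mean", "sum", "median"),
                                na.rm = TRUE) {
  method <- match.arg(method)
  score <- switch(method,
    mean   = rowMeans(mat, na.rm = na.rm),
    sum    = rowSums(mat, na.rm = na.rm),
    median = apply(mat, 1, median, na.rm = na.rm)
  )
  # Sort from largest to smallest and keep at most 'top' feature names.
  names(sort(score, decreasing = TRUE))[seq_len(min(top, length(score)))]
}

top_mean   <- top_features_sketch(counts, top = 2, method = "mean")
top_median <- top_features_sketch(counts, top = 2, method = "median")
```

The two statistics can disagree: a feature with one very large count ranks high by mean but may rank low by median.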

getUniqueFeatures is a basic function for accessing the distinct taxa present at a particular taxonomic rank.
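As a rough base R analogue, the distinct values at a rank can be read from one column of a taxonomy table. The data frame below stands in for the rowData of a SummarizedExperiment, and the helper unique_taxa_sketch is hypothetical; it only mirrors the sort and na.rm options mentioned in the examples.

```r
# Toy taxonomy table, standing in for the rowData of a SummarizedExperiment.
taxonomy <- data.frame(
  Phylum = c("Firmicutes", "Bacteroidetes", "Firmicutes", NA),
  Genus  = c("Lactobacillus", "Bacteroides", "Clostridium", NA),
  row.names = paste0("taxon", 1:4),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of mia): distinct values at one rank.
unique_taxa_sketch <- function(tax, rank, sort = FALSE, na.rm = FALSE) {
  stopifnot(rank %in% colnames(tax))
  values <- unique(tax[[rank]])
  # Optionally drop missing taxonomy entries.
  if (na.rm) values <- values[!is.na(values)]
  # Optionally sort alphabetically, keeping any remaining NA last.
  if (sort) values <- sort(values, na.last = TRUE)
  values
}

phyla <- unique_taxa_sketch(taxonomy, "Phylum", sort = TRUE, na.rm = TRUE)
```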

countDominantFeatures returns information about the most dominant taxa in a tibble. The information includes their absolute and relative abundances in the whole data set.
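One way to picture such an overview is to find the feature with the largest count in each sample and then tabulate how often each feature comes out on top. The base R sketch below does this on toy data. It assumes that the overview counts the samples a taxon dominates and reports that count as a share of all samples; the column names n and rel_freq are hypothetical and may not match what mia returns.

```r
# Toy count matrix: rows are features (taxa), columns are samples.
counts <- matrix(
  c(10, 0, 5, 1,
     3, 4, 2, 8,
     0, 1, 9, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("taxon", 1:3), paste0("sample", 1:4))
)

# Dominant feature per sample: the row with the largest count in each column.
# which.max() returns the first maximum, so ties are broken by row order here.
dominant <- rownames(counts)[apply(counts, 2, which.max)]

# Tabulate how many samples each feature dominates, and its share of samples.
n <- table(dominant)
overview <- data.frame(
  dominant_taxa = names(n),
  n = as.integer(n),
  rel_freq = as.numeric(n) / ncol(counts),
  stringsAsFactors = FALSE
)

# Most frequently dominant feature first.
overview <- overview[order(-overview$n), ]
```

Grouping by a sample variable would amount to repeating the same tabulation within each group of columns.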

summary returns a summary of the counts for all samples and features in a SummarizedExperiment object.
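The kind of information such a summary holds can be sketched in base R from per-sample and per-feature totals. The statistics chosen here and the element names (samples, features, total_counts, singletons, and so on) are assumptions for illustration only, and plain data frames are used in place of tibbles.

```r
# Toy count matrix: rows are features (taxa), columns are samples.
counts <- matrix(
  c(10, 0, 5,
     3, 4, 2,
     0, 1, 9,
     0, 1, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("taxon", 1:4), paste0("sample", 1:3))
)

# Per-sample totals (library sizes).
sample_totals <- colSums(counts)

# Sample-level summary: overall total and the spread of per-sample totals.
sample_summary <- data.frame(
  total_counts  = sum(counts),
  min_counts    = min(sample_totals),
  max_counts    = max(sample_totals),
  median_counts = median(sample_totals),
  mean_counts   = mean(sample_totals)
)

# Feature-level summary: number of features and number of singletons
# (features observed exactly once across the whole data set).
feature_summary <- data.frame(
  total      = nrow(counts),
  singletons = sum(rowSums(counts) == 1)
)

summary_sketch <- list(samples = sample_summary, features = feature_summary)
```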

Value

getTopFeatures returns a vector of the most abundant “FeatureID”s.

getUniqueFeatures returns a vector of the unique taxa present at a particular rank.

countDominantFeatures returns an overview in a tibble. It lists the dominant taxa in a column named by the name argument, together with their abundance in the data set.

summary returns a list with two tibbles.

Author(s)

Leo Lahti, Tuomas Borman and Sudarshan A. Shetty

See Also

getPrevalentFeatures

perCellQCMetrics, perFeatureQCMetrics, addPerCellQC, addPerFeatureQC, quickPerCellQC

Examples

data(GlobalPatterns)
top_taxa <- getTopFeatures(GlobalPatterns,
                       method = "mean",
                       top = 5,
                       assay.type = "counts")
top_taxa

# Use 'detection' to set the detection threshold when using the prevalence method
top_taxa <- getTopFeatures(GlobalPatterns,
                       method = "prevalence",
                       top = 5,
                       assay.type = "counts",
                       detection = 100)
top_taxa
                       
# Top taxa of a specific rank
getTopFeatures(agglomerateByRank(GlobalPatterns,
                             rank = "Genus",
                             na.rm = TRUE))

# Get an overview of dominant taxa
dominant_taxa <- countDominantFeatures(GlobalPatterns,
                                   rank = "Genus")
dominant_taxa

# Use 'group' to group observations by a column of colData
# and get an overview of dominant taxa within each group
dominant_taxa <- countDominantFeatures(GlobalPatterns,
                                   rank = "Genus",
                                   group = "SampleType",
                                   na.rm = TRUE)

dominant_taxa

# Get an overview of sample and taxa counts
summary(GlobalPatterns, assay.type = "counts")

# Get unique taxa at a particular taxonomic rank
# sort = TRUE sorts the output in alphabetical order
# na.rm = TRUE removes NAs from the output
# sort and na.rm can also be used with getTopFeatures
getUniqueFeatures(GlobalPatterns, "Phylum", sort = TRUE)


microbiome/mia documentation built on April 27, 2024, 4:04 a.m.