agglomerateByPrevalence: Agglomerate data based on population prevalence
In FelixErnst/mia: Microbiome analysis

Description

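Agglomerates features at a chosen taxonomic rank, keeps the taxa that exceed a population prevalence threshold, and pools the remaining taxa into a single row.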

Usage

agglomerateByPrevalence(x, ...)

## S4 method for signature 'SummarizedExperiment'
agglomerateByPrevalence(
  x,
  rank = NULL,
  other.name = other_label,
  other_label = "Other",
  ...
)

## S4 method for signature 'TreeSummarizedExperiment'
agglomerateByPrevalence(
  x,
  rank = NULL,
  other.name = other_label,
  other_label = "Other",
  update.tree = TRUE,
  ...
)

Arguments

`x`	`SummarizedExperiment` or `TreeSummarizedExperiment`.
`...`	Arguments passed to `agglomerateByRank` for `SummarizedExperiment` objects and to other functions, for example `assay.type`, `detection`, and `prevalence` as used in the Examples. See `agglomerateByRank` for more details.
`rank`	`Character scalar`. Defines a taxonomic rank. Must be one of the values returned by `taxonomyRanks()`. (Default: `NULL`)
`other.name`	`Character scalar`. Used as the label for the row that summarizes the non-prevalent taxa. (Default: `"Other"`)
`other_label`	Deprecated. Use `other.name` instead.
`update.tree`	`Logical scalar`. Should `rowTree()` also be merged? (Default: `TRUE`)

Details

`agglomerateByPrevalence` sums the assay values at the taxonomic level specified by `rank` (by default the highest taxonomic level available) and keeps the summed rows whose population prevalence exceeds the given prevalence threshold at the given detection level. All remaining rows, which fall below the threshold, are pooled into one additional row named by `other.name` (by default `"Other"`).
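The pooling step described above can be sketched in base R on a plain matrix. This is only an illustration of the idea, not the package's implementation: the helper `pool_by_prevalence` and the toy data are hypothetical, and the sketch assumes that prevalence is the fraction of samples in which a taxon's value is strictly above `detection` and that a taxon is kept when that fraction is strictly above `prevalence`.

```r
# Hypothetical helper (not part of mia): pool non-prevalent rows of a
# taxa-by-samples matrix into a single row.
pool_by_prevalence <- function(mat, detection, prevalence,
                               other.name = "Other") {
    # Assumption: prevalence = share of samples with a value above detection
    prev <- rowMeans(mat > detection)
    # Assumption: strict comparison against the prevalence threshold
    keep <- prev > prevalence
    # Sum every non-prevalent row into one pooled row
    pooled <- colSums(mat[!keep, , drop = FALSE])
    out <- rbind(mat[keep, , drop = FALSE], pooled)
    rownames(out) <- c(rownames(mat)[keep], other.name)
    out
}

# Made-up relative abundances: rows are taxa, columns are samples
abund <- matrix(
    c(0.50, 0.40, 0.60, 0.55,
      0.30, 0.35, 0.00, 0.25,
      0.15, 0.00, 0.00, 0.005,
      0.00, 0.25, 0.00, 0.00,
      0.05, 0.00, 0.40, 0.195),
    nrow = 5, byrow = TRUE,
    dimnames = list(paste0("Taxon", LETTERS[1:5]), paste0("S", 1:4)))

agg <- pool_by_prevalence(abund, detection = 1/100, prevalence = 50/100)
agg
```

Because the pooled row is a column-wise sum of the dropped rows, the column totals of the result match those of the input, so the samples remain comparable after agglomeration.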

Value

`agglomerateByPrevalence` returns a taxonomically agglomerated object of the same class as `x`, containing the prevalent taxa plus one row for the pooled non-prevalent taxa.

Examples

## Data can be aggregated based on prevalent taxonomic results
data(GlobalPatterns)
tse <- GlobalPatterns
tse <- transformAssay(tse, method = "relabundance")
tse <- agglomerateByPrevalence(
    tse,
    rank = "Phylum",
    assay.type = "relabundance",
    detection = 1/100,
    prevalence = 50/100)

tse

# Here the data are aggregated at the taxonomic level "Phylum". The five phyla
# that exceed the population prevalence threshold of 50/100 make up the first
# five rows of the assay in the aggregated data. The sixth and last row,
# named "Other" by default, holds the summed values of all the phyla that
# fall below the prevalence threshold.

assay(tse)[,1:5]
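The label of the pooled row can be changed with `other.name`. The following is a minimal sketch that assumes the mia package is installed and reuses the settings from the example above; the object names `tse_rel` and `tse_rare` and the label `"Rare"` are chosen here for illustration only.

```r
library(mia)

# Start again from the original data, since `tse` was overwritten above
data(GlobalPatterns, package = "mia")
tse_rel <- transformAssay(GlobalPatterns, method = "relabundance")

# Same thresholds as before, but with a custom label for the pooled row
tse_rare <- agglomerateByPrevalence(
    tse_rel,
    rank = "Phylum",
    assay.type = "relabundance",
    detection = 1/100,
    prevalence = 50/100,
    other.name = "Rare")

# Inspect the row names to see where the custom label appears
rownames(tse_rare)
```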

FelixErnst/mia documentation built on July 16, 2025, 8:08 p.m.