agglomerateByPrevalence: Agglomerate data based on population prevalence

agglomerateByPrevalenceR Documentation

Agglomerate data based on population prevalence

Description

Agglomerate data based on population prevalence

Usage

agglomerateByPrevalence(x, ...)

## S4 method for signature 'SummarizedExperiment'
agglomerateByPrevalence(
  x,
  rank = NULL,
  other.name = other_label,
  other_label = "Other",
  ...
)

## S4 method for signature 'TreeSummarizedExperiment'
agglomerateByPrevalence(
  x,
  rank = NULL,
  other.name = other_label,
  other_label = "Other",
  update.tree = FALSE,
  ...
)

Arguments

x

TreeSummarizedExperiment.

...

arguments passed to agglomerateByRank function for SummarizedExperiment objects and other functions. See agglomerateByRank for more details.

rank

Character scalar. Defines a taxonomic rank. Must be a value of taxonomyRanks() function.

other.name

Character scalar. Used as the label for the summary of non-prevalent taxa. (default: "Other")

other_label

Deprecated. use other.name instead.

update.tree

Logical scalar. Should rowTree() also be merged? (Default: FALSE)

Details

agglomerateByPrevalence sums up the values of assays at the taxonomic level specified by rank (by default the highest taxonomic level available) and selects the summed results that exceed the given population prevalence at the given detection level. The other summed values (below the threshold) are agglomerated in an additional row taking the name indicated by other.name (by default "Other").

Value

agglomerateByPrevalence returns a taxonomically-agglomerated object of the same class as x and based on prevalent taxonomic results.

Examples

## Data can be aggregated based on prevalent taxonomic results
data(GlobalPatterns)
tse <- GlobalPatterns
tse <- transformAssay(tse, method = "relabundance")
tse <- agglomerateByPrevalence(
    tse,
    rank = "Phylum",
    assay.type = "relabundance",
    detection = 1/100,
    prevalence = 50/100)

tse

# Here data is aggregated at the taxonomic level "Phylum". The five phyla
# that exceed the population prevalence threshold of 50/100 represent the
# five first rows of the assay in the aggregated data. The sixth and last row
# named by default "Other" takes the summed up values of all the other phyla
# that are below the prevalence threshold.

assay(tse)[,1:5]


FelixErnst/mia documentation built on Nov. 18, 2024, 5:02 a.m.