View source: R/nested_top_taxa.R
nested_top_taxa | R Documentation |
This function identifies the top n named taxa and the top m named taxa at a nested level in a phyloseq object. Users specify the summary statistic that is used to rank the taxa, e.g. sum, mean or median. Furthermore, it is possible to add one or more grouping factors from the tax_table to get group-specific top n,m taxa.
nested_top_taxa(
ps_obj,
top_tax_level,
nested_tax_level,
n_top_taxa = 1,
n_nested_taxa = 1,
top_merged_label = "Other",
nested_merged_label = "Other <tax>",
by_proportion = T,
...
)
ps_obj | A phyloseq object with an |
top_tax_level | The name of the top taxonomic rank in the phyloseq object |
nested_tax_level | The name of the nested taxonomic rank in the phyloseq object |
n_top_taxa | The number of top taxa to identify at the top level. |
n_nested_taxa | The number of top taxa to identify at the nested level. For ASVs, specify "ASV" |
top_merged_label | Label to assign to the merged top_tax_level taxa |
nested_merged_label | Label to assign to the merged nested_tax_level taxa |
by_proportion | Converts absolute abundances to proportions before calculating the summary statistic (default = |
... | Additional arguments to be passed |
This function first finds the top n named taxa at the top level, after which it merges all other top-level taxa into a single taxon with the merged_label annotation (top_merged_label at the top level, nested_merged_label at the nested level). Next, it loops through each remaining top-level taxon and identifies the top m named taxa at the nested level. If at most m named taxa are available, it returns only those taxa. If more are available, it merges all non-top taxa into a single taxon with the merged_label annotation, together with its top-level annotation.
If no named taxa are available at the nested level, all taxa are merged into a single taxon with the merged_label annotation, together with its top-level annotation. Thus, the merged_label taxon, overall and in each group, represents the combination of taxa without an annotation and taxa with an annotation that were not among the top (n,m) most abundant taxa.
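The two-step selection described above can be sketched in base R on a toy table. This is a simplified illustration, not the function's actual implementation: the data frame, its column names (top, nested, abund) and the way the nested merged label is built with paste() are all assumptions made for the example, and abundances are taken as already summarised per taxon.

```r
# Toy table: one row per taxon with an already-summarised abundance.
# All names and numbers are made up for illustration.
taxa <- data.frame(
  top    = c("A", "A", "A", "B", "B", "C", "D"),
  nested = c("a1", "a2", "a3", "b1", NA, "c1", "d1"),
  abund  = c(30, 20, 5, 15, 10, 12, 8),
  stringsAsFactors = FALSE
)

n <- 2  # number of top-level taxa to keep
m <- 2  # number of nested taxa to keep per top-level taxon

# Step 1: rank top-level taxa by total abundance and keep the n best;
# everything else gets the merged label "Other".
top_totals <- sort(tapply(taxa$abund, taxa$top, sum), decreasing = TRUE)
keep_top   <- names(top_totals)[seq_len(min(n, length(top_totals)))]
taxa$top_label <- ifelse(taxa$top %in% keep_top, taxa$top, "Other")

# Step 2: within each kept top-level taxon, keep the m most abundant named
# nested taxa and merge the rest (including unnamed ones) into "Other <top>".
taxa$nested_label <- NA_character_
for (tl in unique(taxa$top_label)) {
  idx <- which(taxa$top_label == tl)
  if (tl == "Other") {
    taxa$nested_label[idx] <- "Other"
    next
  }
  named <- idx[!is.na(taxa$nested[idx])]
  best  <- named[order(taxa$abund[named], decreasing = TRUE)]
  best  <- best[seq_len(min(m, length(best)))]
  taxa$nested_label[idx]  <- paste("Other", tl)
  taxa$nested_label[best] <- taxa$nested[best]
}

# Step 3: collapse rows that share the same pair of labels.
collapsed <- aggregate(abund ~ top_label + nested_label, data = taxa, FUN = sum)
collapsed <- collapsed[order(collapsed$top_label, -collapsed$abund), ]
print(collapsed)
```

In this toy case the unnamed taxon under B ends up in "Other B", while the whole of C and D is merged into the top-level "Other" taxon, which mirrors how the merged taxa combine unannotated taxa with annotated taxa that missed the cut.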
If nested_tax_level = "ASV", row.names(tax_table(ps_obj)) will be added as an ASV column to the tax_table, unless this column already exists.
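As a rough illustration of that behaviour, the sketch below applies the same idea to a plain character matrix standing in for a taxonomy table. The matrix contents and the variable name tax are made up, and the function's internals may differ.

```r
# Made-up taxonomy matrix with ASV identifiers as row names.
tax <- matrix(
  c("Bacteroidales", "Bacteroidaceae",
    "Clostridiales", "Lachnospiraceae"),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("ASV_1", "ASV_2"), c("Order", "Family"))
)

# Append the row names as an "ASV" column, but only if no such column exists.
if (!"ASV" %in% colnames(tax)) {
  tax <- cbind(tax, ASV = rownames(tax))
}
print(tax)
```

Running the conditional a second time leaves the matrix unchanged, because the ASV column is already present.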
The top taxa can be identified based on the absolute abundances or proportions. When using absolute abundances, please make sure to normalize or rarefy the data before using this function. If by_proportion = TRUE, abundances will be converted to relative abundance before applying FUN.
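To see why the conversion matters, here is a small base R sketch with a made-up count matrix in which the two samples differ strongly in sequencing depth. The matrix, the sample names and the choice of the mean as summary statistic are assumptions for the example only.

```r
# Made-up count matrix: rows are taxa, columns are samples.
counts <- matrix(
  c(900, 10,
    100, 40,
    500,  5),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("taxon1", "taxon2", "taxon3"), c("deep", "shallow"))
)

# Convert each sample (column) to relative abundances that sum to 1.
props <- sweep(counts, 2, colSums(counts), "/")

# Rank taxa by mean abundance, with and without the conversion.
rank_abs  <- names(sort(rowMeans(counts), decreasing = TRUE))
rank_prop <- names(sort(rowMeans(props),  decreasing = TRUE))

print(rank_abs)
print(rank_prop)
```

On raw counts the deeply sequenced sample dominates the mean, so the two rankings disagree; this is the reason unnormalized absolute abundances should not be ranked directly.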
A list in which top_taxa is a tibble with the rank, taxon id, grouping factors, abundance summary statistic and taxonomy of the top taxa and ps_obj is the phyloseq object after collapsing all non-top taxa.
library(phyloseq)  # provides the GlobalPatterns example data set
data(GlobalPatterns)

# Top 3 most abundant orders, top 3 most abundant families over all samples,
# using the mean as the aggregation function
nested_top_taxa(GlobalPatterns,
                top_tax_level = "Order", nested_tax_level = "Family",
                n_top_taxa = 3, n_nested_taxa = 3,
                FUN = mean, na.rm = TRUE)

# Top 1 most abundant genus, top 2 most abundant species per SampleType,
# using the median as the aggregation function
nested_top_taxa(GlobalPatterns,
                top_tax_level = "Genus", nested_tax_level = "Species",
                n_top_taxa = 1, n_nested_taxa = 2, grouping = "SampleType",
                FUN = median)