aggregate_taxa_contributions: Aggregate taxa contributions for visualization

View source: R/taxa_contribution.R

aggregate_taxa_contributions    R Documentation

Aggregate taxa contributions for visualization

Description

Aggregates PICRUSt2 per-taxon contribution data and links it to differential abundance analysis (DAA) results. Optionally maps ASV/OTU IDs to taxonomic names and restricts the output to significant pathways.

Usage

aggregate_taxa_contributions(
  contrib_data,
  taxonomy = NULL,
  tax_level = "Genus",
  top_n = 10,
  daa_results_df = NULL,
  pathway_ids = NULL,
  p_threshold = 0.05
)

Arguments

contrib_data

A data.frame from read_contrib_file or read_strat_file.

taxonomy

Optional data.frame mapping taxon IDs to taxonomy. Supports QIIME2 format (semicolon-delimited taxonomy strings) or DADA2 format (separate columns for each rank).

tax_level

Character. Taxonomic rank for aggregation. One of "Kingdom", "Phylum", "Class", "Order", "Family", "Genus", "Species". Default "Genus".

top_n

Integer. Number of top taxa to keep; the remaining taxa are lumped together as "Other". Default 10.

daa_results_df

Optional data.frame from pathway_daa, used to filter contributions to significant pathways.

pathway_ids

Optional character vector of pathway IDs used to filter contributions. An alternative to daa_results_df.

p_threshold

Numeric. Significance cutoff when using daa_results_df. Default 0.05.
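
As a rough illustration of what the top_n argument implies, the base R sketch below keeps the highest-ranked taxa and lumps the rest into "Other". It does not call the package. Ranking taxa by their summed contribution is an assumption made here for illustration, and the toy column names simply mirror the documented output columns.

```r
# Illustrative sketch only: not the package's internal code.
# Assumption: taxa are ranked by their total contribution across all rows.
contrib <- data.frame(
  taxon_label  = c("A", "B", "C", "D", "A", "B", "C", "D"),
  contribution = c(5, 3, 1, 0.5, 4, 2, 1, 0.5),
  stringsAsFactors = FALSE
)
top_n <- 2

# Total contribution per taxon, then the names of the top_n largest
totals <- tapply(contrib$contribution, contrib$taxon_label, sum)
keep <- names(sort(totals, decreasing = TRUE))[seq_len(min(top_n, length(totals)))]

# Relabel everything outside the top_n as "Other" and re-aggregate
contrib$taxon_label <- ifelse(contrib$taxon_label %in% keep,
                              contrib$taxon_label, "Other")
lumped <- aggregate(contribution ~ taxon_label, data = contrib, FUN = sum)
print(lumped)
```

With top_n = 2, taxa C and D fall outside the cutoff, so their rows are merged under a single "Other" label.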

Details

When daa_results_df is provided, the function:

  1. Extracts significant pathway IDs from the DAA results

  2. Maps pathway IDs to their constituent KO IDs using the internal ko_to_kegg reference

  3. Filters contribution data to only matching KO IDs
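
The three steps above can be sketched in base R as follows. This is a minimal sketch under stated assumptions, not the package's implementation: the DAA column names (feature, p_adjust), the strict "<" comparison, and the pathway_ko lookup table are all hypothetical stand-ins, since the layout of the internal ko_to_kegg reference is not described here.

```r
# Illustrative sketch only. Column names and the lookup table are assumptions.
daa <- data.frame(
  feature  = c("ko00010", "ko00020", "ko00030"),
  p_adjust = c(0.01, 0.20, 0.03),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for the internal pathway-to-KO reference
pathway_ko <- data.frame(
  pathway = c("ko00010", "ko00010", "ko00020", "ko00030"),
  ko      = c("K00001", "K00002", "K00003", "K00004"),
  stringsAsFactors = FALSE
)

contrib <- data.frame(
  sample       = c("S1", "S1", "S2", "S2"),
  function_id  = c("K00001", "K00003", "K00004", "K00005"),
  contribution = c(0.4, 0.1, 0.3, 0.2),
  stringsAsFactors = FALSE
)

p_threshold <- 0.05

# Step 1: significant pathway IDs
sig_pathways <- daa$feature[daa$p_adjust < p_threshold]

# Step 2: KO IDs belonging to those pathways
sig_kos <- unique(pathway_ko$ko[pathway_ko$pathway %in% sig_pathways])

# Step 3: keep only contribution rows whose KO ID matches
filtered <- contrib[contrib$function_id %in% sig_kos, , drop = FALSE]
print(filtered)
```

Only K00001 and K00004 survive here: K00003 belongs to a non-significant pathway, and K00005 has no pathway mapping in the toy lookup.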

Taxonomy can be provided in two formats:

  • QIIME2: A column named Taxon or taxonomy containing semicolon-delimited strings (e.g., "k__Bacteria;p__Firmicutes;...")

  • DADA2: Separate columns for each rank (Kingdom, Phylum, etc.)
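
To make the difference between the two formats concrete, the sketch below builds one toy table in each layout and pulls out the genus from both. The feature_id column name and the extract_rank helper are hypothetical and used only for illustration; the package's own parsing may differ (for example in how it handles prefixes or missing ranks).

```r
# Illustrative sketch only: extract_rank() is a hypothetical helper.

# QIIME2-style: one semicolon-delimited string per taxon
qiime2_tax <- data.frame(
  feature_id = c("ASV1", "ASV2"),
  Taxon = c(
    "k__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__Lactobacillaceae;g__Lactobacillus",
    "k__Bacteria; p__Bacteroidetes; c__Bacteroidia; o__Bacteroidales; f__Bacteroidaceae; g__Bacteroides"
  ),
  stringsAsFactors = FALSE
)

# Split on ";", trim whitespace, and return the field with the given rank prefix
extract_rank <- function(tax_string, prefix = "g__") {
  parts <- trimws(strsplit(tax_string, ";", fixed = TRUE)[[1]])
  hit <- parts[startsWith(parts, prefix)]
  if (length(hit) == 0) return(NA_character_)
  sub(prefix, "", hit[1], fixed = TRUE)
}
qiime2_genus <- vapply(qiime2_tax$Taxon, extract_rank, character(1),
                       USE.NAMES = FALSE)

# DADA2-style: one column per rank, so the rank is a direct column lookup
dada2_tax <- data.frame(
  feature_id = c("ASV1", "ASV2"),
  Kingdom    = c("Bacteria", "Bacteria"),
  Phylum     = c("Firmicutes", "Bacteroidetes"),
  Genus      = c("Lactobacillus", "Bacteroides"),
  stringsAsFactors = FALSE
)
dada2_genus <- dada2_tax[["Genus"]]

print(qiime2_genus)
print(dada2_genus)
```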

Value

A tidy data.frame with columns: sample, function_id, taxon_label, contribution.
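
Because the result is in long (tidy) form, it can be reshaped with base R before plotting. The sketch below uses a hand-built data.frame with the four documented columns, not real function output, and converts contributions into per-sample proportions of the kind a stacked bar chart would show.

```r
# Illustrative sketch only: 'agg' is hand-built to mimic the documented columns.
agg <- data.frame(
  sample       = c("S1", "S1", "S2", "S2"),
  function_id  = rep("K00001", 4),
  taxon_label  = c("ASV1", "Other", "ASV1", "Other"),
  contribution = c(0.6, 0.2, 0.3, 0.3),
  stringsAsFactors = FALSE
)

# Sum contributions into a sample x taxon table
wide <- xtabs(contribution ~ sample + taxon_label, data = agg)

# Convert each sample's row to proportions that sum to 1
share <- prop.table(wide, margin = 1)
print(share)
```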

Examples


# Basic usage with synthetic data
set.seed(42)  # make the random abundances reproducible
contrib <- data.frame(
  sample = rep(c("S1", "S2"), each = 6),
  function_id = rep(c("K00001", "K00002", "K00003"), 4),
  taxon = rep(c("ASV1", "ASV2"), each = 3, times = 2),
  taxon_function_abun = runif(12),
  norm_taxon_function_contrib = runif(12)
)
# taxonomy is left at its NULL default; keep the top 2 taxa
agg <- aggregate_taxa_contributions(contrib, top_n = 2)
head(agg)



ggpicrust2 documentation built on April 10, 2026, 5:06 p.m.