composition_df: Prepare a Data Frame for Composition Plots

View source: R/composition_plots.R

composition_df    R Documentation

Prepare a Data Frame for Composition Plots

Description

This function takes a phyloseq object and prepares a data frame that can be used to generate composition plots with ggplot2. It works similarly to the ggformat function in phyloseq.extended.

Usage

composition_df(psobj, rank = "Family",
               keepcols = c("Sample", "Group", "Label", "ID"),
               minprop = 0.05, mean_across_samples = NULL)

Arguments

psobj

A phyloseq object containing raw, potentially agglomerated, counts.

rank

The taxonomic rank across which taxa counts should be summed.

keepcols

Names of columns from sample_data(psobj) to keep in the output data frame.

minprop

Threshold for showing a taxon vs. lumping it into "Other". At least one sample must have at least this proportion of the OTU counts assigned to a given taxon for that taxon to be displayed.

mean_across_samples

An optional grouping variable, indicated as a character string matching one of the column names in keepcols. If provided, samples are lumped within groups. Proportions are averaged across samples, and counts are summed across samples.
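
The minprop rule can be illustrated with a small base R sketch. This is not the package's implementation; the count matrix, the taxon names, and the object names (counts, props, keep, lumped) are made up for illustration. It only shows the stated rule: a taxon is displayed if it reaches minprop in at least one sample, and all remaining taxa are lumped into "Other".

```r
# Toy count matrix (hypothetical data): rows are taxa, columns are samples.
counts <- matrix(c(90,  8, 2,
                   50, 47, 3,
                   96,  1, 3),
                 nrow = 3,
                 dimnames = list(c("FamA", "FamB", "FamC"),
                                 c("S1", "S2", "S3")))

# Convert counts to within-sample proportions (each column sums to 1).
props <- sweep(counts, 2, colSums(counts), "/")

minprop <- 0.05

# A taxon is kept if it reaches minprop in at least one sample.
keep <- apply(props, 1, max) >= minprop

# Taxa below the threshold in every sample are collapsed into one "Other" row.
lumped <- rbind(counts[keep, , drop = FALSE],
                Other = colSums(counts[!keep, , drop = FALSE]))
lumped
```

In this toy data FamB falls below the threshold in two of three samples but is still kept, because the rule only requires one sample to reach minprop.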

Value

A tibble with the following columns:

  • A column labeled ID listing sample names, or, if mean_across_samples is provided, a column with that name listing the groups.

  • A column named the same as rank listing taxa names.

  • A column named Proportion indicating the proportion of OTU counts assigned to a given taxon within a sample or group.

  • A column named Counts indicating the total OTU counts assigned to a given taxon within a sample or group.

  • Any other columns in keepcols. If mean_across_samples is provided, columns are dropped if they have more than one value for a given group.
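
The effect of mean_across_samples on these columns can be sketched in base R. This is an illustration under assumed data, not the function's actual code: the data frame df, its values, and the helper constant_in_group are hypothetical. Proportions are averaged and counts are summed within each group, and a metadata column is only retained if it has a single value within every group.

```r
# Toy long-format data (hypothetical): one row per sample/taxon pair.
df <- data.frame(
  ID         = c("S1", "S1", "S2", "S2", "S3", "S3"),
  Trt        = c("A", "A", "A", "A", "B", "B"),
  Day        = c(0, 0, 7, 7, 0, 0),
  Family     = rep(c("FamA", "FamB"), 3),
  Counts     = c(60, 40, 30, 70, 10, 90),
  Proportion = c(0.6, 0.4, 0.3, 0.7, 0.1, 0.9)
)

# Average proportions and sum counts within each group/taxon combination,
# then join the two summaries on their shared columns (Trt and Family).
grouped <- merge(
  aggregate(Proportion ~ Trt + Family, data = df, FUN = mean),
  aggregate(Counts ~ Trt + Family, data = df, FUN = sum)
)

# Hypothetical helper: TRUE if x has exactly one distinct value in every group.
constant_in_group <- function(x, g) {
  all(tapply(x, g, function(v) length(unique(v))) == 1)
}

# Day takes two values (0 and 7) within treatment A, so it would be dropped.
constant_in_group(df$Day, df$Trt)
grouped
```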

Author(s)

Lindsay Clark

See Also

findnonmissing is used to determine which taxa to label as “Unclassified”.

Examples

## Not run: 
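# Packages used below (magrittr supplies the %>% pipe)
library(magrittr)
library(ggplot2)
library(plotly)

# Columns to keep
kc <- c("Dog", "Trt", "Day", "Breed", "Group", "Label", "ID")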

# Composition plot on individuals, grouped by experimental group
p1 <- composition_df(ps_glom, "Family", minprop = 0.1,
               keepcols = kc) %>%
  ggplot(aes(x = ID, y = Proportion, fill = Family)) +
  geom_col() +
  facet_wrap(~ Group, scales = "free_x") +
  scale_fill_manual(values = dittoSeq::dittoColors(1)) +
  theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5))

ggplotly(p1)

# Composition plot on treatments
p2 <- composition_df(ps_glom, "Family", minprop = 0.1,
               keepcols = kc, mean_across_samples = "Trt") %>%
  ggplot(aes(x = Trt, y = Proportion, fill = Family)) +
  geom_col() +
  scale_fill_manual(values = dittoSeq::dittoColors(1)) +
  theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5))

ggplotly(p2)

## End(Not run)

HPCBio/plotly_microbiome documentation built on May 9, 2022, 11:37 p.m.