trait_clado: Visualize trait enrichment factors within a cladogram based...

View source: R/trait_clado.R

trait_clado R Documentation

Visualize trait enrichment factors within a cladogram based on taxonomic classification

Description

Taxonomic classification data is used to create a cladogram, which is then annotated with trait enrichment factors.

Usage

trait_clado(
  data,
  formula = ~new_kingdom/new_phylum/new_class/new_order/new_family/new_genus/new_species,
  node_color = TRUE,
  filter = NULL,
  node_all = TRUE,
  ladderize = TRUE,
  continuous = "color",
  layout = "circular",
  node_calc = "mean",
  ...
)

Arguments

data

Enrichment table from fungarium::enrichment (i.e., data.frame containing taxonomic classification variables and enrichment values).

formula

Formula describing how the taxonomic variables (e.g., kingdom, phylum, class, etc.) in your enrichment table should be nested, from highest to lowest rank (e.g., ~V1/V2/.../Vn). Default: ~new_kingdom/new_phylum/new_class/new_order/new_family/new_genus/new_species.
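Before plotting, it can help to confirm that every variable named in the formula is a column of the enrichment table. The sketch below uses only base R; the data frame toy and its values are hypothetical and serve only as an illustration, not as part of fungarium.

```r
# Hypothetical enrichment table (placeholder names, made-up values)
toy <- data.frame(
  new_order   = c("order_1", "order_1"),
  new_family  = c("family_1", "family_2"),
  new_genus   = c("genus_1", "genus_2"),
  new_species = c("species_1", "species_2"),
  trait_ratio = c(0.8, 0.6),
  stringsAsFactors = FALSE
)

# Nesting formula, highest rank first
f <- ~new_order/new_family/new_genus/new_species

# all.vars() returns the variable names in the order they appear
ranks <- all.vars(f)
print(ranks)

# Ranks named in the formula that are absent from the table (ideally none)
missing_ranks <- setdiff(ranks, names(toy))
print(missing_ranks)
```

If missing_ranks is not empty, either the formula or the column names of the table need to be adjusted before calling trait_clado.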

node_color

Logical. If TRUE (the default), nodes are colored based on the values in trait_ratio.

filter

Character vector specifying data set filtering parameters. Default is NULL. To select taxa with the highest or lowest values for a certain variable (e.g., trait_ratio, trait_freq, freq), use "high-100-trait_ratio", "high-150-trait_freq", "low-200-freq", etc. To set a value threshold filter (i.e., filter out taxa with less than or more than a certain value for trait_ratio, trait_freq, etc.), use "trait_freq>=5", "freq>20", etc. Full vector example: c("freq>=10","high-150-trait_ratio"). This example filters out all taxa with a freq value below 10 and then selects the 150 remaining taxa with the highest trait_ratio values. Note that inequality filters are always applied before "high" or "low" selections.
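The order of operations described for filter (threshold filters first, then "high" or "low" selection) can be sketched in base R. This is an illustration on a hypothetical table, not the package's internal code; the definition trait_ratio = trait_freq / freq is an assumption made for the toy data.

```r
# Hypothetical enrichment table (made-up values)
toy <- data.frame(
  taxon      = c("sp_a", "sp_b", "sp_c", "sp_d", "sp_e"),
  freq       = c(40, 8, 25, 12, 60),
  trait_freq = c(10, 4, 20, 2, 6),
  stringsAsFactors = FALSE
)
toy$trait_ratio <- toy$trait_freq / toy$freq  # assumed definition

# Step 1: threshold filter, in the spirit of "freq>=10"
kept <- toy[toy$freq >= 10, ]

# Step 2: top-n selection, in the spirit of "high-2-trait_ratio"
kept <- kept[order(kept$trait_ratio, decreasing = TRUE), ]
kept <- head(kept, 2)

print(kept$taxon)
```

Here sp_b has the second-highest ratio in the full table but is dropped by the threshold step, so it never reaches the top-n selection.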

node_all

Logical. If TRUE (the default), all taxa in the original input data (before filtering via the filter argument) are used for calculating node enrichment values. This only matters if filter is non-NULL.

ladderize

Logical. Should the cladogram be ladderized? Default is TRUE. See ape::ladderize.

continuous

Character string specifying the type of continuous node annotation. Default is "color". See ggtree::ggtree.

layout

Character string specifying the type of tree layout. Default is "circular". See ggtree::ggtree.

node_calc

Character string defining the method for calculating trait enrichment for nodes. If "mean" (the default) is selected, node values are calculated as the mean trait enrichment value across all of the lowest level taxa in that group. If "add" is selected, node values are calculated by dividing the sum of all trait-relevant records by the sum of all records for the lowest level taxa in that group.
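The two node_calc methods can give quite different values when the lowest level taxa differ in record counts. The base R sketch below works through one hypothetical group of three species; the values are made up, and trait_ratio = trait_freq / freq is assumed for the illustration.

```r
# Hypothetical lowest-level taxa belonging to one higher-level group
toy <- data.frame(
  species    = c("sp_a", "sp_b", "sp_c"),
  freq       = c(100, 10, 10),  # total records
  trait_freq = c(10, 5, 9),     # trait-relevant records
  stringsAsFactors = FALSE
)
toy$trait_ratio <- toy$trait_freq / toy$freq  # assumed definition

# "mean": average of the per-taxon enrichment values
node_mean <- mean(toy$trait_ratio)

# "add": pooled trait-relevant records divided by pooled total records
node_add <- sum(toy$trait_freq) / sum(toy$freq)

print(c(mean = node_mean, add = node_add))
```

With "mean", each taxon counts equally regardless of how many records it has; with "add", taxa with many records (sp_a here) dominate the node value.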

...

Additional args passed to ggtree. See ggtree::ggtree.

Details

The cladogram is produced using ggtree::ggtree; thus, ggtree arguments can be passed via ..., and ggtree layers (e.g., geom_tiplab, geom_tippoint, etc.) can be added to the result. Additional ggplot layers (e.g., geom_bar, geom_point, etc.) can also be added to the tree as if it were a ggplot object.

Value

Returns a ggtree object.

Note

Choose filter and node_all values carefully. These values control how enrichment values are displayed for higher level taxa (i.e., nodes). If a filter condition is used and node_all is set to TRUE, node enrichment factors will be calculated using all lowest rank taxa in the input data set, prior to filtering (i.e., sum of trait records/sum of total records for all lowest rank taxa in these higher level groups). If node_all is FALSE, only the lowest rank taxa that remain after filtering (which are the taxa that will be plotted as tree tips) will be used to calculate node enrichment factors. The former method is likely to give a more accurate representation of trait enrichment for higher level taxa, especially if your filtering parameters remove a substantial number of lowest rank taxa.

Note that it may be useful to do some data set filtering prior to calling this function. For example, taxa with high collector bias (i.e., high values for the max_bias and max_bias_t metrics; see fungarium::enrichment) can be filtered out before using trait_clado, so that these biased taxa are not used in the calculation of enrichment for higher level taxa.
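The effect of node_all can be illustrated with a small base R sketch. It compares the pooled node value (sum of trait records/sum of total records) computed from all taxa in a group with the value computed only from the taxa that survive a top-n filter. The table and its values are hypothetical, trait_ratio = trait_freq / freq is assumed, and the code mirrors the description above rather than the package's internals.

```r
# Hypothetical lowest-rank taxa in one higher-level group (made-up values)
toy <- data.frame(
  species    = c("sp_a", "sp_b", "sp_c", "sp_d"),
  freq       = c(50, 20, 20, 10),
  trait_freq = c(5, 10, 16, 1),
  stringsAsFactors = FALSE
)
toy$trait_ratio <- toy$trait_freq / toy$freq  # assumed definition

# Taxa kept by a filter in the spirit of "high-2-trait_ratio"
kept <- head(toy[order(toy$trait_ratio, decreasing = TRUE), ], 2)

# Like node_all = TRUE: pool over every taxon in the input data
node_all_true <- sum(toy$trait_freq) / sum(toy$freq)

# Like node_all = FALSE: pool only over the taxa left after filtering
node_all_false <- sum(kept$trait_freq) / sum(kept$freq)

print(c(node_all_true = node_all_true, node_all_false = node_all_false))
```

Because the filter keeps only the most enriched tips, the value computed from the filtered taxa is roughly twice the value computed from the full group in this toy case.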

References

  1. Hunter J. Simpson & Jonathan S. Schilling (2021) Using aggregated field collection data and the novel r package fungarium to investigate fungal fire association, Mycologia, 113:4, 842-855, DOI: 10.1080/00275514.2021.1884816

Examples

library(fungarium)
library(ggtree)
library(ggplot2)

#load sample enrichment data set
data(agaricales_enrich)

#filter out taxa with high collector bias (optional)
agaricales_enrich <- agaricales_enrich[agaricales_enrich$max_bias<=0.75,]
agaricales_enrich <- agaricales_enrich[agaricales_enrich$max_bias_t<=0.75,]

#filter out taxa with low total records ("freq")
agaricales_enrich <- agaricales_enrich[agaricales_enrich$freq>=3,]

#make cladogram
trait_clado(data=agaricales_enrich, continuous="color",
            ladderize=TRUE, layout="circular", size=0.8,
            formula = ~new_order/new_family/new_genus/new_species,
            filter="high-300-trait_ratio", node_all = TRUE)+
  geom_tiplab2(color = "black", hjust = 0, offset = 0.1,
               size = 1.4, fontface = "italic") + #add species labels
  geom_tippoint(shape=20,
                aes(color=trait_ratio, size=trait_freq),
                alpha=0.75)+#add tree tips
  scale_color_gradientn(colours= c("cyan", "blue", "purple", "red", "orange"),
                        name = "Fire-associated records enrichment",
                        limits = c(0, round(max(agaricales_enrich$trait_ratio),2)),
                        guide = guide_colourbar(label.vjust = 0.6,
                                                label.theme = element_text(size = 10,
                                                                           colour = "black",
                                                                           angle = 0),
                                                title.position = "top",
                                                nbin=100,
                                                draw.ulim = FALSE,
                                                draw.llim = FALSE,
                                                barwidth = 15,
                                                barheight = 0.5))+
  scale_size(name = "Fire-associated records",
             guide = guide_legend(keywidth = 2,
                                  keyheight = 1,
                                  label.position = "bottom",
                                  label.vjust = 0.6,
                                  label.theme = element_text(size = 10,
                                                             colour = "black",
                                                             angle = 0),
                                  title.position = "top")) +
  theme(plot.margin=margin(0,0,0,0),
        legend.title = element_text(size = 10, margin = margin(0,0,0,0)),
        legend.title.align = 0.5,
        legend.position = "bottom",
        legend.justification = "center",
        legend.margin = margin(0,0,0,0),
        plot.title = element_text(hjust = 0.5, margin=margin(0,0,0,0)))+
  xlim(c(-1, 4))

hjsimpso/fungarium documentation built on Aug. 23, 2023, 3:59 p.m.