trait_clado | R Documentation |
Taxonomic classification data is used to create a cladogram, which is then annotated with trait enrichment factors.
trait_clado(
data,
formula = ~new_kingdom/new_phylum/new_class/new_order/new_family/new_genus/new_species,
node_color = TRUE,
filter = NULL,
node_all = TRUE,
ladderize = TRUE,
continuous = "color",
layout = "circular",
node_calc = "mean",
...
)
data |
Enrichment table from |
formula |
Formula describing how the taxonomic variables (e.g., kingdom, phylum, class, etc.) in your enrichment table should be nested (e.g., ~V1/V2/.../Vn). Default: ~new_kingdom/new_phylum/new_class/new_order/new_family/new_genus/new_species. |
node_color |
Logical. If TRUE (the default), nodes are colored based on the values in trait_ratio. |
filter |
Character vector specifying data set filtering parameters. Default is NULL. To select taxa with the highest or lowest values for a certain variable (e.g., trait_ratio, trait_freq, freq) use "high-100-trait_ratio", "high-150-trait_freq", "low-200-freq", etc. To set a value threshold filter (i.e., filter out taxa with less than or more than a certain value for trait_ratio, trait_freq, etc.) use "trait_freq>=5", "freq>20", etc. Full vector example: c("freq>=10","high-150-trait_ratio"). This example first removes all taxa with fewer than 10 total records (freq) and then selects the 150 remaining taxa with the highest trait_ratio values. Note that inequality filters are always applied before "high" or "low" selections. |
node_all |
Logical. If TRUE (the default), all taxa in the original input data (before filtering via the |
ladderize |
Logical. Should the cladogram be ladderized? Default is TRUE. See |
continuous |
Character string. Type of node annotation. Default is "color". See |
layout |
Character string specifying the type of tree layout. Default is "circular". See |
node_calc |
Character string defining the method for calculating trait enrichment for nodes. If "mean" (the default) is selected, node values are calculated as the mean trait enrichment value of all the lowest level taxa in that group. If "add" is selected, node values are calculated by dividing the sum of all trait-relevant records by the sum of all records for the lowest level taxa in that group. |
... |
Additional args passed to ggtree. See |
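The filter syntax described above can be illustrated with a small base R sketch. This is not the code used inside trait_clado; the helper apply_filter_sketch and the toy data are hypothetical, and trait_ratio is assumed here to be simply trait_freq / freq. The sketch only shows the documented order of operations: threshold filters first, then "high"/"low" selections.

```r
# Toy enrichment table; column names mirror those mentioned above,
# values are invented for illustration only.
toy <- data.frame(
  new_species = c("sp_a", "sp_b", "sp_c", "sp_d", "sp_e"),
  freq        = c(50, 8, 20, 12, 30),
  trait_freq  = c(10, 4, 2, 6, 3),
  stringsAsFactors = FALSE
)
toy$trait_ratio <- toy$trait_freq / toy$freq  # assumed definition

# Hypothetical helper (not part of fungarium): applies threshold filters
# first, then rank-based ("high"/"low") selections.
apply_filter_sketch <- function(data, filter) {
  is_rank <- grepl("^(high|low)-[0-9]+-", filter)

  # 1. Inequality filters, e.g. "freq>=10"
  for (f in filter[!is_rank]) {
    m <- regmatches(f, regexec("^([A-Za-z_.]+)(>=|<=|>|<)([0-9.]+)$", f))[[1]]
    if (length(m) != 4) stop("Unrecognised filter: ", f)
    keep <- match.fun(m[3])(data[[m[2]]], as.numeric(m[4]))
    data <- data[keep, , drop = FALSE]
  }

  # 2. Rank-based selections, e.g. "high-150-trait_ratio"
  for (f in filter[is_rank]) {
    parts <- strsplit(f, "-", fixed = TRUE)[[1]]
    n <- as.integer(parts[2])
    variable <- paste(parts[-(1:2)], collapse = "-")
    ord <- order(data[[variable]], decreasing = parts[1] == "high")
    data <- data[head(ord, n), , drop = FALSE]
  }
  data
}

# Drop taxa with fewer than 10 records, then keep the 2 highest ratios
kept <- apply_filter_sketch(toy, c("freq>=10", "high-2-trait_ratio"))
kept$new_species
```

Because the threshold is applied first, sp_b (8 records) is removed before ranking even though its trait_ratio is among the highest in the toy table.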
Cladogram produced using ggtree::ggtree; thus, all ggtree arguments are accepted (e.g., geom_tiplab, geom_tippoint, etc.). Additional ggplot layers (e.g., geom_bar, geom_point, etc.) can be added to the tree as if it were a ggplot object.
Returns a ggtree object.
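Before calling the function, it can help to confirm that every variable named in the nesting formula is present in the enrichment table. The following base R sketch is one way to do that; the toy data frame is invented, and the lineage strings it builds are only an illustration of what the nesting means, not what trait_clado does internally.

```r
# Nesting formula, from highest to lowest rank
f <- ~new_order/new_family/new_genus/new_species

# Toy table with invented taxa (hypothetical values)
toy <- data.frame(
  new_order   = c("Agaricales", "Agaricales"),
  new_family  = c("Strophariaceae", "Psathyrellaceae"),
  new_genus   = c("Pholiota", "Psathyrella"),
  new_species = c("Pholiota sp1", "Psathyrella sp1"),
  stringsAsFactors = FALSE
)

# Variables named in the formula, in nesting order
ranks <- all.vars(f)

# Any rank missing from the table would need fixing before plotting
missing_ranks <- setdiff(ranks, names(toy))

# One lineage string per row, following the nesting order
lineage <- do.call(paste, c(toy[ranks], sep = "/"))
lineage
```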
Choose filter and node_all values carefully. These values control how enrichment values are displayed for higher level taxa (i.e., nodes). If a filter condition is used and node_all is set to TRUE, node enrichment factors will be calculated using all lowest rank taxa in the input data set, prior to filtering (i.e., sum of trait records/sum of total records for all lowest rank taxa in these higher level groups). If node_all is FALSE, only the lowest rank taxa that remain after filtering (which are the taxa that will be plotted as tree tips) will be used to calculate node enrichment factors. The former method is likely to give a more accurate representation of trait enrichment for higher level taxa, especially if your filtering parameters remove a large number of lowest rank taxa.
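The effect of node_all and node_calc can be seen on a toy table. The sketch below is a hedged base R illustration, not the package's implementation: node_value is a hypothetical helper, the numbers are invented, and trait_ratio is assumed to equal trait_freq / freq.

```r
# Toy data: two genera with species-level records (values invented)
toy <- data.frame(
  new_genus   = c("G1", "G1", "G1", "G2", "G2"),
  new_species = c("G1 a", "G1 b", "G1 c", "G2 a", "G2 b"),
  freq        = c(100, 10, 40, 20, 80),
  trait_freq  = c(10, 5, 4, 10, 8),
  stringsAsFactors = FALSE
)
toy$trait_ratio <- toy$trait_freq / toy$freq  # assumed definition

# Hypothetical helper: enrichment value per genus-level node
node_value <- function(data, node_calc = c("mean", "add")) {
  node_calc <- match.arg(node_calc)
  if (node_calc == "mean") {
    # mean of the tip-level enrichment values
    tapply(data$trait_ratio, data$new_genus, mean)
  } else {
    # sum of trait records divided by sum of all records
    tapply(data$trait_freq, data$new_genus, sum) /
      tapply(data$freq, data$new_genus, sum)
  }
}

# Tips that would remain after a threshold filter on trait_ratio
filtered <- toy[toy$trait_ratio >= 0.2, ]

all_add      <- node_value(toy, "add")       # analogous to node_all = TRUE
filtered_add <- node_value(filtered, "add")  # analogous to node_all = FALSE
all_mean     <- node_value(toy, "mean")
```

In this toy case the filter keeps only the most enriched species in each genus, so the node values computed from the filtered tips are much higher than those computed from all species.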
Note that it may be useful to do some data set filtering prior to calling this function. For example, it may be useful to filter out taxa with high collector bias (i.e., high values for the max_bias and max_bias_t metrics; see fungarium::enrichment) prior to using trait_clado, so that these biased taxa are not used in the calculations of enrichment for higher level taxa.
Hunter J. Simpson & Jonathan S. Schilling (2021) Using aggregated field collection data and the novel r package fungarium to investigate fungal fire association, Mycologia, 113:4, 842-855, DOI: 10.1080/00275514.2021.1884816
library(fungarium)
library(ggtree)
library(ggplot2)
# load sample enrichment data set
data(agaricales_enrich)

# filter out taxa with high collector bias (optional)
agaricales_enrich <- agaricales_enrich[agaricales_enrich$max_bias <= 0.75, ]
agaricales_enrich <- agaricales_enrich[agaricales_enrich$max_bias_t <= 0.75, ]

# filter out taxa with low total records ("freq")
agaricales_enrich <- agaricales_enrich[agaricales_enrich$freq >= 3, ]

# make cladogram
trait_clado(data = agaricales_enrich, continuous = "color",
            ladderize = TRUE, layout = "circular", size = 0.8,
            formula = ~new_order/new_family/new_genus/new_species,
            filter = "high-300-trait_ratio", node_all = TRUE) +
  geom_tiplab2(color = "black", hjust = 0, offset = 0.1,
               size = 1.4, fontface = "italic") + # add species labels
  geom_tippoint(shape = 20,
                aes(color = trait_ratio, size = trait_freq),
                alpha = 0.75) + # add tree tips
  scale_color_gradientn(colours = c("cyan", "blue", "purple", "red", "orange"),
                        name = "Fire-associated records enrichment",
                        limits = c(0, round(max(agaricales_enrich$trait_ratio), 2)),
                        guide = guide_colourbar(label.vjust = 0.6,
                                                label.theme = element_text(size = 10,
                                                                           colour = "black",
                                                                           angle = 0),
                                                title.position = "top",
                                                nbin = 100,
                                                draw.ulim = FALSE,
                                                draw.llim = FALSE,
                                                barwidth = 15,
                                                barheight = 0.5)) +
  scale_size(name = "Fire-associated records",
             guide = guide_legend(keywidth = 2,
                                  keyheight = 1,
                                  label.position = "bottom",
                                  label.vjust = 0.6,
                                  label.theme = element_text(size = 10,
                                                             colour = "black",
                                                             angle = 0),
                                  title.position = "top")) +
  theme(plot.margin = margin(0, 0, 0, 0),
        legend.title = element_text(size = 10, margin = margin(0, 0, 0, 0)),
        legend.title.align = 0.5,
        legend.position = "bottom",
        legend.justification = "center",
        legend.margin = margin(0, 0, 0, 0),
        plot.title = element_text(hjust = 0.5, margin = margin(0, 0, 0, 0))) +
  xlim(c(-1, 4))