knitr::opts_chunk$set( collapse = TRUE, comment = "#>", eval = FALSE ) library(taxodist)
The taxodist distance matrix is a standard dist object and integrates
directly with the R ecosystem for multivariate analysis. This vignette shows
how to use it alongside vegan and ape for ecological and phylogenetic
analyses. All examples use a set of threatened mammals from Brazil spanning
five broad taxonomic groups.
taxa_brazil <- c( "Priodontes", "Myrmecophaga", "Chrysocyon", "Tapirus", "Didelphis", "Leontopithecus", "Brachyteles", "Panthera", "Pteronura", "Puma", "Sotalia", "Pontoporia", "Trichechus", "Mazama", "Blastocerus" )
The dataset covers xenarthrans (Priodontes, Myrmecophaga), a marsupial
(Didelphis), ungulates (Tapirus, Mazama, Blastocerus), primates
(Leontopithecus, Brachyteles), carnivores (Chrysocyon, Panthera,
Pteronura, Puma), cetaceans (Sotalia, Pontoporia), and a sirenian
(Trichechus).
The pairwise distance matrix is computed with a single call. Lineages are cached after the first retrieval, so subsequent analyses on the same taxa are instantaneous.
library(taxodist) mat <- distance_matrix(taxa_brazil) print(mat) #> Priodontes Myrmecophaga Chrysocyon Tapirus Didelphis #> Myrmecophaga 0.01639344 #> Chrysocyon 0.01694915 0.01694915 #> Tapirus 0.01694915 0.01694915 0.01587302 #> Didelphis 0.01754386 0.01754386 0.01754386 0.01754386 #> Leontopithecus 0.01694915 0.01694915 0.01666667 0.01666667 0.01754386 #> Brachyteles 0.01694915 0.01694915 0.01666667 0.01666667 0.01754386 #> Panthera 0.01694915 0.01694915 0.01492537 0.01587302 0.01754386 #> Pteronura 0.01694915 0.01694915 0.01470588 0.01587302 0.01754386 #> Puma 0.01694915 0.01694915 0.01492537 0.01587302 0.01754386 #> Sotalia 0.01694915 0.01694915 0.01587302 0.01562500 0.01754386 #> Pontoporia 0.01694915 0.01694915 0.01587302 0.01562500 0.01754386 #> Trichechus 0.01666667 0.01666667 0.01694915 0.01694915 0.01754386 #> Mazama 0.01694915 0.01694915 0.01587302 0.01562500 0.01754386 #> Blastocerus 0.01694915 0.01694915 0.01587302 0.01562500 0.01754386
Didelphis shows the largest distances to all other taxa (0.01754),
consistent with the early divergence of marsupials from placental mammals.
Carnivores (Panthera, Pteronura, Puma) show smaller distances among
themselves, reflecting a more recent common ancestor within Carnivora.
apetaxo_cluster() performs hierarchical clustering on the distance matrix and
returns an object that ape can convert directly to a phylo tree for
publication-quality visualisation.
library(ape) cl <- taxo_cluster(taxa_brazil) tree <- ape::as.phylo(cl$hclust) plot(tree, main = "Threatened Mammals of Brazil", cex = 0.85, tip.color = "gray20")
The phylo object is fully compatible with all ape tree manipulation and
plotting functions, including nodelabels(), edgelabels(), and export to
Newick format via write.tree().
ape::write.tree(tree, file = "taxa_brazil.nwk")
vegantaxondive)vegan::taxondive() computes a family of taxonomic diversity indices
(Clarke and Warwick 1998, 2001) from a community matrix and a taxonomic
distance matrix. These indices capture not just species richness but how
distantly related the species in a community are: a community of closely
related species scores lower than one with the same richness but spanning
multiple orders.
The function expects a community matrix (sites x species) and the dist
object from distance_matrix(). Here we define three hypothetical
communities, each dominated by one broad taxonomic group.
set.seed(123) library(vegan) comm <- matrix(c( 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, # xenarthrans + marsupial 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, # primates + carnivores 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1 # cetaceans + sirenian + ungulates ), nrow = 3, byrow = TRUE) colnames(comm) <- taxa_brazil rownames(comm) <- c("community_A", "community_B", "community_C") vegan::taxondive(comm, mat) #> Species Delta Delta* Lambda+ Delta+ S Delta+ #> community_A 3.0000e+00 1.7160e-02 1.7160e-02 2.9410e-07 1.7160e-02 0.0515 #> community_B 5.0000e+00 1.5905e-02 1.5905e-02 8.8325e-07 1.5905e-02 0.0795 #> community_C 5.0000e+00 1.5750e-02 1.5750e-02 1.1834e-06 1.5750e-02 0.0788 #> Expected -9.9265e-02 1.6544e-02 1.6457e-02
The key indices are:
Delta ($\Delta$): mean taxonomic distance between all pairs of species in the community.
Delta ($\Delta^$): mean pairwise distance excluding self-comparisons; comparable across communities of different sizes.
Delta+ ($\Delta^+$): mean pairwise distance weighted by species abundances. For presence-absence data this equals $\Delta^*$.
Lambda+ ($\Lambda^+$): variance in taxonomic distances, measuring how evenly a community spans the taxonomic tree.
community_A, dominated by xenarthrans and a marsupial, scores the highest
$\Delta$ (0.01673), reflecting deeper divergences among its members.
community_B (primates and carnivores) scores the lowest (0.01591), as its
members share more recent common ancestors within Placentalia.
veganThe Mantel test assesses the correlation between two distance matrices by permutation. A typical application is testing whether taxonomic distance correlates with geographic distance, a signature of phylogeographic structure.
Here we use simulated coordinates for illustration. In practice, replace
coords with observed latitude and longitude values.
set.seed(42) coords <- matrix(rnorm(30), ncol = 2) rownames(coords) <- taxa_brazil geo_dist <- dist(coords) vegan::mantel(mat, geo_dist) #> Mantel statistic based on Pearson's product-moment correlation #> #> Call: #> vegan::mantel(xdis = mat, ydis = geo_dist) #> #> Mantel statistic r: -0.0569 #> Significance: 0.653 #> #> Upper quantiles of permutations (null model): #> 90% 95% 97.5% 99% #> 0.188 0.237 0.272 0.336 #> Permutation: free #> Number of permutations: 999
The non-significant result (r = -0.27, p = 0.972) is expected with random
coordinates. For non-normal distance distributions, "spearman" is more
robust than the default "pearson".
vegan::mantel(mat, geo_dist, method = "spearman", permutations = 9999) #> Mantel statistic based on Spearman's rank correlation rho #> #> Call: #> vegan::mantel(xdis = mat, ydis = geo_dist, method = "spearman", #> permutations = 9999) #> #> Mantel statistic r: -0.07405 #> Significance: 0.672 #> #> Upper quantiles of permutations (null model): #> 90% 95% 97.5% 99% #> 0.189 0.244 0.293 0.353 #> Permutation: free #> Number of permutations: 9999
veganPERMANOVA (Anderson 2001) tests whether groups of taxa differ significantly in taxonomic distance space. It partitions total variance in the distance matrix into within-group and between-group components and assesses significance by permutation, with no assumption of multivariate normality.
groups <- c( "xenarthra", "xenarthra", "carnivore", "ungulate", "marsupial", "primate", "primate", "carnivore", "carnivore", "carnivore", "cetacean", "cetacean", "sirenian", "ungulate", "ungulate" ) vegan::adonis2(mat ~ groups) #> Permutation test for adonis under reduced model #> Permutation: free #> Number of permutations: 999 #> #> vegan::adonis2(formula = mat ~ groups) #> Df SumOfSqs R2 F Pr(>F) #> Model 6 0.00100049 0.52646 1.4823 0.001 *** #> Residual 8 0.00089992 0.47354 #> Total 14 0.00190041 1.00000 #> --- #> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The taxonomic grouping explains $49.6\%$ of the total variance in the distance matrix ($\text{R}^2 = 0.496$, $\text{p} = 0.001$). This confirms that the broad mammalian orders occupy distinct regions of taxonomic distance space, consistent with their deep evolutionary divergences.
For pairwise comparisons between groups after a significant PERMANOVA,
see pairwise.adonis2() in the pairwiseAdonis package.
The following script chains all steps from taxon names to multivariate analysis.
library(taxodist) library(vegan) library(ape) # 1. define taxa taxa_brazil <- c( "Priodontes", "Myrmecophaga", "Chrysocyon", "Tapirus", "Didelphis", "Leontopithecus", "Brachyteles", "Panthera", "Pteronura", "Puma", "Sotalia", "Pontoporia", "Trichechus", "Mazama", "Blastocerus" ) # 2. compute distance matrix mat <- distance_matrix(taxa_brazil) # 3. hierarchical clustering and phylo plot cl <- taxo_cluster(taxa_brazil) tree <- ape::as.phylo(cl$hclust) plot(tree, main = "Threatened Mammals of Brazil", cex = 0.85) # 4. PCoA ordination ord <- taxo_ordinate(mat) plot(ord, main = "PCoA: Taxonomic Distance Space") # 5. PERMANOVA groups <- c( "xenarthra", "xenarthra", "carnivore", "ungulate", "marsupial", "primate", "primate", "carnivore", "carnivore", "carnivore", "cetacean", "cetacean", "sirenian", "ungulate", "ungulate" ) vegan::adonis2(mat ~ groups)
Anderson, M.J. (2001). A new method for non-parametric multivariate analysis of variance. Austral Ecology, 26, 32--46.
Clarke, K.R. and Warwick, R.M. (1998). A taxonomic distinctness index and its statistical properties. Journal of Applied Ecology, 35, 523--531.
Clarke, K.R. and Warwick, R.M. (2001). A further biodiversity index applicable to species lists: variation in taxonomic distinctness. Marine Ecology Progress Series, 216, 265--278.
Brands, S.J. (1989 onwards). Systema Naturae 2000. Amsterdam, The Netherlands. Retrieved from The Taxonomicon, http://taxonomicon.taxonomy.nl.
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.