This is an example of typical disparity analysis that can be performed in ecology.
For this example, we will use the famous iris
inbuilt data set
data(iris)
This data contains petal and sepal length for 150 individual plants sorted into three species.
## Separating the species species <- iris[,5] ## Which species? unique(species) ## Separating the petal/sepal length measurements <- iris[,1:4] head(measurements)
We can then ordinate the data using a PCA (prcomp
function) thus defining our four dimensional space as the poetically named petal-space.
## Ordinating the data ordination <- prcomp(measurements) ## The petal-space petal_space <- ordination$x ## Adding the elements names to the petal-space (the individuals IDs) rownames(petal_space) <- 1:nrow(petal_space)
A classical way to represent this ordinated data would be to use two dimensional plots to look at how the different species are distributed in the petal-space.
## Measuring the variance on each axis axis_variances <- apply(petal_space, 2, var) axis_variances <- axis_variances/sum(axis_variances) ## Graphical option par(bty = "n") ## A classic 2D ordination plot plot(petal_space[, 1], petal_space[, 2], col = species, xlab = paste0("PC 1 (", round(axis_variances[1], 2), ")"), ylab = paste0("PC 2 (", round(axis_variances[2], 2), ")"))
This shows the distribution of the different species in the petal-space along the two first axis of variation. This is a pretty standard way to visualise the multidimensional space and further analysis might be necessary to test wether the groups are different such as a linear discriminant analysis (LDA). However, in this case we are ignoring the two other dimensions of the ordination! If we look at the two other axis we see a totally different result:
## Plotting the two second axis of the petal-space plot(petal_space[, 3], petal_space[, 4], col = species, xlab = paste0("PC 3 (", round(axis_variances[3], 2), ")"), ylab = paste0("PC 4 (", round(axis_variances[4], 2), ")"))
Additionally, these two represented dimensions do not represent a biological reality per se; i.e. the values on the first dimension do not represent a continuous trait (e.g. petal length), instead they just represent the ordinations of correlations between the data and some factors.
Therefore, we might want to approach this problem without getting stuck in only two dimensions and consider the whole dataset as a n-dimensional object.
dispRity
The first step is to create different subsets that represent subsets of the ordinated space (i.e. sub-regions within the n-dimensional object). Each of these subsets will contain only the individuals of a specific species.
## Creating the table that contain the elements and their attributes petal_subsets <- custom.subsets(petal_space, group = list( "setosa" = which(species == "setosa"), "versicolor" = which(species == "versicolor"), "virginica" = which(species == "virginica"))) ## Visualising the dispRity object content petal_subsets
This created a dispRity
object (more about that here) with three subsets corresponding to each subspecies.
We can the bootstrap the subsets to be able test the robustness of the measured disparity to outliers.
We can do that using the default options of boot.matrix
(more about that here):
## Bootstrapping the data (petal_bootstrapped <- boot.matrix(petal_subsets))
Disparity can be calculated in many ways, therefore the dispRity
function allows users to define their own measure of disparity.
For more details on measuring disparity, see the dispRity metrics section.
In this example, we are going to define disparity as the median distance between the different individuals and the centroid of the ordinated space.
High values of disparity will indicate a generally high spread of points from this centroid (i.e. on average, the individuals are far apart in the ordinated space).
We can define the metrics easily in the dispRity
function by feeding them to the metric
argument.
Here we are going to feed the functions stats::median
and dispRity::centroids
which calculates distances between elements and their centroid.
## Calculating disparity as the median distance between each elements and ## the centroid of the petal-space (petal_disparity <- dispRity(petal_bootstrapped, metric = c(median, centroids)))
Similarly to the custom.subsets
and boot.matrix
function, dispRity
displays a dispRity
object.
But we are definitely more interested in actually look at the calculated values.
First we can summarise the data in a table by simply using summary
:
## Displaying the summary of the calculated disparity summary(petal_disparity)
We can also plot the results in a similar way:
## Graphical options par(bty = "n") ## Plotting the disparity in the petal_space plot(petal_disparity)
Now contrary to simply plotting the two first axis of the PCA where we saw that the species have a different position in the two first petal-space, we can now also see that they occupy this space clearly differently!
Finally we can test our hypothesis that we guessed from the disparity plot (that some groups occupy different volume of the petal-space) by using the test.dispRity
option.
## Running a PERMANOVA test.dispRity(petal_disparity, test = adonis.dispRity) ## Post-hoc testing of the differences between species (corrected for multiple tests) test.dispRity(petal_disparity, test = t.test, correction = "bonferroni")
We can now see that there is a significant difference in petal-space occupancy between all species of iris.
One other series of test can be done on the shape of the petal-space. Using a MCMC permutation test we can simulate a petal-space with specific properties and see if our observed petal-space matches these properties (similarly to @diaz2016global):
## Testing against a uniform distribution disparity_uniform <- null.test(petal_disparity, replicates = 200, null.distrib = runif, scale = FALSE) plot(disparity_uniform)
## Testing against a normal distribution disparity_normal <- null.test(petal_disparity, replicates = 200, null.distrib = rnorm, scale = TRUE) plot(disparity_normal)
In both cases we can see that our petal-space is not entirely normal or uniform. This is expected because of the simplicity of these parameters.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.