In TGuillerme/dispRity: Measuring Disparity

dispRity ecology demo

This is an example of typical disparity analysis that can be performed in ecology.

Data

For this example, we will use the famous iris inbuilt data set

data(iris)

This data contains petal and sepal length for 150 individual plants sorted into three species.

## Separating the species
species <- iris[,5]
## Which species?
unique(species)

## Separating the petal/sepal length
measurements <- iris[,1:4]
head(measurements)

We can then ordinate the data using a PCA (prcomp function) thus defining our four dimensional space as the poetically named petal-space.

## Ordinating the data
ordination <- prcomp(measurements)

## The petal-space
petal_space <- ordination$x

## Adding the elements names to the petal-space (the individuals IDs)
rownames(petal_space) <- 1:nrow(petal_space)

Classic analysis

A classical way to represent this ordinated data would be to use two dimensional plots to look at how the different species are distributed in the petal-space.

## Measuring the variance on each axis
axis_variances <- apply(petal_space, 2, var)
axis_variances <- axis_variances/sum(axis_variances)

## Graphical option
par(bty = "n")

## A classic 2D ordination plot
plot(petal_space[, 1], petal_space[, 2], col = species,
    xlab = paste0("PC 1 (", round(axis_variances[1], 2), ")"),
    ylab = paste0("PC 2 (", round(axis_variances[2], 2), ")"))

This shows the distribution of the different species in the petal-space along the two first axis of variation. This is a pretty standard way to visualise the multidimensional space and further analysis might be necessary to test wether the groups are different such as a linear discriminant analysis (LDA). However, in this case we are ignoring the two other dimensions of the ordination! If we look at the two other axis we see a totally different result:

## Plotting the two second axis of the petal-space
plot(petal_space[, 3], petal_space[, 4], col = species,
    xlab = paste0("PC 3 (", round(axis_variances[3], 2), ")"),
    ylab = paste0("PC 4 (", round(axis_variances[4], 2), ")"))

Additionally, these two represented dimensions do not represent a biological reality per se; i.e. the values on the first dimension do not represent a continuous trait (e.g. petal length), instead they just represent the ordinations of correlations between the data and some factors.

Therefore, we might want to approach this problem without getting stuck in only two dimensions and consider the whole dataset as a n-dimensional object.

A multidimensional approach with `dispRity`

The first step is to create different subsets that represent subsets of the ordinated space (i.e. sub-regions within the n-dimensional object). Each of these subsets will contain only the individuals of a specific species.

## Creating the table that contain the elements and their attributes
petal_subsets <- custom.subsets(petal_space, group = list(
                                "setosa" = which(species == "setosa"),
                                "versicolor" = which(species == "versicolor"),
                                "virginica" = which(species == "virginica")))

## Visualising the dispRity object content
petal_subsets

This created a dispRity object (more about that here) with three subsets corresponding to each subspecies.

Bootstrapping the data

We can the bootstrap the subsets to be able test the robustness of the measured disparity to outliers. We can do that using the default options of boot.matrix (more about that here):

## Bootstrapping the data
(petal_bootstrapped <- boot.matrix(petal_subsets))

Calculating disparity

Disparity can be calculated in many ways, therefore the dispRity function allows users to define their own measure of disparity. For more details on measuring disparity, see the dispRity metrics section.

In this example, we are going to define disparity as the median distance between the different individuals and the centroid of the ordinated space. High values of disparity will indicate a generally high spread of points from this centroid (i.e. on average, the individuals are far apart in the ordinated space). We can define the metrics easily in the dispRity function by feeding them to the metric argument. Here we are going to feed the functions stats::median and dispRity::centroids which calculates distances between elements and their centroid.

## Calculating disparity as the median distance between each elements and
## the centroid of the petal-space
(petal_disparity <- dispRity(petal_bootstrapped, metric = c(median, centroids)))

Summarising the results (plot)

Similarly to the custom.subsets and boot.matrix function, dispRity displays a dispRity object. But we are definitely more interested in actually look at the calculated values.

First we can summarise the data in a table by simply using summary:

## Displaying the summary of the calculated disparity
summary(petal_disparity)

We can also plot the results in a similar way:

## Graphical options
par(bty = "n")

## Plotting the disparity in the petal_space
plot(petal_disparity)

Now contrary to simply plotting the two first axis of the PCA where we saw that the species have a different position in the two first petal-space, we can now also see that they occupy this space clearly differently!

Testing hypothesis

Finally we can test our hypothesis that we guessed from the disparity plot (that some groups occupy different volume of the petal-space) by using the test.dispRity option.

## Running a PERMANOVA
test.dispRity(petal_disparity, test = adonis.dispRity)

## Post-hoc testing of the differences between species (corrected for multiple tests)
test.dispRity(petal_disparity, test = t.test, correction = "bonferroni")

We can now see that there is a significant difference in petal-space occupancy between all species of iris.

Setting up a multidimensional null-hypothesis

One other series of test can be done on the shape of the petal-space. Using a MCMC permutation test we can simulate a petal-space with specific properties and see if our observed petal-space matches these properties (similarly to @diaz2016global):

## Testing against a uniform distribution
disparity_uniform <- null.test(petal_disparity, replicates = 200,
    null.distrib = runif, scale = FALSE)
plot(disparity_uniform)

## Testing against a normal distribution
disparity_normal <- null.test(petal_disparity, replicates = 200,
    null.distrib = rnorm, scale = TRUE)
plot(disparity_normal)

In both cases we can see that our petal-space is not entirely normal or uniform. This is expected because of the simplicity of these parameters.

TGuillerme/dispRity documentation built on June 10, 2025, 9:35 a.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

Tweet to @rdrrHQ

GitHub issue tracker

ian@mutexlabs.com