# dispRity.metric: Disparity metrics In dispRity: Measuring Disparity

 dispRity.metric R Documentation

## Disparity metrics

### Description

Different implemented disparity metrics.

### Usage

``````dimension.level3.fun(matrix, ...)
dimension.level2.fun(matrix, ...)
dimension.level1.fun(matrix, ...)
between.groups.fun(matrix, matrix2, ...)
``````

### Arguments

 `matrix` A matrix. `...` Optional arguments to be passed to the function. Usual optional arguments are `method` for specifying the method for calculating distance passed to `vegdist` (e.g. `method = "euclidean"` - default - or `method = "manhattan"`) or `k.root` to scale the result using the `kth` root. See details below for available optional arguments for each function. `matrix2` Optional, a second matrix for metrics between groups.

### Details

These are inbuilt functions for calculating disparity. See `make.metric` for details on `dimension.level3.fun`, `dimension.level2.fun`, `dimension.level1.fun` and `between.groups.fun`. The dimensions levels (1, 2 and 3) can be seen as similar to ranks in linear algebra.

The currently implemented dimension-level 1 metrics are:

• `convhull.volume`: calculates the convex hull hypervolume of a matrix (calls `convhulln(x, options = "FA")\$vol`).

• Both `convhull` functions call the `convhulln` function with the `"FA"` option (computes total area and volume).

• WARNING: both `convhull` functions can be computationally intensive above 10 dimensions!

• `convhull.surface`: calculates the convex hull hypersurface of a matrix (calls `convhulln(x, options = "FA")\$area`).

• `diagonal`: calculates the longest distance in the ordinated space.

• WARNING: This function is the generalisation of Pythagoras' theorem and thus works only if each dimensions are orthogonal to each other.

• `ellipsoid.volume`: calculates the ellipsoid volume of a matrix. This function tries to determine the nature of the input matrix and uses one of these following methods to calculate the volume. You can always specify the method using `method = "my_choice"` to overrun the automatic method choice.

• `"eigen"`: this method directly calculates the eigen values from the input matrix (using `eigen`). This method is automatically selected if the input matrix is "distance like" (i.e. square with two mirrored triangles and a diagonal).

• `"pca"`: this method calculates the eigen values as the sum of the variances of the matrix (`abs(apply(var(matrix),2, sum))`). This is automatically selected if the input matrix is NOT "distance like". Note that this method is faster than `"eigen"` but only works if the input matrix is an ordinated matrix from a PCA, PCO, PCoA, NMDS or MDS.

• `"axes"`: this method calculates the actual semi axes length using the input matrix. It is never automatically selected. By default this method calculates the length of the major axes based on the 0.95 confidence interval ellipse but this can be modified by providing additional arguments from `axis.covar`.

• `<a numeric vector>`: finally, you can directly provide a numeric vector of eigen values. This method is never automatically selected and overrides any other options.

• `func.div`: The functional divergence (Villeger et al. 2008): the ratio of deviation from the centroid (this is similar to `FD::dbFD()\$FDiv`).

• `func.eve`: The functional evenness (Villeger et al. 2008): the minimal spanning tree distances evenness (this is similar to `FD::dbFD()\$FEve`). If the matrix used is not a distance matrix, the distance method can be passed using, for example `method = "euclidean"` (default).

• `mode.val`: calculates the modal value of a vector.

• `n.ball.volume`: calculate the volume of the minimum n-ball (if `sphere = TRUE`) or of the ellipsoid (if `sphere = FALSE`).

• `roundness`: calculate the roundness of an elliptical representation of a variance-covariance matrix as the integral of the ranked distribution of the major axes. A value of 1 indicates a sphere, a value between 1 and 0.5 indicates a more pancake like representation and a value between 0.5 and 0 a more cigar like representation. You can force the variance-covariance calculation by using the option `vcv = TRUE` (default) that will calculate the variance-covariance matrix if the input is not one.

See also `mean`, `median`, `sum` or `prod` for commonly used summary metrics.

The currently implemented dimension-level 2 metrics are:

• `ancestral.dist`: calculates the distance between each elements coordinates in the matrix and their ancestors' coordinates (if `to.root = FALSE`; default) or to the root coordinates (if `to.root = TRUE`) for a given `tree`. The distance is calculate as Euclidean by default but can be changed through the `methods` argument (`method = "euclidean"`; default). Note that the matrix must contain data for both tips and nodes in the `tree`, otherwise you must provide a matrix to the argument `reference.data` that contains them. Note that if the function is used in `dispRity`, both the `tree` and `reference.data` can be automatically recycled from the `dispRity` object (if present).

• `angles`: calculates the angles of the main axis of variation per dimension in a `matrix`. The angles are calculated using the least square algorithm from the `lm` function. The unit of the angle can be changed through the `unit` argument (either `"degree"` (default), `radian` or `slope`) and a base angle to measure the angle from can be passed through the `base` argument (by default `base = 0`, measuring the angle from the horizontal line (note that the `base` argument has to be passed in the same unit as `unit`). When estimating the slope through `lm`, you can use the option `significant` to only consider significant slopes (`TRUE`) or not (`FALSE` - default).

• `centroids`: calculates the distance between each row and the centroid of the matrix (Laliberte 2010). This function can take an optional arguments `centroid` for defining the centroid (if missing (default), the centroid of the matrix is used). This argument can be either a subset of coordinates matching the matrix's dimensions (e.g. `c(0, 1, 2)` for a matrix with three columns) or a single value to be the coordinates of the centroid (e.g. `centroid = 0` will set the centroid coordinates to `c(0, 0, 0)` for a three dimensional matrix). NOTE: distance is calculated as `"euclidean"` by default, this can be changed using the `method` argument.

• `deviations`: calculates the minimal Euclidean distance between each element in and the hyperplane (or line if 2D, or a plane if 3D). You can specify equation of hyperplane of d dimensions in the `intercept + ax + by + ... + nd = 0` format. For example the line `y = 3x + 1` should be entered as `c(1, 3, -1)` or the plane `x + 2y - 3z = 44` as `c(44, 1, 2, -3)`. If missing the `hyperplane` (default) is calculated using a least square regression using a gaussian `glm`. Extra arguments can be passed to `glm` through `...`. When estimating the hyperplane, you can use the option `significant` to only consider significant slopes (`TRUE`) or not (`FALSE` - default).

• `displacements`: calculates the ratio between the distance to the centroid (see `centroids` above) and the distance from a reference (by default the origin of the space). The reference can be changed through the `reference` argument. NOTE: distance is calculated as `"euclidean"` by default, this can be changed using the `method` argument.

• `edge.length.tree`: calculates the edge length from a given tree for each elements present in the matrix. Each edge length is either measured between the element and the root of the tree (`to.root = TRUE` ; default) or between the element and its last ancestor (`to.root = FALSE`))

• `neighbours`: calculates the distance to a neighbour (Foote 1990). By default this is the distance to the nearest neighbour (`which = min`) but can be set to any dimension level - 1 function (e.g. `which = mean` gives the distance to the most average neighbour). NOTE: distance is calculated as `"euclidean"` by default, this can be changed using the `method` argument.

• `pairwise.dist`: calculates the pairwise distance between elements - calls `vegdist(matrix, method = method, diag = FALSE, upper = FALSE, ...)`. The distance type can be changed via the `method` argument (see `vegdist` - default: `method = "euclidean"`). This function outputs a vector of pairwise comparisons in the following order: d(A,B), d(A,C), d(B,C) for three elements A, B and C. NOTE: distance is calculated as `"euclidean"` by default, this can be changed using the `method` argument.

• `projections`: projects each element on a vector defined as (`point1`, `point2`) and measures some aspect of this projection. The different aspects that can be measured are:

• `measure = "position"` (default), the distance of each element on the vector (`point1`, `point2`). Negative values means the element projects on the opposite direction of the vector (`point1`, `point2`).

• `measure = "distance"`, the euclidean distance of each element from the vector (`point1`, `point2`).

• `measure = "degree"`, the angle between the vector (`point1`, `point2`) and any vector (`point1`, `element`) in degrees.

• `measure = "radian"`, the angle between the vector (`point1`, `point2`) and any vector (`point1`, `element`) in radians.

• `measure = "orthogonality"`, the angle between the vector (`point1`, `point2`) and any vector (`point1`, `element`) expressed in right angle ranging between 0 (non angle) and 1 (right angle).

By default, `point1` is the centre of the space (coordinates `0, 0, 0, ...`) and `point2` is the centroid of the space (coordinates `colMeans(matrix)`). Coordinates for `point1` and `point2` can be given as a single value to be repeated (e.g. `point1 = 1` is translated into `point1 = c(1, 1, ...)`) or a specific set of coordinates. Furthermore, by default, the space is scaled so that the vector (`point1`, `point2`) becomes the unit vector (distance (`point1`, `point2`) is set to 1; option `scale = TRUE`; default). You can use the unit vector of the space using the option `scale = FALSE`. Other options include the centering of the projections on 0.5 (`centre = TRUE`; default is set to `FALSE`) ranging the projection onto the vector (`point1`, `point2`) between -1 and 1 (higher or lower values project beyond the vector); and whether to output the projection values as absolute values (`abs = FALSE`; default is set to `FALSE`). These two last options only affect the results from `measure = "position"`.

• `projections.tree`: calculates the `projections` metric but drawing the vectors from a phylogenetic tree. This metric can intake any argument from `projections` (see above) but for `point1` and `point2` that are replaced by the argument `type`. `type` is a `vector` or a `list` of two elements that designates which vector to draw and can be any pair of the following options (the first element being the origin of the vector and the second where the vector points to):

• `"root"`: the root of the tree (the first element in tree\$node.label);

• `"ancestor"`: the element's most recent ancestor;

• `"tips"`: the centroid of the tips;

• `"nodes"`: the centroid of all the nodes;

• `"livings"`: the centroid of the tips the furthest from the root;

• `"fossils"`: the centroid of all the tips that are not the furthest from the root;

• any numeric values that can be interpreted as `point1` and `point2` in `projections`;

• or a user defined function that with the inputs `matrix` and `tree` and `row` (the element's ID, i.e. the row number in `matrix`).

NOTE: the elements to calculate the origin and end points of the vector are calculated by default on the provided input `matrix` which can be missing data from the tree if used with `custom.subsets` or `chrono.subsets`. You can always provide the full matrix using the option `reference.data = my_matrix`. Additional arguments include any arguments to be passed to `projections` (e.g. `centre` or `abs`).

• `quantiles`: calculates the quantile range of each axis of the matrix. The quantile can be changed using the `quantile` argument (default is `quantile = 95`, i.e. calculating the range on each axis that includes 95% of the data). An optional argument, `k.root`, can be set to `TRUE` to scale the ranges by using its `kth` root (where `k` are the number of dimensions). By default, `k.root = FALSE`.

• `radius`: calculates a distance from the centre of each axis. The `type` argument is the function to select which distance to calculate. By default `type = max` calculates the maximum distance between the elements and the centre for each axis (i.e. the radius for each dimensions)

• `ranges`: calculates the range of each axis of the matrix (Wills 2001). An optional argument, `k.root`, can be set to `TRUE` to scale the ranges by using its `kth` root (where `k` are the number of dimensions). By default, `k.root = FALSE`.

• `variances`: calculates the variance of each axis of the matrix (Wills 2001). This function can also take the `k.root` optional argument described above.

• `span.tree.length`: calculates the length of the minimum spanning tree (see `spantree`). This function can get slow with big matrices. To speed it up, one can directly use distance matrices as the multidimensional space.

The currently implemented between.groups metrics are:

• `disalignment`: calculates the rejection of a point from `matrix` from the major axis of `matrix2`. Options are, `axis` to choose which major axis to reject from (default is `axis = 1`); `level` for the ellipse' confidence interval (to calculate the axis) (default is `level = 0.95`) and `point.to.reject`, a numeric value for designating which point in `matrix` to use or a function for calculating it (default is `point.to.reject = colMeans` for `matrix`'s centroid).

• `group.dist`: calculates the distance between two groups (by default, this is the minimum euclidean vector norm distance between groups). Negative distances are considered as 0. This function must intake two matrices (`matrix` and `matrix2`) and the quantiles to consider. For the minimum distance between two groups, the 100th quantiles are considered (default: `probs = c(0,1)`) but this can be changed to any values (e.g. distance between the two groups accounting based on the 95th CI: `probs = c(0.025, 0.975)`; distance between centroids: `probs = c(0.5)`, etc...). This function is the linear algebra equivalent of the `hypervolume::hypervolume_distance` function.

• `point.dist`: calculates the distance between `matrix` and a point calculated from `matrix2`. By default, this point is the centroid of `matrix2`. This can be changed by passing a function to be applied to `matrix2` through the `point` argument (for example, for the centroid: `point.dist(..., point = colMeans)`). NOTE: distance is calculated as `"euclidean"` by default, this can be changed using the `method` argument.

• `projections.between`: calculates the projection of the major axis between two matrices. It allows the same arguments as `projections`. This function measures the major axis from both input matrices, centre their origins and projects the end of the vector of `matrix` onto the vector from `matrix2`. Which axis to measure can be changed with the option `axis` (for the major axis, `axis = 1`; default) and the confidence interval can be changed using `level` (for the 95 confidence interval, `level = 0.95`; default - see `axis.covar` for more details).

When used in the `dispRity` function, optional arguments are declared after the `metric` argument: for example `dispRity(data, metric = centroids, centroid = 0, method = "manhattan")`

Thomas Guillerme

### References

Donohue I, Petchey OL, Montoya JM, Jackson AL, McNally L, Viana M, Healy K, Lurgi M, O'Connor NE, Emmerson MC. 2013. On the dimensionality of ecological stability. Ecology letters. 16(4):421-9.

Lalibert'e E, Legendre P. 2010. A distance-based framework for measuring functional diversity from multiple traits. Ecology, 91(1), pp.299-305.

Vill'eger S, Mason NW, Mouillot D. 2008. New multidimensional functional diversity indices for a multifaceted framework in functional ecology. Ecology. 89(8):2290-301.

Wills MA. 2001. Morphological disparity: a primer. In Fossils, phylogeny, and form (pp. 55-144). Springer, Boston, MA.

Foote, M. 1990. Nearest-neighbor analysis of trilobite morphospace. Systematic Zoology, 39(4), pp.371-382.

`dispRity` and `make.metric`.

### Examples

``````## A random matrix
dummy_matrix <- matrix(rnorm(90), 9, 10)

## ancestral.dist
## A random tree with node labels
rand_tree <- rtree(5) ; rand_tree\$node.label <- paste0("n", 1:4)
## Adding the tip and node names to the matris
rownames(dummy_matrix) <- c(rand_tree\$tip.label, rand_tree\$node.label)
## Calculating the distances to the ancestors
ancestral.dist(dummy_matrix, tree = rand_tree)
## Calculating the manhattan distances to the root
ancestral.dist(dummy_matrix, tree = rand_tree,
to.root = TRUE, method = "manhattan")

## angles
## The angles in degrees of each axis
angles(dummy_matrix)
## The angles in slope from the 1:1 slope (Beta = 1)
angles(dummy_matrix, unit = "slope", base = 1)

## centroids
## Distances between each row and centroid of the matrix
centroids(dummy_matrix)
## Distances between each row and an arbitrary point
centroids(dummy_matrix, centroid = c(1,2,3,4,5,6,7,8,9,10))
## Distances between each row and the origin
centroids(dummy_matrix, centroid = 0)

## convhull.surface
## Making a matrix with more elements than dimensions (for convhull)
thinner_matrix <- matrix(rnorm(90), 18, 5)
## Convex hull hypersurface of a matrix
convhull.surface(thinner_matrix)

## convhull.volume
## Convex hull volume of a matrix
convhull.volume(thinner_matrix)

## deviations
## The deviations from the least square hyperplane
deviations(dummy_matrix)
## The deviations from the plane between the x and y axis
deviations(dummy_matrix, hyperplane = c(0,1,1,0,0,0,0,0,0,0,0))

## diagonal
## Matrix diagonal
diagonal(dummy_matrix) # WARNING: only valid if the dimensions are orthogonal

## disalignment
## Two dummy matrices
matrix_1 <- matrix(rnorm(16), 4, 4)
matrix_2 <- matrix(rnorm(16), 4, 4)
## Measuring the disalignment of matrix_1 from matrix_2
disalignment(matrix_1, matrix_2)
## Same but using the 2nd major axis of the 0.75 CI ellipse
## from matrix_2 and the first point from matrix_1.
disalignment(matrix_1, matrix_2,
axis = 2, level = 0.75,
point.to.reject = 1)

## displacements
## displacement ratios (from the centre)
displacements(dummy_matrix)
## displacement ratios (from an arbitrary point)
displacements(dummy_matrix, reference = c(1,2,3,4,5,6,7,8,9,10))
## displacement ratios from the centre (manhattan distance)
displacements(dummy_matrix, method = "manhattan")

## edge.length.tree
## Making a dummy tree with node labels
dummy_tree <- makeNodeLabel(rtree((nrow(dummy_matrix)/2)+1))
## Naming the elements in the matrix
named_matrix <- dummy_matrix
rownames(named_matrix) <- c(dummy_tree\$tip.label,
dummy_tree\$node.label)
## The total edge length of each element in the matrix (to the root)
edge.length.tree(named_matrix, tree = dummy_tree)

## The edge lengths for each edge leading to the elements in the matrix
edge.length.tree(named_matrix, tree = dummy_tree, to.root = FALSE)

## ellipsoid.volume
## Ellipsoid volume of a matrix
ellipsoid.volume(dummy_matrix)
## Calculating the same volume with provided eigen values
ordination <- prcomp(dummy_matrix)
## Calculating the ellipsoid volume by providing your own eigen values
ellipsoid.volume(ordination\$x, method = ordination\$sdev^2)

## func.div
## Functional divergence
func.div(dummy_matrix)

## func.eve
## Functional evenness
func.eve(dummy_matrix)
## Functional evenness (based on manhattan distances)
func.eve(dummy_matrix, method = "manhattan")

## group.dist
## The distance between groups
dummy_matrix2 <- matrix(runif(40, min = 2, max = 4), 4, 10)
## The minimum distance between both groups
group.dist(dummy_matrix, dummy_matrix2)
## The distance between both groups' centroids
group.dist(dummy_matrix, dummy_matrix2, probs = 0.5)
## The minimum distance between the 50% CI of each group
group.dist(dummy_matrix, dummy_matrix2, probs = c(0.25, 0.75))

## mode.val
## Modal value of a vector
mode.val(dummy_matrix)

## neighbours
## The nearest neighbour euclidean distances
neighbours(dummy_matrix)
## The furthest neighbour manhattan distances
neighbours(dummy_matrix, which = max, method = "manhattan")

## pairwise.dist
## The pairwise distance
pairwise.dist(dummy_matrix)
## The average squared pairwise distance
mean(pairwise.dist(dummy_matrix)^2)
## equal to:
# geiger::disparity(data = dummy_matrix)

## point.dist
## The distances from the rows dummy_matrix
## to the centroids of dummy_matrix2
point.dist(dummy_matrix, dummy_matrix2)
## The average distances from dummy_matrix
## to the centroids of dummy_matrix2
mean(point.dist(dummy_matrix, dummy_matrix2))
## The manhattan distance from the rows dummy_matrix
## to the standard deviation of dummy_matrix2
point.dist(dummy_matrix, dummy_matrix2, point = sd, method = "manhattan")

## projections
## The distances on the vector defined from the centre of
## the matrix to its centroid (default)
projections(dummy_matrix)
## The distances from the vector defined from the third
## element of the matrix to the point of coordinated
## c(1,1,1, ...) the matrix to its centroid (default)
projections(dummy_matrix, measure = "distance",
point1 = dummy_matrix[3, ],
point2 = 1)

## projections.tree
## Making a dummy tree with node labels
dummy_tree <- makeNodeLabel(rtree((nrow(dummy_matrix)/2)+1))
## Naming the elements in the matrix
named_matrix <- dummy_matrix
rownames(named_matrix) <- c(dummy_tree\$tip.label,
dummy_tree\$node.label)
## The projection on the vector defined from the root of
## the tree to the ancestor of each element in the matrix
projections.tree(named_matrix, dummy_tree,
type = c("root", "ancestor"))
## The rejection from the vector defined from the centroid
## of the nodes to the centroids of the tips
projections.tree(named_matrix, dummy_tree,
type = c("nodes", "tips"),
measure = "distance")
## A user function that define coordinates based on the
## centroid of the three first nodes
user.fun <- function(matrix, tree, row = NULL) {
return(colMeans(matrix[tree\$node.label[1:3], ]))
}
## The projection on the vector defined by the coordinates
## 0,0,0 and a user defined function
projections.tree(named_matrix, dummy_tree,
type = c(0, user.fun))

## projections.between
## Two dummy matrices
matrix_1 <- matrix(rnorm(16), 4, 4)
matrix_2 <- matrix(rnorm(16), 4, 4)
## Projecting the major axis of matrix_2 onto the one from matrix_1
projections.between(matrix_1, matrix_2)
## Projecting both second major 0.75 axes
## and getting the rejections (see projections() for option details)
projections.between(matrix_1, matrix_2,
measure = "distance",
axis = 2, level = 0.75)

## quantiles
## The 95 quantiles
quantiles(dummy_matrix)
## The 100 quantiles (which are equal to the ranges)
quantiles(dummy_matrix, quantile = 100) == ranges(dummy_matrix) # All TRUE

## The maximal radius of each axis (maximum distance from centre of each axis)

## ranges
## ranges of each column in a matrix
ranges(dummy_matrix)
## ranges of each column in the matrix corrected using the kth root
ranges(dummy_matrix, k.root = TRUE)

## roundness
## calculating the variance-covariance of the dummy_matrix
vcv <- var(dummy_matrix)
## calculating the roundness of it
roundness(vcv)
## calculating the roundness of the dummy matrix by calculating the vcv
roundness(dummy_matrix, vcv = TRUE)

## span.tree.length
## Minimum spanning tree length (default)
span.tree.length(dummy_matrix)
## Minimum spanning tree length from a distance matrix (faster)
distance <- as.matrix(dist(dummy_matrix))
span.tree.length(distance)
## Minimum spanning tree length based on Manhattan distance
span.tree.length(dummy_matrix, method = "manhattan")
span.tree.length(as.matrix(dist(dummy_matrix, method = "manhattan"))) # Same

## variances
## variances of a each column in the matrix
variances(dummy_matrix)
## variances of a each column in the matrix corrected using the kth root
variances(dummy_matrix, k.root = TRUE)

``````

dispRity documentation built on May 29, 2024, 9:40 a.m.