dispRity.metric: Disparity metrics

dispRity.metricR Documentation

Disparity metrics

Description

Different implemented disparity metrics.

Usage

dimension.level3.fun(matrix, ...)
dimension.level2.fun(matrix, ...)
dimension.level1.fun(matrix, ...)
between.groups.fun(matrix, matrix2, ...)

Arguments

matrix

A matrix.

...

Optional arguments to be passed to the function. Usual optional arguments are method for specifying the method for calculating distance passed to vegdist (e.g. method = "euclidean" - default - or method = "manhattan") or k.root to scale the result using the kth root. See details below for available optional arguments for each function.

matrix2

Optional, a second matrix for metrics between groups.

Details

These are inbuilt functions for calculating disparity. See make.metric for details on dimension.level3.fun, dimension.level2.fun, dimension.level1.fun and between.groups.fun. The dimensions levels (1, 2 and 3) can be seen as similar to ranks in linear algebra.

The currently implemented dimension-level 1 metrics are:

  • convhull.volume: calculates the convex hull hypervolume of a matrix (calls convhulln(x, options = "FA")$vol).

    • Both convhull functions call the convhulln function with the "FA" option (computes total area and volume).

    • WARNING: both convhull functions can be computationally intensive above 10 dimensions!

  • convhull.surface: calculates the convex hull hypersurface of a matrix (calls convhulln(x, options = "FA")$area).

  • diagonal: calculates the longest distance in the ordinated space.

    • WARNING: This function is the generalisation of Pythagoras' theorem and thus works only if each dimensions are orthogonal to each other.

  • ellipsoid.volume: calculates the ellipsoid volume of a matrix. This function tries to determine the nature of the input matrix and uses one of these following methods to calculate the volume. You can always specify the method using method = "my_choice" to overrun the automatic method choice.

    • "eigen": this method directly calculates the eigen values from the input matrix (using eigen). This method is automatically selected if the input matrix is "distance like" (i.e. square with two mirrored triangles and a diagonal).

    • "pca": this method calculates the eigen values as the sum of the variances of the matrix (abs(apply(var(matrix),2, sum))). This is automatically selected if the input matrix is NOT "distance like". Note that this method is faster than "eigen" but only works if the input matrix is an ordinated matrix from a PCA, PCO, PCoA, NMDS or MDS.

    • "axes": this method calculates the actual semi axes length using the input matrix. It is never automatically selected. By default this method calculates the length of the major axes based on the 0.95 confidence interval ellipse but this can be modified by providing additional arguments from axis.covar.

    • <a numeric vector>: finally, you can directly provide a numeric vector of eigen values. This method is never automatically selected and overrides any other options.

  • func.div: The functional divergence (Villeger et al. 2008): the ratio of deviation from the centroid (this is similar to FD::dbFD()$FDiv).

  • func.eve: The functional evenness (Villeger et al. 2008): the minimal spanning tree distances evenness (this is similar to FD::dbFD()$FEve). If the matrix used is not a distance matrix, the distance method can be passed using, for example method = "euclidean" (default).

  • mode.val: calculates the modal value of a vector.

  • n.ball.volume: calculate the volume of the minimum n-ball (if sphere = TRUE) or of the ellipsoid (if sphere = FALSE).

  • roundness: calculate the roundness of an elliptical representation of a variance-covariance matrix as the integral of the ranked distribution of the major axes. A value of 1 indicates a sphere, a value between 1 and 0.5 indicates a more pancake like representation and a value between 0.5 and 0 a more cigar like representation. You can force the variance-covariance calculation by using the option vcv = TRUE (default) that will calculate the variance-covariance matrix if the input is not one.

See also mean, median, sum or prod for commonly used summary metrics.

The currently implemented dimension-level 2 metrics are:

  • ancestral.dist: calculates the distance between each elements coordinates in the matrix and their ancestors' coordinates (if to.root = FALSE; default) or to the root coordinates (if to.root = TRUE) for a given tree. The distance is calculate as Euclidean by default but can be changed through the methods argument (method = "euclidean"; default). Note that the matrix must contain data for both tips and nodes in the tree, otherwise you must provide a matrix to the argument reference.data that contains them. Note that if the function is used in dispRity, both the tree and reference.data can be automatically recycled from the dispRity object (if present).

  • angles: calculates the angles of the main axis of variation per dimension in a matrix. The angles are calculated using the least square algorithm from the lm function. The unit of the angle can be changed through the unit argument (either "degree" (default), radian or slope) and a base angle to measure the angle from can be passed through the base argument (by default base = 0, measuring the angle from the horizontal line (note that the base argument has to be passed in the same unit as unit). When estimating the slope through lm, you can use the option significant to only consider significant slopes (TRUE) or not (FALSE - default).

  • centroids: calculates the distance between each row and the centroid of the matrix (Laliberte 2010). This function can take an optional arguments centroid for defining the centroid (if missing (default), the centroid of the matrix is used). This argument can be either a subset of coordinates matching the matrix's dimensions (e.g. c(0, 1, 2) for a matrix with three columns) or a single value to be the coordinates of the centroid (e.g. centroid = 0 will set the centroid coordinates to c(0, 0, 0) for a three dimensional matrix). NOTE: distance is calculated as "euclidean" by default, this can be changed using the method argument.

  • deviations: calculates the minimal Euclidean distance between each element in and the hyperplane (or line if 2D, or a plane if 3D). You can specify equation of hyperplane of d dimensions in the intercept + ax + by + ... + nd = 0 format. For example the line y = 3x + 1 should be entered as c(1, 3, -1) or the plane x + 2y - 3z = 44 as c(44, 1, 2, -3). If missing the hyperplane (default) is calculated using a least square regression using a gaussian glm. Extra arguments can be passed to glm through .... When estimating the hyperplane, you can use the option significant to only consider significant slopes (TRUE) or not (FALSE - default).

  • displacements: calculates the ratio between the distance to the centroid (see centroids above) and the distance from a reference (by default the origin of the space). The reference can be changed through the reference argument. NOTE: distance is calculated as "euclidean" by default, this can be changed using the method argument.

  • edge.length.tree: calculates the edge length from a given tree for each elements present in the matrix. Each edge length is either measured between the element and the root of the tree (to.root = TRUE ; default) or between the element and its last ancestor (to.root = FALSE))

  • neighbours: calculates the distance to a neighbour (Foote 1990). By default this is the distance to the nearest neighbour (which = min) but can be set to any dimension level - 1 function (e.g. which = mean gives the distance to the most average neighbour). NOTE: distance is calculated as "euclidean" by default, this can be changed using the method argument.

  • pairwise.dist: calculates the pairwise distance between elements - calls vegdist(matrix, method = method, diag = FALSE, upper = FALSE, ...). The distance type can be changed via the method argument (see vegdist - default: method = "euclidean"). This function outputs a vector of pairwise comparisons in the following order: d(A,B), d(A,C), d(B,C) for three elements A, B and C. NOTE: distance is calculated as "euclidean" by default, this can be changed using the method argument.

  • projections: projects each element on a vector defined as (point1, point2) and measures some aspect of this projection. The different aspects that can be measured are:

    • measure = "position" (default), the distance of each element on the vector (point1, point2). Negative values means the element projects on the opposite direction of the vector (point1, point2).

    • measure = "distance", the euclidean distance of each element from the vector (point1, point2).

    • measure = "degree", the angle between the vector (point1, point2) and any vector (point1, element) in degrees.

    • measure = "radian", the angle between the vector (point1, point2) and any vector (point1, element) in radians.

    • measure = "orthogonality", the angle between the vector (point1, point2) and any vector (point1, element) expressed in right angle ranging between 0 (non angle) and 1 (right angle).

    By default, point1 is the centre of the space (coordinates 0, 0, 0, ...) and point2 is the centroid of the space (coordinates colMeans(matrix)). Coordinates for point1 and point2 can be given as a single value to be repeated (e.g. point1 = 1 is translated into point1 = c(1, 1, ...)) or a specific set of coordinates. Furthermore, by default, the space is scaled so that the vector (point1, point2) becomes the unit vector (distance (point1, point2) is set to 1; option scale = TRUE; default). You can use the unit vector of the space using the option scale = FALSE. Other options include the centering of the projections on 0.5 (centre = TRUE; default is set to FALSE) ranging the projection onto the vector (point1, point2) between -1 and 1 (higher or lower values project beyond the vector); and whether to output the projection values as absolute values (abs = FALSE; default is set to FALSE). These two last options only affect the results from measure = "position".

  • projections.tree: calculates the projections metric but drawing the vectors from a phylogenetic tree. This metric can intake any argument from projections (see above) but for point1 and point2 that are replaced by the argument type. type is a vector or a list of two elements that designates which vector to draw and can be any pair of the following options (the first element being the origin of the vector and the second where the vector points to):

    • "root": the root of the tree (the first element in tree$node.label);

    • "ancestor": the element's most recent ancestor;

    • "tips": the centroid of the tips;

    • "nodes": the centroid of all the nodes;

    • "livings": the centroid of the tips the furthest from the root;

    • "fossils": the centroid of all the tips that are not the furthest from the root;

    • any numeric values that can be interpreted as point1 and point2 in projections;

    • or a user defined function that with the inputs matrix and tree and row (the element's ID, i.e. the row number in matrix).

    NOTE: the elements to calculate the origin and end points of the vector are calculated by default on the provided input matrix which can be missing data from the tree if used with custom.subsets or chrono.subsets. You can always provide the full matrix using the option reference.data = my_matrix. Additional arguments include any arguments to be passed to projections (e.g. centre or abs).

  • quantiles: calculates the quantile range of each axis of the matrix. The quantile can be changed using the quantile argument (default is quantile = 95, i.e. calculating the range on each axis that includes 95% of the data). An optional argument, k.root, can be set to TRUE to scale the ranges by using its kth root (where k are the number of dimensions). By default, k.root = FALSE.

  • radius: calculates a distance from the centre of each axis. The type argument is the function to select which distance to calculate. By default type = max calculates the maximum distance between the elements and the centre for each axis (i.e. the radius for each dimensions)

  • ranges: calculates the range of each axis of the matrix (Wills 2001). An optional argument, k.root, can be set to TRUE to scale the ranges by using its kth root (where k are the number of dimensions). By default, k.root = FALSE.

  • variances: calculates the variance of each axis of the matrix (Wills 2001). This function can also take the k.root optional argument described above.

  • span.tree.length: calculates the length of the minimum spanning tree (see spantree). This function can get slow with big matrices. To speed it up, one can directly use distance matrices as the multidimensional space.

The currently implemented between.groups metrics are:

  • disalignment: calculates the rejection of a point from matrix from the major axis of matrix2. Options are, axis to choose which major axis to reject from (default is axis = 1); level for the ellipse' confidence interval (to calculate the axis) (default is level = 0.95) and point.to.reject, a numeric value for designating which point in matrix to use or a function for calculating it (default is point.to.reject = colMeans for matrix's centroid).

  • group.dist: calculates the distance between two groups (by default, this is the minimum euclidean vector norm distance between groups). Negative distances are considered as 0. This function must intake two matrices (matrix and matrix2) and the quantiles to consider. For the minimum distance between two groups, the 100th quantiles are considered (default: probs = c(0,1)) but this can be changed to any values (e.g. distance between the two groups accounting based on the 95th CI: probs = c(0.025, 0.975); distance between centroids: probs = c(0.5), etc...). This function is the linear algebra equivalent of the hypervolume::hypervolume_distance function.

  • point.dist: calculates the distance between matrix and a point calculated from matrix2. By default, this point is the centroid of matrix2. This can be changed by passing a function to be applied to matrix2 through the point argument (for example, for the centroid: point.dist(..., point = colMeans)). NOTE: distance is calculated as "euclidean" by default, this can be changed using the method argument.

  • projections.between: calculates the projection of the major axis between two matrices. It allows the same arguments as projections. This function measures the major axis from both input matrices, centre their origins and projects the end of the vector of matrix onto the vector from matrix2. Which axis to measure can be changed with the option axis (for the major axis, axis = 1; default) and the confidence interval can be changed using level (for the 95 confidence interval, level = 0.95; default - see axis.covar for more details).

When used in the dispRity function, optional arguments are declared after the metric argument: for example dispRity(data, metric = centroids, centroid = 0, method = "manhattan")

Author(s)

Thomas Guillerme

References

Donohue I, Petchey OL, Montoya JM, Jackson AL, McNally L, Viana M, Healy K, Lurgi M, O'Connor NE, Emmerson MC. 2013. On the dimensionality of ecological stability. Ecology letters. 16(4):421-9.

Lalibert'e E, Legendre P. 2010. A distance-based framework for measuring functional diversity from multiple traits. Ecology, 91(1), pp.299-305.

Vill'eger S, Mason NW, Mouillot D. 2008. New multidimensional functional diversity indices for a multifaceted framework in functional ecology. Ecology. 89(8):2290-301.

Wills MA. 2001. Morphological disparity: a primer. In Fossils, phylogeny, and form (pp. 55-144). Springer, Boston, MA.

Foote, M. 1990. Nearest-neighbor analysis of trilobite morphospace. Systematic Zoology, 39(4), pp.371-382.

See Also

dispRity and make.metric.

Examples

## A random matrix
dummy_matrix <- matrix(rnorm(90), 9, 10)

## ancestral.dist
## A random tree with node labels
rand_tree <- rtree(5) ; rand_tree$node.label <- paste0("n", 1:4)
## Adding the tip and node names to the matris
rownames(dummy_matrix) <- c(rand_tree$tip.label, rand_tree$node.label)
## Calculating the distances to the ancestors
ancestral.dist(dummy_matrix, tree = rand_tree)
## Calculating the manhattan distances to the root
ancestral.dist(dummy_matrix, tree = rand_tree,
               to.root = TRUE, method = "manhattan")

## angles
## The angles in degrees of each axis
angles(dummy_matrix)
## The angles in slope from the 1:1 slope (Beta = 1)
angles(dummy_matrix, unit = "slope", base = 1)

## centroids
## Distances between each row and centroid of the matrix
centroids(dummy_matrix)
## Distances between each row and an arbitrary point
centroids(dummy_matrix, centroid = c(1,2,3,4,5,6,7,8,9,10))
## Distances between each row and the origin
centroids(dummy_matrix, centroid = 0)

## convhull.surface
## Making a matrix with more elements than dimensions (for convhull)
thinner_matrix <- matrix(rnorm(90), 18, 5)
## Convex hull hypersurface of a matrix
convhull.surface(thinner_matrix)

## convhull.volume
## Convex hull volume of a matrix
convhull.volume(thinner_matrix)

## deviations
## The deviations from the least square hyperplane
deviations(dummy_matrix)
## The deviations from the plane between the x and y axis
deviations(dummy_matrix, hyperplane = c(0,1,1,0,0,0,0,0,0,0,0))

## diagonal
## Matrix diagonal
diagonal(dummy_matrix) # WARNING: only valid if the dimensions are orthogonal

## disalignment
## Two dummy matrices
matrix_1 <- matrix(rnorm(16), 4, 4)
matrix_2 <- matrix(rnorm(16), 4, 4)
## Measuring the disalignment of matrix_1 from matrix_2
disalignment(matrix_1, matrix_2)
## Same but using the 2nd major axis of the 0.75 CI ellipse
## from matrix_2 and the first point from matrix_1.
disalignment(matrix_1, matrix_2,
             axis = 2, level = 0.75,
             point.to.reject = 1)

## displacements
## displacement ratios (from the centre)
displacements(dummy_matrix)
## displacement ratios (from an arbitrary point)
displacements(dummy_matrix, reference = c(1,2,3,4,5,6,7,8,9,10))
## displacement ratios from the centre (manhattan distance)
displacements(dummy_matrix, method = "manhattan")

## edge.length.tree
## Making a dummy tree with node labels
dummy_tree <- makeNodeLabel(rtree((nrow(dummy_matrix)/2)+1))
## Naming the elements in the matrix
named_matrix <- dummy_matrix
rownames(named_matrix) <- c(dummy_tree$tip.label,
                            dummy_tree$node.label)
## The total edge length of each element in the matrix (to the root)
edge.length.tree(named_matrix, tree = dummy_tree)

## The edge lengths for each edge leading to the elements in the matrix
edge.length.tree(named_matrix, tree = dummy_tree, to.root = FALSE)

## ellipsoid.volume
## Ellipsoid volume of a matrix
ellipsoid.volume(dummy_matrix)
## Calculating the same volume with provided eigen values
ordination <- prcomp(dummy_matrix)
## Calculating the ellipsoid volume by providing your own eigen values
ellipsoid.volume(ordination$x, method = ordination$sdev^2)

## func.div
## Functional divergence
func.div(dummy_matrix)

## func.eve
## Functional evenness
func.eve(dummy_matrix) 
## Functional evenness (based on manhattan distances)
func.eve(dummy_matrix, method = "manhattan")

## group.dist
## The distance between groups
dummy_matrix2 <- matrix(runif(40, min = 2, max = 4), 4, 10)
## The minimum distance between both groups
group.dist(dummy_matrix, dummy_matrix2)
## The distance between both groups' centroids
group.dist(dummy_matrix, dummy_matrix2, probs = 0.5)
## The minimum distance between the 50% CI of each group
group.dist(dummy_matrix, dummy_matrix2, probs = c(0.25, 0.75))

## mode.val
## Modal value of a vector
mode.val(dummy_matrix)

## neighbours
## The nearest neighbour euclidean distances
neighbours(dummy_matrix)
## The furthest neighbour manhattan distances
neighbours(dummy_matrix, which = max, method = "manhattan")

## pairwise.dist
## The pairwise distance
pairwise.dist(dummy_matrix)
## The average squared pairwise distance
mean(pairwise.dist(dummy_matrix)^2)
## equal to:
# geiger::disparity(data = dummy_matrix)

## point.dist
## The distances from the rows dummy_matrix
## to the centroids of dummy_matrix2
point.dist(dummy_matrix, dummy_matrix2)
## The average distances from dummy_matrix
## to the centroids of dummy_matrix2
mean(point.dist(dummy_matrix, dummy_matrix2))
## The manhattan distance from the rows dummy_matrix
## to the standard deviation of dummy_matrix2
point.dist(dummy_matrix, dummy_matrix2, point = sd, method = "manhattan")

## projections
## The distances on the vector defined from the centre of
## the matrix to its centroid (default)
projections(dummy_matrix)
## The distances from the vector defined from the third
## element of the matrix to the point of coordinated
## c(1,1,1, ...) the matrix to its centroid (default)
projections(dummy_matrix, measure = "distance",
            point1 = dummy_matrix[3, ],
            point2 = 1)

## projections.tree
## Making a dummy tree with node labels
dummy_tree <- makeNodeLabel(rtree((nrow(dummy_matrix)/2)+1))
## Naming the elements in the matrix
named_matrix <- dummy_matrix
rownames(named_matrix) <- c(dummy_tree$tip.label,
                            dummy_tree$node.label)
## The projection on the vector defined from the root of
## the tree to the ancestor of each element in the matrix
projections.tree(named_matrix, dummy_tree,
                  type = c("root", "ancestor"))
## The rejection from the vector defined from the centroid
## of the nodes to the centroids of the tips
projections.tree(named_matrix, dummy_tree,
                  type = c("nodes", "tips"),
                  measure = "distance")
## A user function that define coordinates based on the 
## centroid of the three first nodes
user.fun <- function(matrix, tree, row = NULL) {
     return(colMeans(matrix[tree$node.label[1:3], ]))
}
## The projection on the vector defined by the coordinates
## 0,0,0 and a user defined function
projections.tree(named_matrix, dummy_tree,
                  type = c(0, user.fun))

## projections.between
## Two dummy matrices
matrix_1 <- matrix(rnorm(16), 4, 4)
matrix_2 <- matrix(rnorm(16), 4, 4)
## Projecting the major axis of matrix_2 onto the one from matrix_1
projections.between(matrix_1, matrix_2)
## Projecting both second major 0.75 axes
## and getting the rejections (see projections() for option details)
projections.between(matrix_1, matrix_2,
                    measure = "distance",
                    axis = 2, level = 0.75)

## quantiles
## The 95 quantiles
quantiles(dummy_matrix)
## The 100 quantiles (which are equal to the ranges)
quantiles(dummy_matrix, quantile = 100) == ranges(dummy_matrix) # All TRUE

## radius
## The maximal radius of each axis (maximum distance from centre of each axis)
radius(dummy_matrix)

## ranges
## ranges of each column in a matrix
ranges(dummy_matrix)
## ranges of each column in the matrix corrected using the kth root
ranges(dummy_matrix, k.root = TRUE)

## roundness
## calculating the variance-covariance of the dummy_matrix
vcv <- var(dummy_matrix)
## calculating the roundness of it
roundness(vcv)
## calculating the roundness of the dummy matrix by calculating the vcv
roundness(dummy_matrix, vcv = TRUE)

## span.tree.length
## Minimum spanning tree length (default)
span.tree.length(dummy_matrix)
## Minimum spanning tree length from a distance matrix (faster)
distance <- as.matrix(dist(dummy_matrix))
span.tree.length(distance)
## Minimum spanning tree length based on Manhattan distance
span.tree.length(dummy_matrix, method = "manhattan")
span.tree.length(as.matrix(dist(dummy_matrix, method = "manhattan"))) # Same

## variances
## variances of a each column in the matrix
variances(dummy_matrix)
## variances of a each column in the matrix corrected using the kth root
variances(dummy_matrix, k.root = TRUE)



TGuillerme/dispRity documentation built on April 17, 2024, 10 p.m.