distMeans: Mean of Distances

distMeans R Documentation

Mean of Distances

Description

The mean of distances is defined as the distance from each point to the mean (or expected value) of the coordinates that generate the distances.

Usage

distMeans(...)

## Default S3 method:
distMeans(d, addcentre = FALSE, label = "centroid", ...)

## S3 method for class 'formula'
distMeans(formula, data, ...)

Arguments

...

Other parameters (ignored).

d

Distances as a dist object

addcentre

Add distances to the centroid as the first item in the distance matrix. If FALSE, return only the mean distances.

label

Label for the centroid when addcentre = TRUE.

formula, data

Formula where the left-hand side is the dissimilarity structure, and the right-hand side defines the mean from which the dissimilarities are calculated. The terms on the right-hand side can be given in data.

Details

The function is analogous to colMeans or rowMeans and returns values that are at the mean of distances of each row or column of a symmetric distance matrix. Alternatively, the formula interface calculates mean distances to the fitted values.

Means of distances cannot be found directly as marginal means of a distance matrix; they must be found after Gower double centring (Gower 1966). After double centring, the means are zero, and when backtransformed to original distances, these give the mean distances. When added to the original distances, the metric properties are preserved. For instance, adding centres to distances will not influence results of metric scaling, or the rank of spatial Euclidean distances. The method is based on Euclidean geometry, but also works for non-Euclidean dissimilarities. However, the means of very strongly non-Euclidean indices may be imaginary, and these are given as NaN.
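The double-centring argument above can be illustrated with a minimal base R sketch. It is not the implementation used by distMeans; the object names (D2, G, centdist, direct) are chosen here for illustration only, and the example assumes Euclidean distances so that the result can be checked against the raw coordinates.

```r
## A sketch only, assuming Euclidean distances; not the distMeans source code
set.seed(1)
xy <- matrix(runif(5 * 2), 5, 2)
D2 <- as.matrix(dist(xy))^2          # squared distances
n <- nrow(D2)
## Gower double centring: subtract row and column means, add the grand mean
G <- -0.5 * (D2 - outer(rowMeans(D2), rep(1, n)) -
             outer(rep(1, n), colMeans(D2)) + mean(D2))
## diagonal of G holds squared distances to the centroid;
## negative values (strongly non-Euclidean input) would give NaN here
centdist <- sqrt(diag(G))
## the same distances computed directly from the coordinates
direct <- sqrt(rowSums(sweep(xy, 2, colMeans(xy))^2))
all.equal(unname(centdist), direct)
## marginal means of the distance matrix are a different quantity
colMeans(as.matrix(dist(xy)))
```

The comparison with the coordinates is possible only because the distances here are Euclidean; for other dissimilarities only the double-centred matrix is available.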

Average mean distances can be regarded as a measure of beta diversity, and the formula interface allows analysis of beta diversity within factor levels or with covariates. Such analysis is preferable to conventional averaging of dissimilarities or regression analysis of dissimilarities. Analysis of mean distances is consistent with directly grouping the observed rectangular data, overall beta diversity can be decomposed into components defined by the formula, and the analysis handles the inflation of n observations to n(n-1)/2 dissimilarities.
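The decomposition mentioned above can be sketched in base R for the Euclidean case. This is an illustration of the underlying geometry under stated assumptions, not the method used by the formula interface: the data x and the grouping factor g are hypothetical, and the fitted values are taken to be plain group centroids.

```r
## A sketch only: hypothetical data and grouping, Euclidean distances
set.seed(2)
x <- matrix(rnorm(12 * 3), 12, 3)
g <- gl(3, 4)                         # hypothetical factor with 3 levels
n <- nrow(x)
## total: squared distances of points to the overall centroid
tot <- sum(sweep(x, 2, colMeans(x))^2)
## fitted values: the centroid of each point's group
fit <- apply(x, 2, function(v) ave(v, g))
## within-group and between-group components
within <- sum((x - fit)^2)
between <- sum(sweep(fit, 2, colMeans(x))^2)
all.equal(tot, within + between)
## the same total is recovered from the n(n-1)/2 dissimilarities
all.equal(tot, sum(dist(x)^2) / n)
```

The last line shows why working with distances to the centroid is consistent with working on the rectangular data: the n(n-1)/2 squared dissimilarities carry the same total as the n squared distances to the centroid.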

Value

Distances to all other points from a point that is at the centroid or fitted value (with the formula interface) of the coordinates generating the distances. The default method can also return the input dissimilarity matrix with the mean distances added as the first observation.

Author(s)

Jari Oksanen.

References

Gower, J.C. (1966) Some distance properties of latent root and vector methods used in multivariate analysis. Biometrika 53, 325-338.

Examples

## Euclidean distances to the mean of coordinates ...
xy <- matrix(runif(5*2), 5, 2)
dist(rbind(xy, "mean" = colMeans(xy)))
## ... are equal to distMeans ...
distMeans(dist(xy))
## ... but different from mean of distances
colMeans(as.matrix(dist(xy)))
## adding mean distance does not influence PCoA of non-Euclidean
## distances (or other metric properties)
data(spurn)
d <- canneddist(spurn, "bray")
m0 <- cmdscale(d, eig = TRUE)
mcent <- cmdscale(distMeans(d, addcentre=TRUE), eig = TRUE)
## same non-zero eigenvalues
zapsmall(m0$eig)
zapsmall(mcent$eig)
## distMeans are at the origin of ordination
head(mcent$points)

jarioksa/natto documentation built on March 28, 2024, 12:45 a.m.