mahalanobis.dist: Computes the Mahalanobis Distance

Description

This function computes the Mahalanobis distance among units in a dataset or between observations in two distinct datasets.

Usage

mahalanobis.dist(data.x, data.y=NULL, vc=NULL)

Arguments

data.x

A matrix or a data frame containing variables that should be used in the computation of the distance between units. Only continuous variables are allowed. Missing values (NA) are not allowed.

When only data.x is supplied, the distances between the rows of data.x are computed.

data.y

A numeric matrix or data frame with the same variables, of the same type, as those in data.x (only continuous variables are allowed). Dissimilarities between rows of data.x and rows of data.y will be computed. If data.y is not provided (the default, data.y = NULL), data.y is taken to be data.x and only the dissimilarities between rows of data.x are computed.

vc

Covariance matrix to be used in the distance computation. If it is not supplied (vc = NULL), it is estimated from the input data. When vc = NULL and only data.x is supplied, the covariance matrix is estimated from data.x (i.e. vc = var(data.x)). When vc = NULL and both data.x and data.y are supplied, the covariance matrix is estimated from the two data sets stacked together (i.e. vc = var(rbind(data.x, data.y))).
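
The two default estimates described for vc can be reproduced with base R. The following is only a sketch of what the text above states, using the iris rows from the Examples section; the object names x, y, vc.x and vc.xy are chosen here for illustration and are not part of the package.

```r
# Illustration only: the covariance estimates described for vc = NULL
x <- as.matrix(iris[1:6, 1:4])     # plays the role of data.x
y <- as.matrix(iris[51:60, 1:4])   # plays the role of data.y

# Only data.x supplied: covariance estimated from data.x alone
vc.x <- var(x)

# Both data.x and data.y supplied: covariance estimated from the stacked rows
vc.xy <- var(rbind(x, y))

dim(vc.xy)   # one row and one column per variable
```

Because the two estimates generally differ, distances computed with and without data.y are not directly comparable unless the same vc is passed explicitly.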

Details

The Mahalanobis distance is calculated by means of:

d(i,j)=\sqrt{(x_i - x_j)^T S^{-1} (x_i - x_j)}

The covariance matrix S is estimated from the available data when vc=NULL, otherwise the one supplied via the argument vc is used.
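
As a worked instance of the formula, the distance between two units can be written out term by term in base R. This is a minimal sketch, not the package's implementation; the names S, S.inv, dif, d12 and d12.check are introduced here for illustration. Note that stats::mahalanobis() returns the squared distance, so a square root is needed to match d(i,j).

```r
# Illustration only: d(i,j) for units i = 1 and j = 2 of iris
x <- as.matrix(iris[1:6, 1:4])
S <- var(as.matrix(iris[, 1:4]))   # covariance matrix in the role of vc
S.inv <- solve(S)                  # S^{-1}

# (x_i - x_j)^T S^{-1} (x_i - x_j), then the square root
dif <- x[1, ] - x[2, ]
d12 <- sqrt(drop(t(dif) %*% S.inv %*% dif))

# Same quantity via stats::mahalanobis(), which gives the squared distance
d12.check <- sqrt(mahalanobis(x[1, ], center = x[2, ], cov = S))

all.equal(d12, as.numeric(d12.check))
```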

Value

A matrix of distances between the rows of data.x and the rows of data.y. When data.y is not supplied, the matrix contains the distances among the rows of data.x.
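
To make the shape of such a result concrete, the sketch below builds a full distance matrix with base R only. The helper md_sketch is hypothetical and is not the StatMatch source; it assumes that rows index the units of data.x and columns index the units of data.y.

```r
# Hypothetical helper, for illustration only; not part of StatMatch.
# Assumes rows correspond to units of x and columns to units of y.
md_sketch <- function(x, y, vc) {
  x <- as.matrix(x)
  y <- as.matrix(y)
  # For each row of y, squared distances of every row of x from that row
  d2 <- sapply(seq_len(nrow(y)),
               function(j) mahalanobis(x, center = y[j, ], cov = vc))
  # Force the matrix shape, also when x or y has a single row
  d2 <- matrix(d2, nrow = nrow(x), ncol = nrow(y))
  dimnames(d2) <- list(rownames(x), rownames(y))
  sqrt(d2)
}

vv <- var(iris[, 1:4])
md <- md_sketch(iris[1:6, 1:4], iris[51:60, 1:4], vc = vv)
dim(md)   # 6 rows (units of x) by 10 columns (units of y)
```

Passing the same data as both x and y in this sketch yields a square, symmetric matrix with zeros on the diagonal.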

Author(s)

Marcello D'Orazio <mdo.statmatch@gmail.com>

References

Mahalanobis, P C (1936) “On the generalised distance in statistics”. Proceedings of the National Institute of Sciences of India 2, pp. 49-55.

See Also

mahalanobis

Examples


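# distances among the first 6 units; covariance estimated from these rows
md1 <- mahalanobis.dist(iris[1:6,1:4])
# distances between two sets of units; covariance estimated from the stacked rows
md2 <- mahalanobis.dist(data.x=iris[1:6,1:4], data.y=iris[51:60, 1:4])

# same computations with a covariance matrix supplied via vc
vv <- var(iris[,1:4])
md1a <- mahalanobis.dist(data.x=iris[1:6,1:4], vc=vv)
md2a <- mahalanobis.dist(data.x=iris[1:6,1:4], data.y=iris[51:60, 1:4], vc=vv)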


StatMatch documentation built on May 29, 2024, 2:15 a.m.