Dist: Distance matrix - Sum of all pairwise distances in a distance...

Distance matrix - Sum of all pairwise distances in a distance matrix

Description

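Dist computes the matrix of pairwise distances between the rows of a matrix. total.dist returns the sum of all pairwise distances. vecdist takes a vector instead of a matrix.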

Usage

Dist(x, method = "euclidean", square = FALSE, p = 0,
    result = "matrix", vector = FALSE, parallel = FALSE)
total.dist(x, method = "euclidean", square = FALSE, p = 0)
vecdist(x)

Arguments

x

A matrix with data. The distances are calculated between pairs of rows. In the case of vecdist this is a vector. For the haversine distance it must be a matrix with two columns: the first column is the latitude and the second the longitude, both in radians.

method

See details for the available methods.

square

If the method is "euclidean" or "hellinger", set this argument to TRUE to return the squared distances.

p

The power of the metric. Used only by the Minkowski method.

vector

Set to TRUE to return a vector instead of a matrix.

result

One of the following:

  • "matrix" : Return the result as a matrix.

  • "vector" : Return the result as a vector.

  • "sum" : Return the sum of the result.

parallel

Set to TRUE to run the algorithm in parallel. This is available for the euclidean, canberra and minkowski methods.
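As a rough illustration of how the three result forms and the square option relate to one another, the sketch below uses only base R (stats::dist), not Rfast. It assumes that the vector form holds one entry per pair of rows. The ordering of that vector in Dist is not stated here, so none is implied.

```r
set.seed(1)
x <- matrix(rnorm(5 * 3), ncol = 3)   # 5 rows, so choose(5, 2) = 10 pairs

d <- dist(x)                  # base R pairwise Euclidean distances
as_matrix <- as.matrix(d)     # square, symmetric, zero diagonal
as_vector <- as.vector(d)     # one entry per pair of rows
as_sum    <- sum(as_vector)   # sum of all pairwise distances
squared   <- as_matrix^2      # squared Euclidean distances
```

Each pair appears twice in the square matrix, so the sum of all its entries is twice the sum of the pairwise distances.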

Details

The distance matrix is computed, with an extra argument (square) for the Euclidean distances. The "kullback_leibler" refers to the symmetric Kullback-Leibler divergence.

  • euclidean : \sqrt( \sum | P_i - Q_i |^2)

  • manhattan : \sum | P_i - Q_i |

  • minimum : \min | P_i - Q_i |

  • maximum : \max | P_i - Q_i |

  • minkowski : ( \sum | P_i - Q_i |^p)^{\frac{1}{p}}

  • bhattacharyya : - ln (\sum \sqrt(P_i * Q_i))

  • hellinger : 2 * \sqrt( 1 - \sum \sqrt(P_i * Q_i))

  • kullback_leibler : \sum P_i * log(\frac{P_i}{Q_i})

  • jensen_shannon : 0.5 * ( \sum P_i * log(2 * \frac{P_i}{P_i + Q_i}) + \sum Q_i * log(2 * \frac{Q_i}{P_i + Q_i}))

  • canberra : \sum \frac{| P_i - Q_i |}{P_i + Q_i}

  • chi_square X^2 : \sum (\frac{(P_i - Q_i )^2}{P_i + Q_i})

  • soergel : \frac{\sum | P_i - Q_i |}{\sum \max(P_i , Q_i)}

  • sorensen : \frac{\sum | P_i - Q_i |}{\sum (P_i + Q_i)}

  • cosine : \frac{\sum (P_i * Q_i)}{\sqrt(\sum P_i^2) * \sqrt(\sum Q_i^2)}

  • wave_hedges : \sum \frac{| P_i - Q_i |}{\max(P_i , Q_i)}

  • motyka : \sum \frac{\min(P_i, Q_i)}{(P_i + Q_i)}

  • harmonic_mean : 2 * \sum \frac{P_i * Q_i}{P_i + Q_i}

  • jeffries_matusita : \sqrt( 2 - 2 * \sum \sqrt(P_i * Q_i))

  • gower : \frac{1}{d} * \sum | P_i - Q_i |

  • kulczynski : \frac{\sum | P_i - Q_i |}{\sum \min(P_i , Q_i)}

  • itakura_saito : \sum (\frac{P_i}{Q_i} - log(\frac{P_i}{Q_i}) - 1)

  • haversine : 2 * R * \arcsin(\sqrt(\sin((lat_2 - lat_1)/2)^2 + \cos(lat_1) * \cos(lat_2) * \sin((lon_2 - lon_1)/2)^2))
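To make the notation concrete, here is a minimal base R sketch of four of the formulas above, where P and Q stand for two rows of x. The helper names (manhattan, minkowski, canberra, haversine) are hypothetical and are not part of Rfast. The sphere radius R is not defined in this page, so the default R = 1 below is an assumption.

```r
# Hypothetical helpers that transcribe the listed formulas for two vectors.
manhattan <- function(P, Q) sum(abs(P - Q))
minkowski <- function(P, Q, p) sum(abs(P - Q)^p)^(1 / p)
canberra  <- function(P, Q) sum(abs(P - Q) / (P + Q))   # positive data assumed

# Haversine distance; all angles in radians, R is the sphere radius (assumed 1).
haversine <- function(lat1, lon1, lat2, lon2, R = 1) {
  h <- sin((lat2 - lat1) / 2)^2 +
    cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2)^2
  2 * R * asin(sqrt(h))
}

P <- c(1, 2, 3)
Q <- c(2, 4, 1)
pq <- rbind(P, Q)

# Cross-check against stats::dist, which offers the same three metrics.
check_manhattan <- all.equal(manhattan(P, Q),
                             as.numeric(dist(pq, method = "manhattan")))
check_minkowski <- all.equal(minkowski(P, Q, p = 3),
                             as.numeric(dist(pq, method = "minkowski", p = 3)))
check_canberra  <- all.equal(canberra(P, Q),
                             as.numeric(dist(pq, method = "canberra")))

# A quarter turn along the equator of a unit sphere.
quarter_turn <- haversine(0, 0, 0, pi / 2)
```

For positive data, these helpers should agree with stats::dist. The haversine call covers a quarter of a great circle, so on a unit sphere it gives pi / 2.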

Value

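For Dist, a square matrix with the pairwise distances, or a vector or their sum, depending on the result argument. For total.dist, the sum of all pairwise distances.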

Author(s)

Manos Papadakis.

R implementation and documentation: Manos Papadakis <papadakm95@gmail.com>.

References

Mardia K. V., Kent J. T. and Bibby J. M. (1979). Multivariate Analysis. Academic Press.

See Also

dista, colMedians

Examples

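x <- matrix(rnorm(50 * 10), ncol = 10)
a1 <- Dist(x)                # Euclidean distance matrix (default method)
a2 <- as.matrix( dist(x) )   # the same matrix from stats::dist, for comparison

x <- a1 <- a2 <- NULL        # clean up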
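A slightly fuller check might look like the sketch below. It computes reference values with base R and compares them with Dist and total.dist only when Rfast is installed. The Rfast calls follow the signatures in the Usage section. The seed and the check.attributes = FALSE setting are choices made for this illustration, since as.matrix(dist(x)) carries dimnames that the other matrix may not have.

```r
set.seed(42)
x <- matrix(rnorm(50 * 10), ncol = 10)

# Base R reference values.
ref <- as.matrix(dist(x))    # 50 x 50 Euclidean distance matrix
ref_total <- sum(dist(x))    # sum over the 50 * 49 / 2 distinct pairs

# Run the comparison only when Rfast is available.
if (requireNamespace("Rfast", quietly = TRUE)) {
  a1 <- Rfast::Dist(x)
  # Compare values only and ignore dimnames.
  print(all.equal(a1, ref, check.attributes = FALSE))
  print(all.equal(Rfast::total.dist(x), ref_total))
}
```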

Rfast documentation built on April 3, 2025, 11:34 p.m.