
Returns the E-distances (energy statistics) between clusters.


| Argument | Description |
|---|---|
| `x` | data matrix of pooled sample or Euclidean distances |
| `sizes` | vector of sample sizes |
| `distance` | logical: if TRUE, `x` is a distance matrix |
| `ix` | a permutation of the row indices of `x` |
| `alpha` | distance exponent in (0,2] |
| `method` | how to weight the statistics |

A vector containing the pairwise two-sample multivariate *E*-statistics for comparing clusters or samples is returned. The e-distance between clusters is computed from the original pooled data, stacked in matrix `x` where each row is a multivariate observation, or from the distance matrix `x` of the original data, or from a distance object returned by `dist`. The first `sizes[1]` rows of the original data matrix are the first sample, the next `sizes[2]` rows are the second sample, etc. The permutation vector `ix` may be used to obtain e-distances corresponding to a clustering solution at a given level in the hierarchy.
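As a rough illustration of how `sizes` and `ix` might be prepared for a clustering solution, the sketch below cuts a base R `hclust` tree into three clusters. It assumes that consecutive blocks of `ix` (of lengths `sizes[1]`, `sizes[2]`, ...) identify the rows of each cluster; the choice of average linkage and `k = 3` is arbitrary and not taken from this page.

```r
# Sketch: derive `sizes` and `ix` from a 3-cluster cut of a hierarchical clustering.
x  <- as.matrix(iris[, 1:4])
hc <- hclust(dist(x), method = "average")  # any hierarchical clustering would do
cl <- cutree(hc, k = 3)                    # cluster label for each row of x

sizes <- as.vector(table(cl))              # cluster sizes, ordered by label
ix    <- order(cl)                         # row indices grouped by cluster label

# Assumption: the first sizes[1] entries of ix index cluster 1, and so on.
blocks <- split(ix, rep(seq_along(sizes), times = sizes))

# Only call edist() if the energy package is installed.
if (requireNamespace("energy", quietly = TRUE)) {
  print(energy::edist(x, sizes, ix = ix))
}
```

Here `blocks` is only a diagnostic for checking which rows fall in which cluster; it is not an argument of `edist`.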

The default method `cluster` summarizes the e-distances between clusters in a table. The e-distance between two clusters $C_i, C_j$ of sizes $n_i, n_j$, proposed by Szekely and Rizzo (2005), is defined by

$$e(C_i, C_j) = \frac{n_i n_j}{n_i + n_j}\left[2M_{ij} - M_{ii} - M_{jj}\right],$$

where

$$M_{ij} = \frac{1}{n_i n_j} \sum_{p=1}^{n_i} \sum_{q=1}^{n_j} \|X_{ip} - X_{jq}\|^{a},$$

$\|\cdot\|$ denotes the Euclidean norm, $a =$ `alpha`, and $X_{ip}$ denotes the $p$-th observation in the $i$-th cluster. The exponent `alpha` should be in the interval (0,2].
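The definition can be checked by hand with a few lines of base R. The helper `edist_pair` below is hypothetical (it is not part of the energy package) and is only a minimal sketch of the formula for a single pair of samples, not the package's implementation.

```r
# Hypothetical helper: e-distance between two samples, computed directly
# from the definition e = n_i n_j / (n_i + n_j) * (2 M_ij - M_ii - M_jj).
edist_pair <- function(xi, xj, alpha = 1) {
  xi <- as.matrix(xi)
  xj <- as.matrix(xj)
  ni <- nrow(xi)
  nj <- nrow(xj)
  # Euclidean distances among all pooled rows, raised to the power alpha
  d <- as.matrix(dist(rbind(xi, xj)))^alpha
  i <- seq_len(ni)
  j <- ni + seq_len(nj)
  m_ij <- mean(d[i, j])  # average between-sample distance
  m_ii <- mean(d[i, i])  # average within-sample distance (zero diagonal included)
  m_jj <- mean(d[j, j])
  (ni * nj) / (ni + nj) * (2 * m_ij - m_ii - m_jj)
}

# Usage: first two iris species, 50 rows each
setosa     <- iris[1:50, 1:4]
versicolor <- iris[51:100, 1:4]
e12 <- edist_pair(setosa, versicolor)
print(e12)
```

The within-sample means divide by $n_i^2$, so the zero diagonal of the distance matrix is deliberately included in `m_ii` and `m_jj`.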

The coefficient $n_i n_j/(n_i+n_j)$ is one-half of the harmonic mean of the sample sizes. The `discoB` method is related, but summarizes the pairwise differences between samples differently. The `disco` methods apply the coefficient $n_i n_j/(2N)$, where $N$ is the total number of observations. This weights each $(i,j)$ statistic by sample size relative to $N$. See the `disco` topic for more details.
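To make the difference between the two weightings concrete, the following sketch tabulates both coefficients for every pair of samples. The sample sizes are made up for illustration, and the column names `cluster` and `disco` are just labels for the two formulas quoted above.

```r
# Compare the two pairwise coefficients for illustrative sample sizes.
n <- c(10, 20, 30)             # hypothetical sample sizes (not from this page)
N <- sum(n)                    # total number of observations

pairs <- combn(length(n), 2)   # columns are the (i, j) pairs with i < j
ni <- n[pairs[1, ]]
nj <- n[pairs[2, ]]

coefs <- data.frame(
  i = pairs[1, ],
  j = pairs[2, ],
  cluster = ni * nj / (ni + nj),  # half the harmonic mean of n_i and n_j
  disco   = ni * nj / (2 * N)     # sample sizes relative to N
)
print(coefs)
```

Because $n_i + n_j < 2N$ whenever there are three or more non-empty samples, the `disco` coefficient is then smaller than the `cluster` coefficient for every pair.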

An object of class `dist` containing the lower triangle of the e-distance matrix of cluster distances corresponding to the permutation of indices `ix` is returned. The `method` attribute of the distance object is assigned a value of type, index.

Maria L. Rizzo (mrizzo@bgsu.edu) and Gabor J. Szekely

Szekely, G. J. and Rizzo, M. L. (2005) Hierarchical Clustering
via Joint Between-Within Distances: Extending Ward's Minimum
Variance Method, *Journal of Classification* 22(2) 151-183.

doi: 10.1007/s00357-005-0012-9

Rizzo, M. L. and Szekely, G. J. (2010) DISCO Analysis: A Nonparametric
Extension of Analysis of Variance, *Annals of Applied Statistics*
4(2) 1034-1055.

doi: 10.1214/09-AOAS245

Szekely, G. J. and Rizzo, M. L. (2004) Testing for Equal Distributions in High Dimension, InterStat, November (5).

Szekely, G. J. (2000) *E*-statistics: Energy of Statistical Samples,
Technical Report 03-05, Department of Mathematics and Statistics,
Bowling Green State University.

`energy.hclust`

`eqdist.etest`

`ksample.e`

`disco`

```r
library(energy)
data(iris)

## compute cluster e-distances for 3 samples of iris data
edist(iris[,1:4], c(50,50,50))

## pairwise disco statistics
edist(iris[,1:4], c(50,50,50), method="discoB")

## compute e-distances from a distance object
edist(dist(iris[,1:4]), c(50, 50, 50), distance=TRUE, alpha = 1)

## compute e-distances from a distance matrix
d <- as.matrix(dist(iris[,1:4]))
edist(d, c(50, 50, 50), distance=TRUE, alpha = 1)
```
