Computes the Hopkins statistic for a dataset at a given sample size. The statistic indicates the cluster tendency of the data.
Hopkins(dataset, sample_size=0.1)
dataset | The dataset for which the Hopkins statistic is computed.
sample_size | The sample size, given as a proportion of the total number of observations in dataset. A larger sample gives a more accurate Hopkins statistic, but computational cost rises sharply as the sample size grows.
The Hopkins statistic is useful as a test for cluster tendency in data. Points are drawn uniformly at random from the data space, and the distance from each of them to its nearest original data point is calculated. The sum of these distances is compared with the sum of the distances between original data points. The function returns an index between 0 and 1: values near 1 characterize data partitioned into clusters, values near 0.5 characterize uniformly random data, and values below 0.5, toward 0, characterize regularly (evenly) spaced data.
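The comparison described above can be sketched in a few lines of base R. This is only an illustrative sketch under stated assumptions, not the package's Hopkins() implementation: the name hopkins_sketch is hypothetical, distances are Euclidean, the "data space" is taken to be the bounding box of the data, and the index is assumed to be sum(u) / (sum(u) + sum(w)), where u holds distances from uniform points to the data and w holds nearest-neighbour distances between original points.

```r
# Hypothetical sketch; NOT the package's Hopkins() implementation.
hopkins_sketch <- function(X, sample_size = 0.1) {
  X <- as.matrix(X)
  n <- nrow(X)
  m <- max(1, floor(n * sample_size))  # number of sampled points

  # Euclidean distance from point p to its nearest row of X,
  # optionally skipping one row (so a point is not matched to itself).
  nearest <- function(p, skip = NULL) {
    d <- sqrt(rowSums(sweep(X, 2, p)^2))
    if (!is.null(skip)) d <- d[-skip]
    min(d)
  }

  # u: m points drawn uniformly from the bounding box of X (assumption).
  lo <- apply(X, 2, min)
  hi <- apply(X, 2, max)
  U <- sapply(seq_len(ncol(X)), function(j) runif(m, lo[j], hi[j]))
  U <- matrix(U, nrow = m)
  u <- apply(U, 1, nearest)

  # w: m sampled original points, each measured to its nearest other point.
  idx <- sample.int(n, m)
  w <- sapply(idx, function(i) nearest(X[i, ], skip = i))

  sum(u) / (sum(u) + sum(w))
}

# Usage on simulated data: two tight clusters versus uniform noise.
set.seed(1)
clustered <- rbind(matrix(rnorm(200, 0, 0.05), ncol = 2),
                   matrix(rnorm(200, 5, 0.05), ncol = 2))
uniform <- matrix(runif(400), ncol = 2)
hopkins_sketch(clustered)  # expected to be close to 1
hopkins_sketch(uniform)    # expected to be in the vicinity of 0.5
```

Because both the uniform points and the sampled data points are random, repeated calls return slightly different values; a larger sample_size reduces that variation at the cost of more distance computations.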
Returns the Hopkins statistic, a single value between 0 and 1.
Author: Jacob H. Madsen
Hopkins, B. (1954). A new method for determining the type of distribution of plant individuals. Annals of Botany, 18, 213–227.