compound: Synthetic dataset of two-dimensional points.


Description

This synthetic dataset contains groups of points with different densities and varied shapes, with necks between partitions.

Usage

1

Format

A data frame containing 399 observations in two dimensions, forming six partitions:

  1. x1: synthetically generated real positive values

  2. x2: synthetically generated real positive values

The original dataset had three dimensions. We intentionally removed the third dimension, which holds the label of the partition each data point belongs to. A full description of the dataset can be found in the article "Graph-theoretical methods for detecting and describing gestalt clusters", listed in the references.
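As a quick check of the format described above, the following is a minimal sketch in R. The helper summarise_points is hypothetical and not part of gama. Loading the data with data("compound", package = "gama") is an assumption about how the package exposes the dataset, so that step runs only if gama is installed.

```r
# Hypothetical helper (not part of gama): summarise a data frame of
# two-dimensional points stored in columns named x1 and x2.
summarise_points <- function(points) {
  stopifnot(is.data.frame(points), all(c("x1", "x2") %in% names(points)))
  list(
    n_obs    = nrow(points),      # number of observations
    n_dim    = ncol(points),      # number of columns (dimensions)
    x1_range = range(points$x1),  # minimum and maximum of x1
    x2_range = range(points$x2)   # minimum and maximum of x2
  )
}

# Assumption: the dataset can be loaded with data() from the gama package.
# This part is skipped when gama is not installed.
if (requireNamespace("gama", quietly = TRUE)) {
  data("compound", package = "gama")
  str(summarise_points(compound))

  # Scatter plot of the unlabeled points.
  plot(compound$x1, compound$x2, xlab = "x1", ylab = "x2")
}
```

If the data frame matches the Format section, the summary should report 399 observations and two dimensions, and both ranges should be positive.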

Source

The dataset was obtained from the Clustering basic benchmark site.

References

C.T. Zahn, Graph-theoretical methods for detecting and describing gestalt clusters. IEEE Transactions on Computers, vol. C-20, no. 1, pp. 68-86, 1971.

P. Fränti and S. Sieranoja, K-means properties on six clustering benchmark datasets. Applied Intelligence, vol. 48, no. 12, pp. 4743-4759, 2018.


jairsonrodrigues/gama documentation built on May 17, 2019, 3:12 a.m.