The input argument `k`, which represents the number of clusters, is required to start any partitioning clustering algorithm. In unsupervised learning applications, an optimal value of this argument is commonly determined by using internal validity indexes. Because these indexes suggest a `k` value computed from the clustering results of several runs of a clustering algorithm, they are computationally expensive. In contrast, the package 'kpeaks' makes it possible to estimate `k` before running any clustering algorithm. It is based on a simple novel technique that uses descriptive statistics of the peak counts of the features in a dataset.

The package 'kpeaks' contains five functions and one synthetically created dataset for testing purposes. To suggest an estimate of `k`, the function `findk` internally calls the functions `genpolygon` and `findpolypeaks`, in that order. The frequency polygons can be inspected visually with the function `plotpolygon`. Using the function `rmshoulders` is recommended to flatten or remove any shoulder peaks around the main peaks of a frequency polygon.
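The idea of counting peaks in frequency polygons can be illustrated with a minimal base R sketch. It does not call any 'kpeaks' function and is not the package's implementation: the helpers `peaks_in_counts()` and `count_peaks()`, the bin count of 20, and the simulated data are all assumptions made for illustration only.

```r
# Hypothetical helper (not part of 'kpeaks'): count the local maxima
# in a vector of class frequencies.
peaks_in_counts <- function(counts) {
  # Pad with zeros so peaks in the first or last class are detected
  f <- c(0, counts, 0)
  # Keep only the direction of each change and ignore plateaus
  s <- sign(diff(f))
  s <- s[s != 0]
  # A peak is a rise that is followed by a fall
  sum(s[-length(s)] > 0 & s[-1] < 0)
}

# Hypothetical helper (not part of 'kpeaks'): peak count of one feature.
# hist() treats a single number passed to 'breaks' as a suggestion only.
count_peaks <- function(v, nbins = 20) {
  h <- hist(v, breaks = nbins, plot = FALSE)
  peaks_in_counts(h$counts)
}

# Simulated data (an assumption): two bimodal features, one unimodal
set.seed(1)
x <- data.frame(
  f1 = c(rnorm(100, mean = 0), rnorm(100, mean = 6)),
  f2 = c(rnorm(100, mean = 0), rnorm(100, mean = 8)),
  f3 = rnorm(200)
)

# Peak count per feature, then descriptive statistics as candidate k values
pc <- sapply(x, count_peaks)
print(pc)
print(c(mean = mean(pc), median = median(pc)))
```

Raw counts of this kind are sensitive to the number of classes and to small secondary peaks in noisy data, which is the sort of situation the shoulder removal described above is meant to address.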

Zeynel Cebeci, Cagatay Cebeci

Cebeci, Z. & Cebeci, C. (2018). "A novel technique for fast determination of K in partitioning cluster analysis", *Journal of Agricultural Informatics*, 9(2), 1-11.
doi: 10.17700/jai.2018.9.2.442.

Cebeci, Z. & Cebeci, C. (2018). "kpeaks: An R Package for Quick Selection of K for Cluster Analysis", In *2018 International Conference on Artificial Intelligence and Data Processing (IDAP)*, IEEE.
doi: 10.1109/IDAP.2018.8620896.

`findk`, `findpolypeaks`, `genpolygon`, `plotpolygon`, `rmshoulders`