
Clusty

Overview

The clusty package evaluates distance-based clustering with non-overlapping cluster membership in the R programming language. It was designed for assessing k-means clustering with distance-matrix heat maps, since the k-means objective function is itself based on distance. An effective clustering is therefore one in which instances within a cluster are highly similar (small distance) and instances in different clusters are well differentiated (large distance).

In the heat map of cluster distances produced by the summaryheat function, a highly ranked diagonal corresponds to strong intra-cluster homogeneity, while highly ranked squares in the upper or lower triangle correspond to strong inter-cluster heterogeneity. The distance metrics used in this heat map can be extracted using bigextract. The bigheat function uses condensed instance vectors to visualize the clustering (i.e. intra-cluster homogeneity and inter-cluster heterogeneity) at the instance level. This provides a granular look at how well differentiated instances are within and between clusters, and it permits row reduction of large datasets into condensed instance vectors.
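The idea of reading cluster quality off a distance heat map can be illustrated with a minimal base-R sketch. It does not call clusty and is not the package's implementation: the synthetic data, the variable names (x, fit, d, cluster_dist) and the use of mean Euclidean distance as the summary are all assumptions made for illustration.

```r
# Base-R sketch (not clusty's own code): summarise a k-means result as a
# cluster-by-cluster matrix of mean pairwise distances and plot it.
set.seed(42)

# Synthetic data: three well-separated 2-D groups of 20 points each.
centers <- rbind(c(0, 0), c(5, 5), c(0, 5))
x <- do.call(rbind, lapply(1:3, function(i) {
  cbind(rnorm(20, centers[i, 1], 0.5), rnorm(20, centers[i, 2], 0.5))
}))

k <- 3
fit <- kmeans(x, centers = k, nstart = 10)
d <- as.matrix(dist(x))  # Euclidean instance-by-instance distances

# Entry [a, b] is the mean distance between members of cluster a and
# members of cluster b; the diagonal holds within-cluster means.
cluster_dist <- matrix(NA_real_, k, k)
for (a in 1:k) {
  for (b in 1:k) {
    block <- d[fit$cluster == a, fit$cluster == b, drop = FALSE]
    cluster_dist[a, b] <- if (a == b) {
      mean(block[upper.tri(block)])  # skip the zero self-distances
    } else {
      mean(block)
    }
  }
}

# Instance-level view: distance matrix with rows and columns grouped by cluster.
ord <- order(fit$cluster)
image(d[ord, ord], main = "Instance distances, ordered by cluster")

# Cluster-level view: no reordering, no dendrograms, no rescaling.
heatmap(cluster_dist, Rowv = NA, Colv = NA, scale = "none", symm = TRUE)
```

On data like this, the diagonal of cluster_dist (within-cluster distance) should be much smaller than the off-diagonal entries (between-cluster distance), which is the pattern the overview describes as a good clustering.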

Functions and their uses

bigheat_samples

summaryheat_sample

summaryheatp

Use Package in R

install.packages("devtools")
devtools::install_github("lukadw11/clusty")

License

This package is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License, version 3, as published by the Free Software Foundation. This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose. See the GNU General Public License for more details. A copy of the GNU General Public License, version 3, is available at http://www.r-project.org/Licenses/GPL-3



lukadw11/Clusty documentation built on May 21, 2019, 8:57 a.m.