is_valid_cluster: Determine whether a cluster is valid

A set of clusters is considered valid if grouping the points reduces the variance of the distances between them, compared with treating all points as a single cluster.



z: An n x 2 matrix of points


groups: A vector of group associations


A logical value is returned indicating whether the specified groups are considered valid.


is_valid_cluster(z, groups)


With hierarchical clustering techniques it is difficult to know whether the set of clusters produced is valid. Typically this is left to interpretation: the cutoff point must be eyeballed, as must the decision of whether the cluster boundaries make sense.

For an automated system, such a manual decision point is undesirable and must be replaced by an automatic process. Since this data is multidimensional, one approach is to use a distance metric or other mathematical property as a heuristic. This function accepts a set of clusters if the sum of the variances of the distances within each cluster is less than the variance of the distances when all points are treated as a single cluster.
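
The heuristic above can be illustrated with a minimal sketch. This is not the package's implementation: the names dist_variance and is_valid_cluster_sketch are hypothetical, "distance" is assumed to mean pairwise Euclidean distance as computed by dist(), and a cluster with fewer than three points is assumed to contribute zero variance, since it has fewer than two pairwise distances.

```r
# Hypothetical sketch of the variance heuristic; not the package's code.

# Variance of the pairwise Euclidean distances among a set of points.
# Assumption: fewer than 3 points yields fewer than 2 distances, so
# the variance is undefined and is treated as 0 here.
dist_variance <- function(points) {
  if (nrow(points) < 3) return(0)
  var(as.vector(dist(points)))
}

# Accept the grouping if the summed within-cluster distance variances
# are smaller than the distance variance of all points taken together.
is_valid_cluster_sketch <- function(z, groups) {
  within <- sapply(unique(groups), function(g) {
    dist_variance(z[groups == g, , drop = FALSE])
  })
  sum(within) < dist_variance(z)
}

# Two unit squares, one near the origin and one shifted by 10.
a <- cbind(c(0, 0, 1, 1), c(0, 1, 0, 1))
z <- rbind(a, a + 10)

# Grouping that matches the two squares.
good <- is_valid_cluster_sketch(z, c(1, 1, 1, 1, 2, 2, 2, 2))

# Grouping that mixes points from both squares.
bad <- is_valid_cluster_sketch(z, c(1, 2, 1, 2, 1, 2, 1, 2))
```

Under these assumptions, the grouping that follows the two squares is accepted (good is TRUE), while the grouping that mixes points from both squares is rejected (bad is FALSE), because each mixed group has roughly the same spread of distances as the full data set.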


Brian Lee Yung Rowe

