# cluster.evaluation: Clustering Evaluation Index Based on Known Ground Truth

In TSclust: Time Series Clustering Utilities

## Description

Computes the similarity between the true cluster solution and the one obtained with a method under evaluation.

## Usage

    cluster.evaluation(G, S)

## Arguments

| Argument | Description |
| --- | --- |
| `G` | Integer vector with the labels of the true cluster solution. Each element of the vector specifies the cluster 'id' that the element belongs to. |
| `S` | Integer vector with the labels of the cluster solution to be evaluated. Each element of the vector specifies the cluster 'id' that the element belongs to. |

## Details

The measure of clustering evaluation is defined as

$$\mathrm{Sim}(G, C) = \frac{1}{k} \sum_{i=1}^{k} \max_{1 \le j \le k} \mathrm{Sim}(G_i, C_j),$$

where

$$\mathrm{Sim}(G_i, C_j) = \frac{2\,|G_i \cap C_j|}{|G_i| + |C_j|}$$

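with $|\cdot|$ denoting the cardinality of a set. Here $G_i$ are the clusters of the true solution (argument `G`), $C_j$ are the clusters of the solution under evaluation (argument `S`), and $k$ is the number of clusters. This measure has been used for comparing different clusterings, e.g. in Kalpakis et al. (2001) and Pértega and Vilar (2010).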
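The definition above can be transcribed almost literally into R. The following is a minimal sketch, not the package's own source: `sim_index` is a hypothetical helper name, and it assumes both label vectors have the same length and takes the clusters to be the distinct labels present in each vector.

```r
# Hypothetical helper (not part of TSclust): direct transcription of the formula.
sim_index <- function(G, S) {
  stopifnot(length(G) == length(S))
  g_ids <- unique(G)  # labels of the reference clusters G_i
  s_ids <- unique(S)  # labels of the evaluated clusters C_j
  # For each reference cluster G_i, keep the similarity to its best-matching C_j.
  best <- vapply(g_ids, function(gi) {
    in_g <- G == gi
    max(vapply(s_ids, function(sj) {
      in_s <- S == sj
      # 2 |G_i intersect C_j| / (|G_i| + |C_j|)
      2 * sum(in_g & in_s) / (sum(in_g) + sum(in_s))
    }, numeric(1)))
  }, numeric(1))
  # Average the best matches over the reference clusters.
  mean(best)
}

# Identical partitions whose labels are merely permuted score 1.
sim_index(c(1, 1, 2, 2), c(2, 2, 1, 1))
```

Because only set overlaps enter the formula, the index does not depend on which integer is used to name each cluster.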

## Value

The computed index.

## Note

This index is not symmetric: in general, `cluster.evaluation(G, S)` differs from `cluster.evaluation(S, G)`.

## Author(s)

Pablo Montero Manso, José Antonio Vilar.

## References

Larsen, B. and Aone, C. (1999) Fast and effective text mining using linear-time document clustering. Proc. KDD'99, 16–22.

Kalpakis, K., Gada, D. and Puttagunta, V. (2001) Distance measures for effective clustering of ARIMA time-series. Proceedings 2001 IEEE International Conference on Data Mining, 273–280.

Pértega, S. and Vilar, J.A. (2010) Comparing several parametric and nonparametric approaches to time series clustering: A simulation study. J. Classification, 27(3), 333–362.

Montero, P. and Vilar, J.A. (2014) TSclust: An R Package for Time Series Clustering. Journal of Statistical Software, 62(1), 1–43. http://www.jstatsoft.org/v62/i01/.

## See Also

cluster.stats, clValid, std.ext

## Examples

    library(TSclust)

    # create a true cluster
    # (first 4 elements belong to cluster '1', next 4 to cluster '2' and the last 4 to cluster '3')
    true_cluster <- c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3)
    # the cluster to be tested
    new_cluster <- c(2, 1, 2, 3, 3, 2, 2, 1, 3, 3, 3, 3)
    # get the index
    cluster.evaluation(true_cluster, new_cluster)
    # it can be seen that the index is not symmetric
    cluster.evaluation(new_cluster, true_cluster)
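
The two label vectors in the example can also be worked through by hand from the formula in Details, which makes the asymmetry concrete. The values below are derived from the definition alone, on the assumption that the function implements it exactly as stated; they are not output captured from the package. Here $G_i$ are the clusters of `true_cluster` and $C_j$ those of `new_cluster`, so $C_1$ holds positions 2 and 8, $C_2$ holds positions 1, 3, 6 and 7, and $C_3$ holds positions 4, 5 and 9 to 12.

```latex
\begin{aligned}
|G_1| = |G_2| = |G_3| &= 4, \qquad |C_1| = 2,\; |C_2| = 4,\; |C_3| = 6 \\[4pt]
\max_j \mathrm{Sim}(G_1, C_j) &= \mathrm{Sim}(G_1, C_2) = \frac{2 \cdot 2}{4 + 4} = 0.5 \\
\max_j \mathrm{Sim}(G_2, C_j) &= \mathrm{Sim}(G_2, C_2) = \frac{2 \cdot 2}{4 + 4} = 0.5 \\
\max_j \mathrm{Sim}(G_3, C_j) &= \mathrm{Sim}(G_3, C_3) = \frac{2 \cdot 4}{4 + 6} = 0.8 \\
\mathrm{Sim}(G, C) &= \frac{0.5 + 0.5 + 0.8}{3} = 0.6 \\[4pt]
\max_i \mathrm{Sim}(C_1, G_i) &= \frac{2 \cdot 1}{2 + 4} = \tfrac{1}{3}, \quad
\max_i \mathrm{Sim}(C_2, G_i) = 0.5, \quad
\max_i \mathrm{Sim}(C_3, G_i) = 0.8 \\
\mathrm{Sim}(C, G) &= \frac{1/3 + 0.5 + 0.8}{3} \approx 0.544
\end{aligned}
```

Swapping the arguments changes which clustering the outer average runs over: the small cluster $C_1$ has no good match among the true clusters, so it lowers the average only when `new_cluster` is passed first.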