calcCompDist: Formula for the Complexity-Based Dissimilarity


Description

Implements the formula used by compDist and compDistTSList. Treating complexity as a kind of entropy measure, the formula resembles the normalized Variation of Information described by Meila (2003) and Wu, Xiong and Chen (2009): it relates the joint entropy to the individual entropies. The result lies in the interval [0,1], whereas the formula of Keogh et al. (2007) yields values in (0.5,1]. The measure is symmetric.
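The description does not spell out the formula, so the following is only a sketch of what a normalized Variation of Information analogue on compressed lengths could look like. The function name sketchCompDist, the averaging of the two concatenation lengths as a stand-in for the joint entropy, and the clipping step are all assumptions made for illustration; the package's actual calcCompDist may differ.

```r
# Hypothetical helper, NOT the package's calcCompDist(): a sketch of a
# normalized-variation-of-information analogue on compressed lengths.
sketchCompDist <- function(xLength, yLength, xyLength, yxLength) {
  # Assumption: average both concatenation orders as a stand-in for the
  # joint entropy H(X,Y); this also makes the result symmetric.
  jointLength <- (xyLength + yxLength) / 2
  # Normalized VI analogue: (2 * H(X,Y) - H(X) - H(Y)) / (H(X) + H(Y)).
  rawValue <- (2 * jointLength - xLength - yLength) / (xLength + yLength)
  # Real compressors are imperfect, so clip to the documented range [0,1].
  max(0, min(1, rawValue))
}

sketchCompDist(100, 100, 200, 200)  # concatenation saves nothing: 1
sketchCompDist(100, 100, 100, 100)  # concatenation costs nothing extra: 0
```

Under these assumptions, swapping the two inputs (and with them the two concatenation lengths) leaves the value unchanged, which matches the symmetry stated above.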

Usage

calcCompDist(xLength, yLength, xyLength, yxLength)

Arguments

xLength

Length of the first string/time series after compression.

yLength

Length of the second string/time series after compression.

xyLength

Length of the concatenation of first and second string/time series after compression.

yxLength

Length of the concatenation of second and first string/time series after compression.
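As a rough illustration of where the four arguments come from, the sketch below derives them for two strings with base R's memCompress. The helper compressedLength, the gzip setting, and the sample strings are assumptions chosen for the example; the compressor and string representation used inside the package may be different.

```r
# Hypothetical helper: compressed size in bytes via base R's memCompress().
# The compressor used by the package itself may differ.
compressedLength <- function(s) {
  length(memCompress(charToRaw(s), type = "gzip"))
}

# Two made-up, highly repetitive input strings.
x <- paste(rep("abcabcabc", 50), collapse = "")
y <- paste(rep("xyzzyxxyz", 50), collapse = "")

xLength  <- compressedLength(x)
yLength  <- compressedLength(y)
xyLength <- compressedLength(paste0(x, y))  # first string, then second
yxLength <- compressedLength(paste0(y, x))  # second string, then first

compLengths <- c(xLength = xLength, yLength = yLength,
                 xyLength = xyLength, yxLength = yxLength)
print(compLengths)

# With the package loaded, these values would be passed on as:
# calcCompDist(xLength, yLength, xyLength, yxLength)
```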

Value

The dissimilarity, a numeric value in the range [0,1].

References

Keogh, E., Lonardi, S., Ratanamahatana, C. A., Wei, L., Lee, S.-H. & Handley, J. (2007). Compression-based data mining of sequential data. Data Mining and Knowledge Discovery, 14(1), 99–129.

Li, M., Badger, J. H., Chen, X., Kwong, S., Kearney, P. & Zhang, H. (2001). An information-based sequence distance and its application to whole mitochondrial genome phylogeny. Bioinformatics, 17(2), 149–154.

Meila, M. (2003). Comparing clusterings by the variation of information. In B. Schölkopf & M. K. Warmuth (Eds.), Learning Theory and Kernel Machines: 16th Annual Conference on Learning Theory and 7th Kernel Workshop, COLT/Kernel 2003, Washington, DC, USA, August 24–27, 2003, Proceedings (pp. 173–187). Springer Berlin Heidelberg.

Wu, J., Xiong, H. & Chen, J. (2009). Adapting the right measures for k-means clustering. In Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 877–886). ACM.


Jakob-Bach/FastTSDistances documentation built on May 13, 2019, 1:15 p.m.