FastTSDistances: Fast dissimilarity computations for time series


Dissimilarity based on the compressed lengths of the two individual time series and of their concatenation, as described by Li et al. (2001). The time series are first converted to a SAX representation and then compressed, both following Keogh et al. (2007). As an improvement, the dissimilarity is rescaled to the interval [0,1] (previously (0.5,1]) and made symmetric. Multivariate time series are handled by concatenating their attributes.
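The idea can be sketched in a few lines. The following is a Python illustration, not the package's R implementation. It assumes that both series have already been converted to integer symbols (0 to 255), that zlib stands in for the compressor, that symmetry is obtained by compressing both concatenation orders, and that the rescaling is the linear map from (0.5,1] to [0,1] with clipping. All function names are hypothetical.

```python
import zlib


def compressed_length(symbols):
    """Number of bytes after compressing a sequence of integer symbols (0..255)."""
    return len(zlib.compress(bytes(symbols), 9))


def comp_dissimilarity(sym_x, sym_y):
    """Compression-based dissimilarity of two symbol sequences (hypothetical helper).

    Assumption: the raw ratio C(xy) / (C(x) + C(y)) is made symmetric by
    using both concatenation orders, then mapped linearly onto [0, 1].
    """
    sym_x, sym_y = list(sym_x), list(sym_y)
    cx = compressed_length(sym_x)
    cy = compressed_length(sym_y)
    cxy = compressed_length(sym_x + sym_y)
    cyx = compressed_length(sym_y + sym_x)
    # Symmetric ratio, nominally in (0.5, 1]
    raw = (cxy + cyx) / (2.0 * (cx + cy))
    # Rescale to [0, 1]; clip because real compressors add a fixed overhead
    return min(1.0, max(0.0, 2.0 * raw - 1.0))


# Usage: a strongly periodic sequence compared with itself and with a
# pseudo-random sequence produced by a simple linear congruential generator.
periodic = [0, 1, 2, 3] * 50
state, noisy = 1, []
for _ in range(200):
    state = (state * 1103515245 + 12345) % 2**31
    noisy.append((state >> 16) % 8)

d_same = comp_dissimilarity(periodic, periodic)
d_diff = comp_dissimilarity(periodic, noisy)
```

Because a real compressor adds header and checksum bytes, identical short series will typically yield a small positive value rather than exactly 0 in this sketch.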

compDist(x, y, symbolCount = 8, symbolLimits = NULL)

`x`	First numeric vector/matrix (uni- or multivariate time series).
`y`	Second numeric vector/matrix (uni- or multivariate time series).
`symbolCount`	Number of SAX symbols. The interval boundaries are derived from the standard normal distribution. Alternatively, supply the boundaries directly via `symbolLimits`.
`symbolLimits`	Interval boundaries used to convert the time series to a SAX representation. Should be a monotonically increasing vector starting with -Inf and ending with +Inf. If a value is supplied here, `symbolCount` is ignored.
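How the two symbol parameters interact can be illustrated with a short Python sketch (again, not the package's R code). It assumes that the default boundaries are the equiprobable quantiles of the standard normal distribution, which is the usual SAX convention, that the input is already normalized, and that a value lying exactly on a boundary falls into the upper interval. The function names are hypothetical.

```python
from bisect import bisect_right
from statistics import NormalDist


def sax_limits(symbol_count):
    """Boundaries splitting the standard normal into equiprobable intervals."""
    inner = [NormalDist().inv_cdf(i / symbol_count) for i in range(1, symbol_count)]
    return [float("-inf")] + inner + [float("inf")]


def to_symbols(series, symbol_count=8, symbol_limits=None):
    """Map each value to the index of the interval it falls into.

    If symbol_limits is given, symbol_count is ignored, mirroring the
    documented behaviour of the two arguments.
    """
    limits = list(symbol_limits) if symbol_limits is not None else sax_limits(symbol_count)
    if limits[0] != float("-inf") or limits[-1] != float("inf"):
        raise ValueError("limits must start with -inf and end with +inf")
    if any(a >= b for a, b in zip(limits, limits[1:])):
        raise ValueError("limits must be monotonically increasing")
    inner = limits[1:-1]  # only the finite boundaries are needed for the lookup
    return [bisect_right(inner, value) for value in series]


# Usage: four symbols from the default boundaries, then two custom intervals.
default_symbols = to_symbols([-1.0, -0.1, 0.1, 1.0], symbol_count=4)
custom_symbols = to_symbols([-2.0, 3.0], symbol_limits=[float("-inf"), 0.0, float("inf")])
```

With custom boundaries the number of symbols is simply the number of intervals, that is, one less than the length of the boundary vector.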

The dissimilarity as a numeric value in the range [0,1].

Keogh, E., Lonardi, S., Ratanamahatana, C. A., Wei, L., Lee, S.-H. & Handley, J. (2007). Compression-based data mining of sequential data. Data Mining and Knowledge Discovery, 14(1), 99–129.

Li, M., Badger, J. H., Chen, X., Kwong, S., Kearney, P. & Zhang, H. (2001). An information-based sequence distance and its application to whole mitochondrial genome phylogeny. Bioinformatics, 17(2), 149–154.

Other compression-based distances: compDistTSList

Jakob-Bach/FastTSDistances documentation built on May 13, 2019, 1:15 p.m.