sdtw: Soft-DTW distance

Description

Soft-DTW distance measure as proposed in Cuturi and Blondel (2017).

Usage

sdtw(x, y, gamma = 0.01, ..., error.check = TRUE)

Arguments

x, y

Time series. Multivariate series must have time spanning the rows and variables spanning the columns.

gamma

Positive regularization parameter, with lower values resulting in less smoothing.

...

Currently ignored.

error.check

Logical indicating whether the function should try to detect inconsistencies and give more informative error messages. Also used internally to avoid repeating checks.

Details

Unlike other distances, soft-DTW can return negative values, and sdtw(x, x) is not always equal to zero. Like DTW, soft-DTW does not fulfill the triangle inequality, but it is always symmetric.
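
The properties above follow from the soft-minimum recursion in Cuturi and Blondel (2017). The following base-R sketch is an illustration only, not the code dtwclust runs: it assumes univariate series and a squared Euclidean local cost, and the names softmin and soft_dtw_sketch are hypothetical.

```r
# Soft minimum: -gamma * log(sum(exp(-v / gamma))), shifted by min(v) for stability.
softmin <- function(v, gamma) {
  m <- min(v)
  m - gamma * log(sum(exp(-(v - m) / gamma)))
}

# Illustrative soft-DTW for univariate series (assumed squared Euclidean local cost).
soft_dtw_sketch <- function(x, y, gamma = 0.01) {
  n <- length(x)
  m <- length(y)
  cost <- outer(x, y, function(a, b) (a - b)^2)
  # Accumulated cost matrix with an extra border row and column set to Inf.
  R <- matrix(Inf, n + 1, m + 1)
  R[1, 1] <- 0
  for (i in seq_len(n)) {
    for (j in seq_len(m)) {
      prev <- c(R[i, j], R[i, j + 1], R[i + 1, j])
      R[i + 1, j + 1] <- cost[i, j] + softmin(prev, gamma)
    }
  }
  R[n + 1, m + 1]
}

x <- c(0, 1, 2, 1)
y <- c(0, 2, 1)

d_xy <- soft_dtw_sketch(x, y, gamma = 1)
d_yx <- soft_dtw_sketch(y, x, gamma = 1)
d_xx <- soft_dtw_sketch(x, x, gamma = 1)

isTRUE(all.equal(d_xy, d_yx))  # symmetry, up to floating-point error
d_xx < 0                       # self-distance is negative for this gamma
```

Because the soft minimum is never larger than the hard minimum, and the diagonal alignment of a series with itself already costs zero, every additional alignment path pushes the result below zero. Smaller values of gamma shrink this effect, which matches the description of gamma as controlling the amount of smoothing.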

Value

The soft-DTW distance.

Proxy version

The version registered with proxy::dist() is custom (loop = FALSE in proxy::pr_DB). The custom function handles multi-threaded parallelization directly with RcppParallel. It uses all available threads by default (see RcppParallel::defaultNumThreads()), but this can be changed by the user with RcppParallel::setThreadOptions().
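
As a hedged sketch of the thread configuration described above: the example below assumes dtwclust is installed and that the distance is registered with proxy under the name "sdtw"; the toy series and the choice of 2 threads are arbitrary.

```r
library(dtwclust)

# Number of threads RcppParallel would use by default on this machine.
RcppParallel::defaultNumThreads()

# Limit subsequent RcppParallel computations to 2 threads.
RcppParallel::setThreadOptions(numThreads = 2L)

# Toy data: four random-walk series of equal length.
set.seed(42)
series <- lapply(1:4, function(i) cumsum(rnorm(30)))

# Passing the list as both x and y requests the full cross-distance matrix.
D <- proxy::dist(series, series, method = "sdtw", gamma = 0.1)
dim(D)

# Restore the default thread setting.
RcppParallel::setThreadOptions(numThreads = "auto")
```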

An exception to the above is when it is called within a foreach parallel loop made by dtwclust. If the parallel workers do not have the number of threads explicitly specified, this function will default to 1 thread per worker. See the parallelization vignette for more information: browseVignettes("dtwclust").

It also includes symmetric optimizations to calculate only half a distance matrix when appropriate; for this, only one list of series should be provided in x. Starting with version 6.0.0, this optimization means that the function returns an array with the lower triangular values of the distance matrix, similar to what stats::dist() does; see DistmatLowerTriangular for a helper to access elements as if it were a normal matrix. If you want to avoid this optimization, call proxy::dist() by giving the same list of series in both x and y.

Note that, because this distance is not always zero when a series is compared against itself, this optimization is likely problematic for soft-DTW: many functions will handle the dist object as if it had only zeroes in the diagonal. An exception is tsclust() when using partitional clustering with PAM centroids, in which case the actual diagonal values are calculated and considered internally.
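
To inspect the diagonal directly, one option is to request the full matrix by giving the same list in both x and y, as described above. This is a minimal sketch that assumes dtwclust is installed and that the proxy entry is named "sdtw"; the toy series and gamma = 1 are arbitrary choices made so the effect is easy to see.

```r
library(dtwclust)

# Toy data: three random-walk series.
set.seed(123)
series <- lapply(1:3, function(i) cumsum(rnorm(25)))

# Same list in x and y: no symmetric optimization, so the diagonal is computed.
D_full <- proxy::dist(series, series, method = "sdtw", gamma = 1)

# Self-distances; for soft-DTW these are generally not zero.
self_dist <- diag(D_full)
self_dist
```

Functions that assume a zero diagonal would silently discard these values, which is the risk that the note above describes.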

References

Cuturi, M., & Blondel, M. (2017). Soft-DTW: a Differentiable Loss Function for Time-Series. arXiv preprint arXiv:1703.01541.


dtwclust documentation built on Sept. 11, 2024, 9:07 p.m.