shape_extraction: Shape average of several time series (in dtwclust: Time Series Clustering Along with Optimizations for the Dynamic Time Warping Distance)

 shape_extraction R Documentation

Shape average of several time series

Description

Time-series shape extraction based on optimal alignments as proposed by Paparrizos and Gravano (2015) for the k-Shape clustering algorithm.

Usage

```
shape_extraction(X, centroid = NULL, znorm = FALSE, ..., error.check = TRUE)
```

Arguments

`X`: A matrix or data frame where each row is a time series, or a list where each element is a time series. Multivariate series should be provided as a list of matrices where time spans the rows and the variables span the columns.

`centroid`: Optionally, a time series to use as reference. Defaults to a random series of `X` if `NULL`. For multivariate series, this should be a matrix with the same characteristics as the matrices in `X`. It will be z-normalized.

`znorm`: Logical flag. Should z-scores be calculated for `X` before processing?

`...`: Further arguments for `zscore()`.

`error.check`: Logical indicating whether the function should try to detect inconsistencies and give more informative error messages. Also used internally to avoid repeating checks.

Details

This works only if the series are z-normalized, because the output is z-normalized as well. Either normalize `X` beforehand or set `znorm = TRUE`.
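As a rough illustration of the two ways to meet this requirement, the sketch below either normalizes the series up front with `zscore()` or passes the raw series together with `znorm = TRUE`. It assumes the dtwclust package is installed and does nothing otherwise; the variable names `C_pre` and `C_flag` are only illustrative.

```r
# Minimal sketch, assuming dtwclust is installed; skipped otherwise.
if (requireNamespace("dtwclust", quietly = TRUE)) {
  library(dtwclust)
  data(uciCT)  # provides CharTraj

  # Option 1: z-normalize first, then extract the shape
  C_pre <- shape_extraction(zscore(CharTraj[1:5]))

  # Option 2: pass the raw series and request normalization via znorm
  C_flag <- shape_extraction(CharTraj[1:5], znorm = TRUE)

  # Inspect both centroids
  print(summary(C_pre))
  print(summary(C_flag))
}
```

If the series differ in length, each call picks its reference series at random, so the two centroids are not expected to be identical.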

The resulting centroid will have the same length as `centroid` if provided. Otherwise, there are two possibilities: if all series from `X` have the same length, all of them will be used as-is, and the output will have the same length as the series; if series have different lengths, a series will be chosen at random and used as reference. The output series will then have the same length as the chosen series.
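The length rules above can be checked with a short sketch. It assumes the dtwclust package is installed and is skipped otherwise; `ref`, `C_ref` and `C_auto` are illustrative names, and using the first series as reference is an arbitrary choice.

```r
# Minimal sketch, assuming dtwclust is installed; skipped otherwise.
if (requireNamespace("dtwclust", quietly = TRUE)) {
  library(dtwclust)
  data(uciCT)  # provides CharTraj
  X <- zscore(CharTraj[1:5])

  # With an explicit reference, the output should match its length
  ref <- X[[1]]
  C_ref <- shape_extraction(X, centroid = ref)
  print(c(reference = length(ref), centroid = length(C_ref)))

  # Without a reference, the output length should be that of one of the series
  C_auto <- shape_extraction(X)
  print(length(C_auto) %in% lengths(X))
}
```

Passing `centroid` explicitly is the simplest way to get a reproducible output length when the series in `X` have different lengths.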

The centroid computation is cast as an optimization problem: the maximization of a Rayleigh quotient. It depends on the `SBD()` algorithm. See the cited article for more details.
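To make the optimization more concrete, here is a simplified base-R sketch of the idea for univariate series of equal length. It is not the dtwclust implementation: the helpers `shift_series()`, `align_to()` and `shape_sketch()` are hypothetical names, and the alignment uses a brute-force cross-correlation search as a stand-in for `SBD()`. After aligning every series to the reference, the centroid is taken as the leading eigenvector of `M = t(Q) %*% S %*% Q`, which maximizes the Rayleigh quotient `(t(v) %*% M %*% v) / (t(v) %*% v)`.

```r
# Illustrative base-R sketch; helper names are hypothetical, not dtwclust API.

# Shift y by s positions, padding with zeros (s > 0 shifts to the right).
shift_series <- function(y, s) {
  m <- length(y)
  if (s > 0) {
    c(rep(0, s), y[seq_len(m - s)])
  } else if (s < 0) {
    c(y[(1 - s):m], rep(0, -s))
  } else {
    y
  }
}

# Align y to x with the shift that maximizes their cross-correlation.
# Brute force over all shifts; SBD implementations typically use the FFT.
align_to <- function(x, y) {
  m <- length(x)
  shifts <- seq(-(m - 1), m - 1)
  cc <- vapply(shifts, function(s) sum(x * shift_series(y, s)), numeric(1))
  shift_series(y, shifts[which.max(cc)])
}

# X: matrix with one z-normalized series per row; ref: reference series.
shape_sketch <- function(X, ref) {
  A <- t(apply(X, 1, function(y) align_to(ref, y)))  # aligned series, one per row
  m <- ncol(A)
  S <- crossprod(A)                                  # t(A) %*% A
  Q <- diag(m) - matrix(1 / m, m, m)                 # centering matrix
  M <- t(Q) %*% S %*% Q
  v <- eigen(M, symmetric = TRUE)$vectors[, 1]       # maximizer of the Rayleigh quotient
  # An eigenvector's sign is arbitrary: keep the orientation that
  # correlates positively with the first aligned series.
  if (sum(A[1, ] * v) < 0) v <- -v
  (v - mean(v)) / sd(v)                              # z-normalize the result
}

# Toy data: three phase-shifted sine waves, z-normalized row by row.
grid <- seq(0, 2 * pi, length.out = 50)
X_toy <- rbind(sin(grid), sin(grid + 0.3), sin(grid - 0.2))
X_toy <- t(apply(X_toy, 1, function(x) (x - mean(x)) / sd(x)))

ctr <- shape_sketch(X_toy, ref = X_toy[1, ])
```

The eigenvector route avoids iterative optimization: for a symmetric matrix, the Rayleigh quotient is maximized by the eigenvector with the largest eigenvalue.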

Value

Centroid time series (z-normalized).

References

Paparrizos J and Gravano L (2015). “k-Shape: Efficient and Accurate Clustering of Time Series.” In Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, series SIGMOD '15, pp. 1855-1870. ISBN 978-1-4503-2758-9, doi: 10.1145/2723372.2737793.

See Also

`SBD()`, `zscore()`

Examples

```
# Sample data
data(uciCT)

# Normalize desired subset
X <- zscore(CharTraj[1:5])

# Obtain centroid series
C <- shape_extraction(X)

# Result
matplot(do.call(cbind, X),
        type = "l", col = 1:5)
points(C)

```

dtwclust documentation built on March 7, 2023, 7:49 p.m.