TADPole: TADPole clustering
In dtwclust: Time Series Clustering Along with Optimizations for the Dynamic Time Warping Distance

Description

Time-series Anytime Density Peaks Clustering as proposed by Begum et al. (2015).

Usage

TADPole(
  data,
  k = 2L,
  dc,
  window.size,
  error.check = TRUE,
  lb = "lbk",
  trace = FALSE
)

tadpole(
  data,
  k = 2L,
  dc,
  window.size,
  error.check = TRUE,
  lb = "lbk",
  trace = FALSE
)

Arguments

`data`	A matrix or data frame where each row is a time series, or a list where each element is a time series. Multivariate series are not supported.
`k`	The number of desired clusters. Can be a vector with several values.
`dc`	The cutoff distance(s). Can be a vector with several values.
`window.size`	Window size constraint for DTW (Sakoe-Chiba). See details.
`error.check`	Logical indicating whether the function should try to detect inconsistencies and give more informative error messages. Also used internally to avoid repeating checks.
`lb`	Which lower bound to use, "lbk" for `lb_keogh()` or "lbi" for `lb_improved()`.
`trace`	Logical flag. If `TRUE`, more output regarding the progress is printed to screen.

Details

This function can be called either directly or through tsclust().
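As a minimal sketch of the second route, assuming the dtwclust package is installed, the call below goes through tsclust() with type = "tadpole" and passes dc and window.size via tadpole_control(). The synthetic series and the values chosen for k, dc and window.size are illustrative assumptions, not recommendations.

```r
library(dtwclust)

# Synthetic data (an assumption for illustration): two groups of
# univariate, equal-length series stored in a list
set.seed(42)
tt <- seq(0, 2 * pi, length.out = 50)
series <- c(
  lapply(1:5, function(i) sin(tt) + rnorm(50, sd = 0.1)),
  lapply(1:5, function(i) cos(tt) + rnorm(50, sd = 0.1))
)

# TADPole through the general tsclust() interface
fit <- tsclust(
  series,
  type = "tadpole",
  k = 2L,
  control = tadpole_control(dc = 1.5, window.size = 5L)
)

# Cluster membership of each series
fit@cluster
```

Going through tsclust() returns a clustering object rather than the plain list described under Value, which may be more convenient when the package's plotting or validation helpers are needed.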

TADPole clustering adopts the density peaks clustering framework and adapts it to time series clustering with DTW. See the cited article for the details of the algorithm.

Because of the way the algorithm works, it can be considered a kind of Partitioning Around Medoids (PAM). This means that the cluster centroids are always elements of the data. However, this algorithm is deterministic: for the same data, the result is fully determined by the parameters, notably the value of dc.

The algorithm first uses DTW's upper and lower bounds (Euclidean distance and LB_Keogh, respectively) to find series with many close neighbors (in DTW space). Any series closer than the cutoff distance (dc) is considered a neighbor. With this information, the algorithm then tries to prune as many DTW calculations as possible in order to accelerate the clustering procedure. The series that lie in dense areas (i.e. that have many neighbors) are taken as cluster centroids.
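The density idea described above can be sketched in base R. This is a simplified illustration under stated assumptions, not the package's implementation: Euclidean distance stands in for DTW, no pruning is done, and centroids are picked with a common density-peaks heuristic (largest product of density and distance to the nearest denser series). The data and the value of dc are made up for the example.

```r
# Simplified density-peaks sketch; Euclidean distance stands in for DTW
set.seed(1)
tt <- seq(0, 2 * pi, length.out = 30)
group_a <- t(replicate(5, sin(tt) + rnorm(30, sd = 0.1)))
group_b <- t(replicate(5, cos(tt) + rnorm(30, sd = 0.1)))
series <- rbind(group_a, group_b)   # each row is a series

d <- as.matrix(dist(series))        # pairwise distances
dc <- 1.5                           # hypothetical cutoff distance
n <- nrow(d)

# Local density: number of other series closer than dc
rho <- rowSums(d < dc) - 1          # subtract 1 to exclude the series itself

# Delta: distance to the nearest series that ranks higher in density
ord <- order(rho, decreasing = TRUE)
delta <- numeric(n)
nearest_denser <- integer(n)
delta[ord[1]] <- max(d[ord[1], ])   # convention for the densest series
for (j in seq_len(n)[-1]) {
  i <- ord[j]
  denser <- ord[seq_len(j - 1)]
  nn <- denser[which.min(d[i, denser])]
  delta[i] <- d[i, nn]
  nearest_denser[i] <- nn
}

# Centroids: series that are both dense and far from denser series
k <- 2L
centroids <- order(rho * delta, decreasing = TRUE)[seq_len(k)]

# Assignment in decreasing density order: each remaining series takes
# the label of its nearest denser neighbor (assumes the densest series
# is among the centroids, which holds for this data)
cl <- integer(n)
cl[centroids] <- seq_len(k)
for (i in ord) {
  if (cl[i] == 0L) cl[i] <- cl[nearest_denser[i]]
}
cl
```

In this sketch the centroids are indices into the data, which mirrors the PAM-like property noted earlier: a centroid is always one of the input series.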

The algorithm relies on the DTW bounds, which are only defined for univariate time series of equal length.
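To make the role of the bounds concrete, the following base R sketch computes an LB_Keogh-style lower bound and the Euclidean upper bound for two univariate series of equal length. The helper lb_keogh_sketch() is hypothetical and written only for this illustration (it uses the L2 norm); it is not the package's lb_keogh() function.

```r
# Hypothetical helper: LB_Keogh-style lower bound with an L2 norm
lb_keogh_sketch <- function(query, candidate, window.size) {
  stopifnot(length(query) == length(candidate))
  n <- length(query)
  upper <- lower <- numeric(n)
  for (i in seq_len(n)) {
    # Envelope of the query within the centered window around i
    idx <- max(1L, i - window.size):min(n, i + window.size)
    upper[i] <- max(query[idx])
    lower[i] <- min(query[idx])
  }
  # Only the parts of the candidate outside the envelope contribute
  above <- pmax(candidate - upper, 0)
  below <- pmax(lower - candidate, 0)
  sqrt(sum(above^2 + below^2))
}

set.seed(7)
x <- cumsum(rnorm(40))
y <- cumsum(rnorm(40))

lb <- lb_keogh_sketch(x, y, window.size = 4L)  # lower bound
ed <- sqrt(sum((x - y)^2))                     # Euclidean distance (upper bound)

c(lower = lb, upper = ed)
```

Both quantities compare the series point by point (or envelope by point), which is why they require the same number of observations in each series.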

Parallelization is supported in the following way:

For multiple dc values, multi-processing with foreach::foreach() is used.
The internal distance calculations use multi-threading through the RcppParallel package.

The windowing constraint uses a centered window. The calculations expect a value in window.size that represents the distance between the point considered and one of the edges of the window. Therefore, if, for example, window.size = 10, the warping for an observation x_i considers the points between x_{i-10} and x_{i+10}, resulting in 2(10) + 1 = 21 observations falling within the window.
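The window arithmetic can be checked with a few lines of base R. The series length n and the position i are arbitrary values assumed for the illustration; near the ends of a series the window is truncated and contains fewer points.

```r
window.size <- 10L
n <- 100L   # hypothetical series length
i <- 50L    # hypothetical position of the observation x_i

# Indices covered by the centered window around i, clipped to the series
idx <- max(1L, i - window.size):min(n, i + window.size)

range(idx)   # 40 60
length(idx)  # 21, i.e. 2 * window.size + 1
```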

Value

A list with:

cl: Cluster indices.
centroids: Indices of the centroids.
distCalcPercentage: Percentage of distance calculations that were actually performed.

For multiple k/dc values, a list of lists is returned, each internal list having the aforementioned elements.
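A minimal sketch of a direct call and of inspecting the returned list follows, assuming dtwclust is installed. The synthetic data and the values of k, dc and window.size are assumptions chosen for illustration only.

```r
library(dtwclust)

# Synthetic data (an assumption for illustration): ten univariate
# series of equal length, stored as a list
set.seed(42)
tt <- seq(0, 2 * pi, length.out = 50)
series <- c(
  lapply(1:5, function(i) sin(tt) + rnorm(50, sd = 0.1)),
  lapply(1:5, function(i) cos(tt) + rnorm(50, sd = 0.1))
)

res <- TADPole(series, k = 2L, dc = 1.5, window.size = 5L)

res$cl                  # cluster index of each series
res$centroids           # positions in `series` of the chosen centroids
res$distCalcPercentage  # share of DTW computations actually performed

# The centroids themselves are elements of the data
centroid_series <- series[res$centroids]
```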

Parallel Computing

Please note that running tasks in parallel does not guarantee faster computations. The overhead introduced is sometimes too large, and it's better to run tasks sequentially.

This function uses the RcppParallel package for parallelization. It uses all available threads by default (see RcppParallel::defaultNumThreads()), but this can be changed by the user with RcppParallel::setThreadOptions().
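For instance, the number of threads could be inspected and limited as sketched below before calling the function. The value 2 is an arbitrary assumption; a suitable number depends on the machine and the workload.

```r
library(RcppParallel)

# Number of threads RcppParallel would use by default on this machine
n_default <- defaultNumThreads()
n_default

# Limit subsequent RcppParallel computations to 2 threads
setThreadOptions(numThreads = 2L)

# ... call TADPole() or tsclust() here ...

# Restore the default behavior afterwards
setThreadOptions(numThreads = "auto")
```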

An exception to the above is when it is called within a foreach parallel loop made by dtwclust. If the parallel workers do not have the number of threads explicitly specified, this function will default to 1 thread per worker. See the parallelization vignette for more information: browseVignettes("dtwclust").

References

Begum N, Ulanova L, Wang J and Keogh E (2015). “Accelerating Dynamic Time Warping Clustering with a Novel Admissible Pruning Strategy.” In Conference on Knowledge Discovery and Data Mining, series KDD '15. ISBN 978-1-4503-3664-2/15/08, doi:10.1145/2783258.2783286.

dtwclust documentation built on Sept. 11, 2024, 9:07 p.m.
