
View source: R/CENTROIDS-dba.R

A global averaging method for time series under DTW (Petitjean, Ketterlin and Gancarski 2011).

```
DBA(X, centroid = NULL, ..., window.size = NULL, norm = "L1",
    max.iter = 20L, delta = 0.001, error.check = TRUE, trace = FALSE,
    mv.ver = "by-variable")

dba(X, centroid = NULL, ..., window.size = NULL, norm = "L1",
    max.iter = 20L, delta = 0.001, error.check = TRUE, trace = FALSE,
    mv.ver = "by-variable")
```

`X` |
A matrix or data frame where each row is a time series, or a list where each element is a time series. Multivariate series should be provided as a list of matrices where time spans the rows and the variables span the columns of each matrix. |

`centroid` |
Optionally, a time series to use as reference. Defaults to a random series of |

`...` |
Further arguments for |

`window.size` |
Window constraint for the DTW calculations. |

`norm` |
Norm for the local cost matrix of DTW. Either "L1" for Manhattan distance or "L2" for Euclidean distance. |

`max.iter` |
Maximum number of iterations allowed. |

`delta` |
At iteration |

`error.check` |
Logical indicating whether the function should try to detect inconsistencies and give more informative error messages. Also used internally to avoid repeating checks. |

`trace` |
If |

`mv.ver` |
Multivariate version to use. See below. |
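
To make the `norm` argument more concrete, the following base R sketch builds a local cost matrix between two small multivariate series under both norms. The helper `local_cost()` is hypothetical and only illustrates the Manhattan and Euclidean definitions named above; it is not the package's internal code, and any further scaling the package may apply is not covered here. For univariate series the two norms give the same local costs.

```r
# Hypothetical helper, not part of dtwclust: pairwise local costs between
# the rows (time points) of two multivariate series.
local_cost <- function(x, y, norm = c("L1", "L2")) {
  norm <- match.arg(norm)
  cost <- matrix(0, nrow(x), nrow(y))
  for (i in seq_len(nrow(x))) {
    for (j in seq_len(nrow(y))) {
      d <- x[i, ] - y[j, ]
      # L1: sum of absolute differences; L2: Euclidean length of the difference
      cost[i, j] <- if (norm == "L1") sum(abs(d)) else sqrt(sum(d^2))
    }
  }
  cost
}

# Two toy series: time spans the rows, variables span the columns
x <- rbind(c(0, 0), c(3, 4))
y <- rbind(c(0, 0), c(3, 0))

lcm_l1 <- local_cost(x, y, "L1")
lcm_l2 <- local_cost(x, y, "L2")
```

The two matrices differ only where the observations differ in more than one variable, which is where the choice of norm can change the DTW alignment.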

This function tries to find the optimum average series between a group of time series in DTW space. Refer to the cited article for specific details on the algorithm.
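
As a rough illustration of the idea, the base R sketch below aligns each series to the current centroid with DTW, then replaces every centroid point by the mean of all observations aligned to it, and repeats. The functions `dtw_path()` and `dba_sketch()` are hypothetical names, the sketch handles univariate series without a window only, and the stopping rule (largest pointwise change below `delta`) is an assumption made for this example rather than a statement of what the package does.

```r
# Hypothetical helper: plain DTW with an L1 local cost and backtracking.
dtw_path <- function(x, y) {
  n <- length(x)
  m <- length(y)
  lcm <- abs(outer(x, y, "-"))            # local cost matrix
  acc <- matrix(Inf, n + 1L, m + 1L)      # accumulated cost, padded border
  acc[1L, 1L] <- 0
  for (i in seq_len(n)) {
    for (j in seq_len(m)) {
      acc[i + 1L, j + 1L] <- lcm[i, j] +
        min(acc[i, j], acc[i, j + 1L], acc[i + 1L, j])
    }
  }
  # Walk back from the last cell to the first to recover the alignment
  i <- n
  j <- m
  index1 <- i
  index2 <- j
  while (i > 1L || j > 1L) {
    # Predecessors of cell (i, j): diagonal, up, left
    move <- which.min(c(acc[i, j], acc[i, j + 1L], acc[i + 1L, j]))
    if (move == 1L) {
      i <- i - 1L
      j <- j - 1L
    } else if (move == 2L) {
      i <- i - 1L
    } else {
      j <- j - 1L
    }
    index1 <- c(i, index1)
    index2 <- c(j, index2)
  }
  list(distance = acc[n + 1L, m + 1L], index1 = index1, index2 = index2)
}

# Hypothetical helper: iterative barycenter averaging on top of dtw_path().
dba_sketch <- function(X, centroid, max.iter = 20L, delta = 1e-3) {
  m <- length(centroid)
  for (iter in seq_len(max.iter)) {
    sums <- numeric(m)
    counts <- numeric(m)
    for (x in X) {
      path <- dtw_path(x, centroid)
      # Add up the values of x aligned to each centroid position
      sums <- sums + rowsum(x[path$index1], path$index2)[, 1L]
      counts <- counts + tabulate(path$index2, nbins = m)
    }
    new_centroid <- unname(sums / counts)
    # Assumed stopping rule for this sketch only
    converged <- all(abs(new_centroid - centroid) < delta)
    centroid <- new_centroid
    if (converged) break
  }
  centroid
}

series <- list(c(0, 1, 2, 1, 0),
               c(0, 0, 1, 2, 1, 0),
               c(0, 1, 2, 2, 1, 0, 0))
avg <- dba_sketch(series, centroid = series[[1L]])
```

The result has the same length as the reference series passed as `centroid`, which is why the choice of reference matters in practice.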

If a given series reference is provided in `centroid`, the algorithm should always converge to the same result provided the elements of `X` keep the same values, although their order may change.

The windowing constraint uses a centered window. The calculations expect a value in `window.size` that represents the distance between the point considered and one of the edges of the window. Therefore, if, for example, `window.size = 10`, the warping for an observation *x_i* considers the points between *x_{i-10}* and *x_{i+10}*, resulting in `10(2) + 1 = 21` observations falling within the window.
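
The arithmetic above can be checked with a few lines of base R. The helper `window_indices()` is hypothetical and assumes two series of equal length `n`; it only lists which positions fall inside a centered window and is not how the package implements the constraint.

```r
# Hypothetical helper: positions inside a centered window around point i,
# clipped to the valid range 1..n.
window_indices <- function(i, window.size, n) {
  seq.int(max(1L, i - window.size), min(n, i + window.size))
}

# Point in the middle of a series of length 100
idx <- window_indices(i = 50L, window.size = 10L, n = 100L)
range(idx)    # first and last position inside the window
length(idx)   # 2 * window.size + 1 positions

# Near the start of the series the window is cut off at position 1
idx_edge <- window_indices(i = 3L, window.size = 10L, n = 100L)
length(idx_edge)
```

Away from the edges the window always covers `2 * window.size + 1` points; near the start or end of a series it covers fewer.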

The average time series.

Please note that running tasks in parallel does **not** guarantee faster computations. The
overhead introduced is sometimes too large, and it's better to run tasks sequentially.

This function uses the `RcppParallel` package for parallelization. It uses all available threads by default (see `RcppParallel::defaultNumThreads()`), but this can be changed by the user with `RcppParallel::setThreadOptions()`.
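
A minimal sketch of adjusting the thread count is shown below. It only uses the two `RcppParallel` functions mentioned above; the guard with `requireNamespace()` and the variable names are additions for this example, and the placeholder comment marks where a call to `DBA()` would go.

```r
# Only touch thread settings if RcppParallel is installed
has_rcpp_parallel <- requireNamespace("RcppParallel", quietly = TRUE)

if (has_rcpp_parallel) {
  # Number of threads that would be used by default
  n_default <- RcppParallel::defaultNumThreads()

  # Restrict subsequent multi-threaded calculations to a single thread
  RcppParallel::setThreadOptions(numThreads = 1L)

  # ... call DBA() here ...

  # Go back to automatic thread selection afterwards
  RcppParallel::setThreadOptions(numThreads = "auto")
}
```

Comparing run times with one thread and with the default setting is a simple way to see whether multi-threading actually helps for a given data set.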

An exception to the above is when this function is called within a `foreach` parallel loop **made by dtwclust**. If the parallel workers do not have the number of threads explicitly specified, this function will default to 1 thread per worker. See the parallelization vignette for more information (`browseVignettes("dtwclust")`).

This function appears to be very sensitive to numerical inaccuracies if multi-threading is used
in a **32 bit** installation. In such systems, consider limiting calculations to 1 thread.

There are currently 2 versions of DBA implemented for multivariate series:

- If `mv.ver = "by-variable"`, then each variable of each series in `X` and `centroid` is extracted, and the univariate version of the algorithm is applied to each set of variables, binding the results by column. Therefore, the DTW backtracking is different for each variable.
- If `mv.ver = "by-series"`, then all variables are considered at the same time, so the DTW backtracking is computed based on each multivariate series as a whole. This version was implemented in version 4.0.0 of dtwclust, and it is faster, but not necessarily more correct.

The indices of the DTW alignment are obtained by calling `dtw_basic()` with `backtrack = TRUE`.

Petitjean F, Ketterlin A and Gancarski P (2011). “A global averaging method for dynamic time
warping, with applications to clustering.” *Pattern Recognition*, **44**(3), pp. 678-693. ISSN
0031-3203, http://dx.doi.org/10.1016/j.patcog.2010.09.013,
http://www.sciencedirect.com/science/article/pii/S003132031000453X.

```
# Load the package that provides DBA(), dtw_basic() and the sample data
library(dtwclust)

# Sample data
data(uciCT)

# Obtain an average for the first 5 time series
dtw_avg <- DBA(CharTraj[1:5], CharTraj[[1]], trace = TRUE)

# Plot
matplot(do.call(cbind, CharTraj[1:5]), type = "l")
points(dtw_avg)

# Change the provided order
dtw_avg2 <- DBA(CharTraj[5:1], CharTraj[[1]], trace = TRUE)

# Same result?
all.equal(dtw_avg, dtw_avg2)

## Not run:
# ====================================================================================
# Multivariate versions
# ====================================================================================

# sample centroid reference
cent <- CharTrajMV[[3L]]
# sample series
x <- CharTrajMV[[1L]]
# sample set of series
X <- CharTrajMV[1L:5L]

# the by-series version does something like this for each series and the centroid
alignment <- dtw_basic(x, cent, backtrack = TRUE)
# alignment$index1 and alignment$index2 indicate how to map x to cent (row-wise)

# the by-variable version treats each variable separately
alignment1 <- dtw_basic(x[,1L], cent[,1L], backtrack = TRUE)
alignment2 <- dtw_basic(x[,2L], cent[,2L], backtrack = TRUE)
alignment3 <- dtw_basic(x[,3L], cent[,3L], backtrack = TRUE)

# effectively doing:
X1 <- lapply(X, function(x) { x[,1L] })
X2 <- lapply(X, function(x) { x[,2L] })
X3 <- lapply(X, function(x) { x[,3L] })

dba1 <- dba(X1, cent[,1L])
dba2 <- dba(X2, cent[,2L])
dba3 <- dba(X3, cent[,3L])

new_cent <- cbind(dba1, dba2, dba3)

# sanity check
newer_cent <- dba(X, cent, mv.ver = "by-variable")
all.equal(newer_cent, new_cent, check.attributes = FALSE) # ignore names
## End(Not run)
```

```
Loading required package: proxy
Attaching package: 'proxy'
The following objects are masked from 'package:stats':
as.dist, dist
The following object is masked from 'package:base':
as.matrix
Loading required package: clue
Loading required package: dtw
Loaded dtw v1.18-1. See ?dtw for help, citation("dtw") for use in publication.
Loading required package: ggplot2
dtwclust:
Setting random number generator to L'Ecuyer-CMRG (see RNGkind()).
To read the included vignettes type: browseVignettes("dtwclust").
Please see news(package = "dtwclust") for important information.
DBA Iteration: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
11, 12, 13, 14 - Converged!
DBA Iteration: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
11, 12, 13, 14 - Converged!
[1] TRUE
[1] TRUE
```
