dtw: Dynamic Time Warping

View source: R/dtw.R

dtwR Documentation

Dynamic Time Warping

Description

Calculate the DTW distance, cost matrices and direction matrices including the warping path two multivariate time series.

Usage

dtw(Q, C, dist_method = c("norm1", "norm2", "norm2_square"), 
    step_pattern = c("symmetric2", "symmetric1"), ws = NULL,
    return_cm = FALSE,
    return_diffM = FALSE,
    return_wp = FALSE,
    return_diffp = FALSE,
    return_QC = FALSE)
                     
cm(Q, C, dist_method = c("norm1", "norm2", "norm2_square"), 
   ws = NULL, ...)
   
## S3 method for class 'idtw'
print(x, digits = getOption("digits"), ...)

## S3 method for class 'idtw'
summary(object, ...)

is.idtw(x)

Arguments

Q

Query time series. Q needs to be one of the following: (1) a one dimensional vector, (2) a matrix where each row is one observations and each column is one dimension of the time series, or (3) a matrix of differences/ costs (diffM, cm). If Q and C are matrices they need to have the same number of columns.

C

Candidate time series. C needs to be one of the following: (1) a one dimensional vector, (2) a matrix where each row is one observations and each column is one dimension of the time series, or (3) if Q is a matrix of differences or costs C needs to be the respective character string 'diffM' or 'cm'.

dist_method

character, describes the method of distance measure for multivariate time series (this parameter is ignored for univariate time series). Currently supported methods are 'norm1' (default, is the Manhattan distance), 'norm2' (is the Euclidean distance) and 'norm2_square'. For the function cm() the parameter dist_method can also be a user defined distance function (see details and examples).

step_pattern

character, describes the step pattern. Currently implemented are the patterns symmetric1 and symmetric2, see details.

ws

integer, describes the window size for the sakoe chiba window. If NULL, then no window is applied. (default = NULL)

return_cm

logical, if TRUE then the Matrix of costs (the absolute value) is returned. (default = FALSE)

return_diffM

logical, if TRUE then the Matrix of differences (not the absolute value) is returned. (default = FALSE)

return_wp

logical, if TRUE then the warping path is returned. (default = FALSE) If return_diffp == TRUE, then return_wp is set to TRUE as well.

return_diffp

logical, if TRUE then the path of differences (not the absolute value) is returned. (default = FALSE)

return_QC

logical, if TRUE then the input vectors Q and C are appended to the returned list. This is useful for the plot.idtw function. (default = FALSE)

x

output from dtw or idtw.

object

any R object

...

additional arguments, e.g. passed to print, summary, or a user defined distance function for cm()

digits

passed to round and print

Details

The dynamic time warping distance is the element in the last row and last column of the global cost matrix.

For the multivariate case where Q is a matrix of n rows and k columns and C is a matrix of m rows and k columns the dist_method parameter defines the following distance measures:

norm1:

dist(Q_{i,.}, C_{j,.}) = ∑ {l = 1:k} |Q_{i,l} - C_{j,l}|

norm2:

dist(Q_{i,.}, C_{j,.}) = (∑ {l = 1:k} (Q_{i,l} - C_{j,l})^2)^0.5

norm2_square:

dist(Q_{i,.}, C_{j,.}) = ∑{l = 1:k} (Q_{i,l} - C_{j,l})^2

The parameter step_pattern describes how the two time series are aligned. If step_pattern == "symmetric1" then

gcm_{i,j} = cm{i,j} + min(gcm_{i-1,j}, gcm{i-1, j-1}, gcm{i, i-1}

.

If step_pattern == "symmetric2" then

gcm_{i,j} = cm{i,j} + min(gcm_{i-1,j}, cm{i,j}+ gcm{i-1, j-1}, gcm{i, i-1}

.

To make DTW distances comparable for many time series of different lengths use the normlized_distance with the setting step_method = 'symmetric2'. Please find a more detailed discussion and further references here: http://dtw.r-forge.r-project.org/.

User defined distance function: To calculate the DTW distance measure of two time series a distance function for the local distance of two observations Q[i, ] and C[j, ] of the time series Q and C has to be selected. The predefined distance function are 'norm1', 'norm2' and 'norm2-square'. It is also possible to define a customized distance function and use the cost matrix cm as input for the DTW algorithm, also for the incremental functions. In the following experiment we apply the cosine distance as local distance function:

d_cos(C_i, Q_j) = 1 - (∑{o=1:O} Q_{io} * C_{jo})/ ((∑{o=1:O} Q_{io}^2)^0.5 * (∑{o=1:O} C_{jo}^2)^0.5).

Value

distance

the DTW distance, that is the element of the last row and last column of gcm

normalized_distance

the normalized DTW distance, that is the distance divided by N+M, where N and M are the lengths of the time series Q and C, respectively. If step_pattern == 'symmetric1' no normalization is performed and NA is returned (see details).

gcm

global cost matrix

dm

direction matrix (3=up, 1=diagonal, 2=left)

wp

warping path

ii

indices of Q of the optimal path

jj

indices of C of the optimal path

cm

Matrix of costs

diffM

Matrix of differences

diffp

path of differences

Q

input Q

C

input C

References

  • Leodolter, M.; Pland, C.; Brändle, N; IncDTW: An R Package for Incremental Calculation of Dynamic Time Warping. Journal of Statistical Software, 99(9), 1-23. doi: 10.18637/jss.v099.i09

  • Sakoe, H.; Chiba, S., Dynamic programming algorithm optimization for spoken word recognition, Acoustics, Speech, and Signal Processing [see also IEEE Transactions on Signal Processing], IEEE Transactions on , vol.26, no.1, pp. 43-49, Feb 1978. http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1163055

Examples


#--- univariate
Q <- cumsum(rnorm(100))
C <- Q[11:100] + rnorm(90, 0, 0.5)
tmp <- dtw(Q = Q, C = C, ws = 15, return_diffM = FALSE, 
           return_QC = TRUE, return_wp = TRUE)
names(tmp)
print(tmp, digits = 3)
plot(tmp)
plot(tmp, type = "warp")


#--- compare different input variations
dtw_base  <- dtw(Q = Q, C = C, ws = 15, return_diffM = TRUE)
dtw_diffM <- dtw(Q = dtw_base$diffM, C = "diffM", ws = 15, 
                 return_diffM = TRUE)
dtw_cm    <- dtw(Q = abs(dtw_base$diffM), C = "cm", ws = 15, 
                 return_diffM = TRUE)

identical(dtw_base$gcm, dtw_cm$gcm)
identical(dtw_base$gcm, dtw_diffM$gcm)

# of course no diffM is returned in the 'cm'-case
dtw_cm$diffM


#--- multivariate case
Q <- matrix(rnorm(100), ncol=2)
C <- matrix(rnorm(80), ncol=2)
dtw(Q = Q, C = C, ws = 30, dist_method = "norm2")


#--- user defined distance function
# We define the distance function d_cos and use it as input for the cost matrix function cm. 
# We can pass the output from cm() to dtw2vec(), and also to idtw2vec() for the 
# incrermental calculation:

d_cos <- function(x, y){ 
  1 - sum(x * y)/(sqrt(sum(x^2)) * sqrt(sum(y^2)))
}

Q <- matrix(rnorm(100), ncol=5, nrow=20)
C <- matrix(rnorm(150), ncol=5, nrow=30)
cm1 <- cm(Q, C, dist_method = d_cos)
dtw2vec(Q = cm1, C = "cm")$distance

res0 <- idtw2vec(Q = cm1[ ,1:20], newObs =  "cm")
idtw2vec(Q = cm1[ ,21:30], newObs =  "cm", gcm_lc = res0$gcm_lc_new)$distance

# The DTW distances -- based on the customized distance function -- of the 
# incremental calculation and the one from scratch are identical.



IncDTW documentation built on March 18, 2022, 6:43 p.m.