twdtw | R Documentation |
This function calculates the Time-Weighted Dynamic Time Warping (TWDTW) distance between two time series.
twdtw(x, y, time_weight, cycle_length, time_scale, ...)
## S3 method for class 'data.frame'
twdtw(
x,
y,
time_weight,
cycle_length,
time_scale,
origin = NULL,
index_column = "time",
max_elapsed = Inf,
output = "distance",
version = "f90",
...
)
## S3 method for class 'matrix'
twdtw(
x,
y,
time_weight,
cycle_length,
time_scale = NULL,
index_column = 1,
max_elapsed = Inf,
output = "distance",
version = "f90",
...
)
x |
A data.frame or matrix representing time series. |
y |
A data.frame or matrix representing a labeled time series (reference). |
time_weight |
A numeric vector of length two (steepness and midpoint of the logistic weight) or a function. See details. |
cycle_length |
The length of the cycle. Can be a numeric value or a string specifying the units ('year', 'month', 'day', 'hour', 'minute', 'second'). When numeric, the cycle length is in the same units as time_scale. When a string, it specifies the time unit of the cycle. |
time_scale |
Specifies the time scale for the conversion. Must be one of 'year', 'month', 'day', 'hour', 'minute', 'second'. When cycle_length is a string, time_scale changes the unit in which the result is expressed. When cycle_length is numeric, time_scale is used to compute the elapsed time in seconds. |
... |
Ignored. |
origin |
For numeric cycle_length, the origin must be specified. This is the point from which the elapsed time is computed. Must be of the same class as x. |
index_column |
(optional) The column name of the time index for data.frame inputs. Defaults to "time". For matrix input, an integer indicating the column with the time index. Defaults to 1. |
max_elapsed |
Numeric value constraining the TWDTW calculation to the lower band given by a maximum elapsed time. Defaults to Inf. |
output |
A character string defining the output. It must be one of 'distance', 'matches', 'internals'. Defaults to 'distance'.
'distance' will return the lowest TWDTW distance between x and y. |
version |
A string identifying the version of TWDTW implementation. Options are 'f90' for Fortran 90, 'f90goto' for Fortran 90 with goto statements, or 'cpp' for C++ version. Defaults to 'f90'. See details. |
TWDTW calculates a time-weighted version of DTW by modifying each element of the DTW's local cost matrix (see details in Maus et al. (2016) and Maus et al. (2019)). The default time weight is calculated using a logistic function that adds a weight to each pair of observations in the time series x and y based on the time difference between observations, such that
tw(dist_{i,j}) = dist_{i,j} + \frac{1}{1 + e^{-{\alpha} (el_{i,j} - {\beta})}}
Where:
tw is the time-weight function;
dist_{i,j} is the Euclidean distance between the i-th element of x and the j-th element of y in a multi-dimensional space;
el_{i,j} is the time elapsed between the i-th element of x and the j-th element of y;
\alpha and \beta are the steepness and midpoint of the logistic function, respectively.
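As a minimal sketch in base R, the logistic time weight defined by the formula above can be written out directly. The name logistic_tw and its argument names are illustrative assumptions and are not part of the package; the compiled implementations may differ in detail.

```r
# Hypothetical helper (not exported by twdtw): logistic time weight
# following the formula above.
logistic_tw <- function(dist, el, alpha, beta) {
  dist + 1 / (1 + exp(-alpha * (el - beta)))
}

# At el == beta the logistic term equals 0.5, so the result is dist + 0.5
logistic_tw(dist = 0.5, el = 50, alpha = 0.1, beta = 50)

# The added weight grows towards 1 as the elapsed time passes the midpoint
logistic_tw(dist = 0.5, el = c(0, 50, 100), alpha = 0.1, beta = 50)
```

With steepness 0.1 and midpoint 50, as used in the examples, observations less than about 50 time units apart receive a weight below 0.5, and more distant pairs are penalized by up to 1.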
The logistic function is implemented as the default option in the C++ and Fortran versions of the code. To use the native implementation, \alpha and \beta must be provided as a numeric vector of length two using the argument time_weight. This implementation provides high processing performance.
The time_weight argument also accepts a function defined in R, allowing the user to define a different weighting scheme. However, passing a function to time_weight can degrade processing performance: it can be up to 3x slower than the default logistic time-weight.
A time-weight function passed to time_weight must receive two numeric arguments and return a single numeric value. The first argument received is the Euclidean distance dist_{i,j} and the second is the elapsed time el_{i,j}. For example,
time_weight = function(dist, el) dist + 0.1*el
defines a linear weighting scheme with a slope of 0.1.
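The contract described above can be checked before calling twdtw. The sketch below uses only base R; linear_tw and check_time_weight are hypothetical names introduced for illustration and are not part of the package.

```r
# Linear time weight with slope 0.1, as in the example above
linear_tw <- function(dist, el) dist + 0.1 * el

# Hypothetical check: the function takes two arguments and returns
# a single numeric value when called with scalar inputs
check_time_weight <- function(f) {
  stopifnot(is.function(f), length(formals(f)) == 2)
  out <- f(1, 10)
  is.numeric(out) && length(out) == 1
}

check_time_weight(linear_tw)
linear_tw(dist = 1, el = 10)  # 1 + 0.1 * 10 = 2
```

A function that passes this check can then be supplied as time_weight = linear_tw.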
The Fortran 90 versions of twdtw are usually faster than the C++ version. The 'f90goto' version, which uses goto statements, is slightly quicker than the 'f90' version that uses while and for loops. You can use the max_elapsed parameter to limit the TWDTW calculation to a maximum elapsed time. This means it will skip comparisons between pairs of observations in x and y that are far apart in time.
Be careful, though: if max_elapsed is set too low, it could change the results. It is important to try out different settings for your specific problem.
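To illustrate the effect of max_elapsed, the base R sketch below marks which pairs of observations lie further apart than a threshold. It uses the plain absolute difference in days and ignores cycle_length, so it only approximates the idea and is not the package's internal computation; all variable names are assumptions.

```r
# Two date indices with 15-day steps, the second shifted by 30 days
tx <- seq(as.Date("2020-09-01"), length.out = 5, by = "15 day")
ty <- tx + 30

# Absolute elapsed time (days) between every pair of observations
el <- abs(outer(as.numeric(tx), as.numeric(ty), "-"))

# Pairs beyond the threshold are the ones a band of this width would skip
max_elapsed <- 30
skipped <- el > max_elapsed
sum(skipped)   # number of pairs outside the band
mean(skipped)  # share of the 5 x 5 cost matrix left out
```

Repeating this count for a few candidate thresholds gives a rough sense of how much of the cost matrix a given max_elapsed would exclude before running the full calculation.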
An S3 object twdtw, which is one of the following depending on output:
If output = 'distance', a numeric value representing the TWDTW distance between the two time series.
If output = 'matches', a numeric matrix of all TWDTW matches. For each match, the starting index, ending index, and distance are returned.
If output = 'internals', a list of all TWDTW internal data.
Maus, V., Camara, G., Cartaxo, R., Sanchez, A., Ramos, F. M., & de Moura, Y. M. (2016). A Time-Weighted Dynamic Time Warping Method for Land-Use and Land-Cover Mapping. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 9(8), 3729-3739. doi:10.1109/JSTARS.2016.2517118
Maus, V., Camara, G., Appel, M., & Pebesma, E. (2019). dtwSat: Time-Weighted Dynamic Time Warping for Satellite Image Time Series Analysis in R. Journal of Statistical Software, 88(5), 1-31. doi:10.18637/jss.v088.i05
# Create a time series
n <- 23
t <- seq(0, pi, length.out = n)
d <- seq(as.Date('2020-09-01'), length.out = n, by = "15 day")
x <- data.frame(time = d, v1 = sin(t)*2 + runif(n))
# shift time by 30 days
y <- data.frame(time = d + 30, v1 = sin(t)*2 + runif(n))
plot(x, type = "l", xlim = range(c(d, d + 30)))
lines(y, col = "red")
# Calculate TWDTW distance between x and y using logistic weight
twdtw(x, y,
cycle_length = 'year',
time_scale = 'day',
time_weight = c(steepness = 0.1, midpoint = 50))
# Pass a generic time-weight function
twdtw(x, y,
cycle_length = 'year',
time_scale = 'day',
time_weight = function(dist, el) dist + 1.0 / (1.0 + exp(-0.1 * (el - 50))))
# Test other versions
twdtw(x, y,
cycle_length = 'year',
time_scale = 'day',
time_weight = c(steepness = 0.1, midpoint = 50),
version = 'f90goto')
twdtw(x, y,
cycle_length = 'year',
time_scale = 'day',
time_weight = c(steepness = 0.1, midpoint = 50),
version = 'cpp')