TSclust Dissimilarity Computation

Description

Computes the dissimilarity matrix of the given numeric matrix, list, data.frame or mts object using the selected TSclust dissimilarity method.

Usage

diss(SERIES, METHOD, ...)

Arguments

SERIES

Numeric matrix, list, data.frame or mts object. Numeric matrices are interpreted row-wise (one series per row), whereas data.frame and mts objects are interpreted column-wise (one series per column).

METHOD

The dissimilarity measure to be used. This must be one of "ACF", "AR.LPC.CEPS", "AR.MAH", "AR.PIC", "CDM", "CID", "COR", "CORT", "DTWARP", "DWT", "EUCL", "FRECHET", "INT.PER", "NCD", "PACF", "PDC", "PER", "PRED", "MINDIST.SAX", "SPEC.LLR", "SPEC.GLK" or "SPEC.ISD". Any unambiguous substring can be given. See Details for individual usage.

...

Additional arguments for the selected method.
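The "unambiguous substring" rule for METHOD can be illustrated with base R partial matching. The sketch below is an assumption about the matching semantics, not a copy of the package code: it uses match.arg on the method names listed above, and the helper name resolve_method is hypothetical.

```r
# Method names as listed in the METHOD argument description.
methods <- c("ACF", "AR.LPC.CEPS", "AR.MAH", "AR.PIC", "CDM", "CID", "COR",
             "CORT", "DTWARP", "DWT", "EUCL", "FRECHET", "INT.PER", "NCD",
             "PACF", "PDC", "PER", "PRED", "MINDIST.SAX", "SPEC.LLR",
             "SPEC.GLK", "SPEC.ISD")

# Hypothetical helper: resolve an abbreviation with base R partial matching.
# Returns NA when the abbreviation is ambiguous or matches nothing.
resolve_method <- function(abbrev) {
  tryCatch(match.arg(abbrev, methods),
           error = function(e) NA_character_)
}

resolve_method("FRE")   # unique prefix of "FRECHET"
resolve_method("AR.M")  # unique prefix of "AR.MAH"
resolve_method("COR")   # exact match takes priority over the longer "CORT"
resolve_method("SPEC")  # prefix of three methods, so it is ambiguous
```

Under these assumptions, a prefix such as "SPEC" or "AR." would be rejected because it matches several methods, so it is safer to spell out enough characters to single out one name.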

Details

SERIES can be a numeric matrix with one row per series, a list with one numeric vector per element, a data.frame or an mts object. Some methods accept additional arguments; see the help page of each dissimilarity method, listed below. When a method has an argument that requires one value per time series in SERIES, supply it as a vector, as a matrix (for a multivalued argument) or as a list, as appropriate. In a matrix, the values are read row-wise, one row per series. See the AR.LPC.CEPS example below.

  • "ACF" Autocorrelation-based method. See diss.ACF.

  • "AR.LPC.CEPS" Linear Predictive Coding ARIMA method. This method has two value-per-series arguments, the ARIMA order and the seasonality. See diss.AR.LPC.CEPS.

  • "AR.MAH" Model-based ARMA method. See diss.AR.MAH.

  • "AR.PIC" Model-based ARMA method. This method has a value-per-series argument, the ARIMA order. See diss.AR.PIC.

  • "CDM" Compression-based dissimilarity method. See diss.CDM.

  • "CID" Complexity-Invariant distance. See diss.CID.

  • "COR" Correlation-based method. See diss.COR.

  • "CORT" Temporal Correlation and Raw values method. See diss.CORT.

  • "DTWARP" Dynamic Time Warping method. See diss.DTWARP.

  • "DWT" Discrete wavelet transform method. See diss.DWT.

  • "EUCL" Euclidean distance. See diss.EUCL. For many more conventional distances, see dist in the stats package, though you may need to transpose the dataset.

  • "FRECHET" Frechet distance. See diss.FRECHET.

  • "INT.PER" Integrated Periodogram-based method. See diss.INT.PER.

  • "NCD" Normalized Compression Distance. See diss.NCD.

  • "PACF" Partial Autocorrelation-based method. See diss.PACF.

  • "PDC" Permutation distribution divergence. Uses the pdc package. See pdcDist for additional arguments and details. Note that series given as numeric matrices are interpreted row-wise here, not column-wise as in pdcDist.

  • "PER" Periodogram-based method. See diss.PER.

  • "PRED" Prediction Density-based method. This method has two value-per-series arguments, the logarithm and difference transforms. See diss.PRED.

  • "MINDIST.SAX" Distance that lower bounds the Euclidean, based on the Symbolic Aggregate approXimation measure. See diss.MINDIST.SAX.

  • "SPEC.LLR" Spectral Density by Local-Linear Estimation method. See diss.SPEC.LLR.

  • "SPEC.GLK" Log-Spectra Generalized Likelihood Ratio test method. See diss.SPEC.GLK.

  • "SPEC.ISD" Integrated Squared Differences between Log-Spectra method. See diss.SPEC.ISD.
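The orientation rules described above can be sketched with a small made-up dataset. The values and object names below are illustrative assumptions; the diss calls run only if TSclust is installed, and the three inputs are expected to describe the same three series.

```r
# Three short series of equal length (made-up values for illustration).
a  <- c(1, 2, 3, 4, 5)
b  <- c(2, 4, 6, 8, 10)
c3 <- c(5, 4, 3, 2, 1)

by_row  <- rbind(a = a, b = b, c3 = c3)       # matrix: one series per row
by_col  <- data.frame(a = a, b = b, c3 = c3)  # data.frame: one series per column
as_list <- list(a = a, b = b, c3 = c3)        # list: one series per element

# A matrix that stores series in columns must be transposed before use,
# because numeric matrices are read row-wise.
column_matrix <- cbind(a, b, c3)  # 5 rows, 3 columns
fixed <- t(column_matrix)         # 3 rows, 5 columns

if (requireNamespace("TSclust", quietly = TRUE)) {
  d_row  <- TSclust::diss(by_row,  METHOD = "EUCL")
  d_col  <- TSclust::diss(by_col,  METHOD = "EUCL")
  d_list <- TSclust::diss(as_list, METHOD = "EUCL")
  print(d_row)
  # Compare the numeric values of the three results.
  print(all.equal(as.vector(d_row), as.vector(d_col)))
  print(all.equal(as.vector(d_row), as.vector(d_list)))
}
```

Passing column_matrix directly would treat each of its five rows as a series of length three, which is usually not what is intended.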

Value

dist

A dist object with the pairwise dissimilarities between series.

Some methods produce additional output; see their respective documentation pages for more information.
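Because the result is a dist object, it can be passed to the standard tools in the stats package. The following is a minimal sketch with simulated series and an arbitrary choice of linkage and cluster count; if TSclust is not installed it falls back to stats::dist as a stand-in so that the remaining steps still run.

```r
set.seed(1)
grid <- seq(0, 4 * pi, length.out = 50)

# Four made-up series: two noisy sine waves and two noisy cosine waves.
series <- rbind(
  s1 = sin(grid),
  s2 = sin(grid) + rnorm(50, sd = 0.1),
  s3 = cos(grid),
  s4 = cos(grid) + rnorm(50, sd = 0.1)
)

d <- if (requireNamespace("TSclust", quietly = TRUE)) {
  TSclust::diss(series, METHOD = "EUCL")
} else {
  stats::dist(series)  # stand-in only, used when TSclust is unavailable
}

as.matrix(d)                          # full symmetric matrix view
hc <- hclust(d, method = "average")   # hierarchical clustering on the dist object
groups <- cutree(hc, k = 2)           # cut the tree into two clusters
print(groups)
```

Any other function that accepts a dist object, such as cmdscale, can be used in the same way.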

Author(s)

Pablo Montero Manso, José Antonio Vilar.

References

Montero, P and Vilar, J.A. (2014) TSclust: An R Package for Time Series Clustering. Journal of Statistical Software, 62(1), 1-43. http://www.jstatsoft.org/v62/i01/.

See Also

pdc, dist

Examples

library(TSclust)

data(electricity)
diss(electricity, METHOD="INT.PER", normalize=FALSE)

## Example of a multivalued, one-per-series argument
## The AR.LPC.CEPS dissimilarity allows the ARIMA model to be specified for each series
## Create three sample time series and bind them row-wise into a matrix
x <- arima.sim(model=list(ar=c(0.4,-0.1)), n=100, n.start=100)
y <- arima.sim(model=list(ar=c(0.9)), n=100, n.start=100)
z <- arima.sim(model=list(ar=c(0.5, 0.2)), n=100, n.start=100)
seriests <- rbind(x, y, z)

## To provide the ARIMA order for each series and use it with AR.LPC.CEPS,
## create a matrix with one order per row, in the same order as the series
orderx <- c(2,0,0)
ordery <- c(1,0,0)
orderz <- c(2,0,0)
orders <- rbind(orderx, ordery, orderz)

diss(seriests, METHOD="AR.LPC.CEPS", k=30, order=orders)

## Other examples
diss(seriests, METHOD="MINDIST.SAX", w=10, alpha=4)
diss(seriests, METHOD="PDC")
