sync_cluster: Time Series Clustering based on Trend Synchronism
In vlyubchich/funtimes: Functions for Time Series Analysis

sync_cluster

R Documentation

Description

Cluster time series with a common parametric trend using the sync_test function (Lyubchich and Gel 2016; Ghahari et al. 2017).

Usage

sync_cluster(formula, rate = 1, alpha = 0.05, ...)

Arguments

`formula`	an object of class "`formula`", specifying the type of common trend for clustering the time series in a `T` by `N` matrix of time series (time series in columns) which is passed to `sync_test`. Variable `t` should be used to specify the form of the trend, where `t` is specified within the function automatically as a regular sequence of length `T` on the interval (0,1]. See `Examples`.
`rate`	rate at which time series are removed. The default is 1: if the hypothesis of synchronism is rejected, one time series is removed at a time and the remaining time series are re-tested. Integer values above 1 are treated as the number of time series to be removed. Values between 0 and 1 are treated as the fraction of time series to be removed.
`alpha`	significance level for testing the hypothesis of a common trend (using `sync_test`) of the parametric form specified in the `formula`.
`...`	arguments to be passed to `sync_test`, for example, number of bootstrap replications (`B`).
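
The following base R sketch illustrates how the time index `t` and the `rate` argument described above can be read. It is not the package code: the helper `n_removed()` is hypothetical, and rounding a fractional `rate` to the nearest integer (with at least one series removed) is an assumption made for illustration only.

```r
# Time index as described for `formula`: a regular sequence of length T
# on the interval (0, 1]. T_len is an illustrative name for the series length.
T_len <- 5
t <- seq_len(T_len) / T_len

# Hypothetical helper (not part of funtimes): number of time series
# removed per step for a given `rate` and `n` series currently tested.
n_removed <- function(rate, n) {
  stopifnot(rate > 0, n >= 1)
  # Assumption: a fraction is rounded to the nearest integer;
  # values of 1 or more are taken as a count.
  k <- if (rate < 1) round(rate * n) else floor(rate)
  max(k, 1)  # assumption: always remove at least one series
}

n_removed(1, 10)    # default: one series at a time
n_removed(2, 10)    # integer above 1: that many series
n_removed(0.3, 10)  # fraction of the series under test
```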

Details

The sync_cluster function recursively clusters time series having a pre-specified common parametric trend until there is no time series left. Starting with the given N time series, the sync_test function is used to test for a common trend. If the null hypothesis of a common trend is not rejected by sync_test, the time series are grouped (i.e., assigned to a cluster). Otherwise, the time series with the largest contribution to the test statistic are temporarily removed (the number of time series to remove depends on the rate of removal), and sync_test is applied again to the remaining time series. The contribution to the test statistic is assessed by the WAVK test statistic calculated for each time series.
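
The procedure above can be sketched in base R as follows. This is a simplified illustration under stated assumptions, not the funtimes implementation: `toy_test()` is a hypothetical stand-in for sync_test that compares fitted linear slopes instead of running a bootstrap test, the per-series contribution is the distance from the median slope rather than a WAVK statistic, and `greedy_cluster()` accepts only an integer `rate`.

```r
# Toy stand-in for sync_test() (NOT the real test): "rejects" when the fitted
# linear slopes differ by more than `tol`; each series' contribution is its
# distance from the median slope.
toy_test <- function(X, tol = 0.5) {
  tt <- seq_len(nrow(X)) / nrow(X)
  slopes <- apply(X, 2, function(x) coef(lm(x ~ tt))[2])
  list(p.value = if (diff(range(slopes)) > tol) 0 else 1,
       contrib = abs(slopes - median(slopes)))
}

# Hypothetical sketch of the clustering loop described above.
greedy_cluster <- function(X, test = toy_test, rate = 1, alpha = 0.05) {
  labels <- integer(ncol(X))   # 0 marks single-element clusters
  pool <- seq_len(ncol(X))     # series not yet assigned
  k <- 0L
  while (length(pool) > 1) {
    current <- pool
    while (length(current) >= 2) {
      res <- test(X[, current, drop = FALSE])
      if (res$p.value >= alpha) break  # common trend not rejected
      # Remove the series with the largest contributions and re-test
      n_drop <- min(rate, length(current) - 1)
      drop <- order(res$contrib, decreasing = TRUE)[seq_len(n_drop)]
      current <- current[-drop]
    }
    if (length(current) >= 2) {
      k <- k + 1L
      labels[current] <- k
    }
    pool <- setdiff(pool, current)  # continue with the removed series
  }
  labels
}

# Noise-free toy data: three flat series, two with slope 5, one with slope -5
tt <- (1:20) / 20
X <- cbind(0 * tt, 0 * tt, 0 * tt, 5 * tt, 5 * tt, -5 * tt)
labels <- greedy_cluster(X)
```

In this toy setting the flat series and the two increasing series form separate clusters, and the single decreasing series keeps the label 0, mirroring the labeling convention described under Value.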

Value

A list with the elements:

`cluster`	an integer vector indicating the cluster to which each time series is allocated. A label `'0'` is assigned to time series which do not have a common trend with other time series (that is, all time series labeled with `'0'` are separate one-element clusters).
`elements`	a list with names of the time series in each cluster.

The remaining elements combine the results of sync_test for each cluster with at least two elements (that is, single-element clusters labeled with '0' are excluded):

`estimate`	a list with common parametric trend estimates obtained by `sync_test` for each cluster. The length of this list is `max(cluster)`.
`pval`	a list of `p`-values of `sync_test` for each cluster. The length of this list is `max(cluster)`.
`statistic`	a list with values of `sync_test` test statistic for each cluster. The length of this list is `max(cluster)`.
`ar_order`	a list of AR filter orders used in `sync_test` for each time series. The results are grouped by cluster in the list of length `max(cluster)`.
`window_used`	a list of local windows used in `sync_test` for each time series. The results are grouped by cluster in the list of length `max(cluster)`.
`all_considered_windows`	a list of all windows considered in `sync_test` and corresponding test results, for each cluster. The length of this list is `max(cluster)`.
`WAVK_obs`	a list of WAVK test statistics obtained in `sync_test` for each time series. The results are grouped by cluster in the list of length `max(cluster)`.
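
As a minimal illustration of the labeling convention, the base R sketch below groups series names by cluster label. It uses the label vector from the second sample output in Examples; the series names are hypothetical, and the code only mimics the relation between `cluster` and per-cluster membership, without calling the package.

```r
# Labels copied from the sample output in Examples (7 time series)
cluster <- c(1, 1, 1, 0, 2, 0, 2)

# Hypothetical series names, for illustration only
series <- paste0("Series", seq_along(cluster))

# Members of each multi-element cluster (labels 1, 2, ...)
members <- split(series[cluster > 0], cluster[cluster > 0])

# Series labeled 0 are separate one-element clusters
singles <- series[cluster == 0]

length(members) == max(cluster)  # one entry per multi-element cluster
```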

Author(s)

Srishti Vishwakarma, Vyacheslav Lyubchich

References

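Ghahari A, Gel YR, Lyubchich V, Chun Y, Uribe D (2017). "On employing multi-resolution weather data in crop insurance." In Proceedings of the SIAM International Conference on Data Mining (SDM17) Workshop on Mining Big Data in Climate and Environment (MBDCE 2017).

Lyubchich V, Gel YR (2016). "A local factor nonparametric test for trend synchronism in multiple time series." Journal of Multivariate Analysis, 150, 91-104.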

Examples

## Not run: 
## Simulate 4 autoregressive time series, 
## 1 having a linear trend and 3 without a trend:
set.seed(123)
T = 100 #length of time series
N = 4 #number of time series
X = sapply(1:N, function(x) arima.sim(n = T, 
           list(order = c(1, 0, 0), ar = c(0.6))))
X[,1] <- 5 * (1:T)/T + X[,1]
plot.ts(X)

# Finding clusters with common linear trends:
LinTrend <- sync_cluster(X ~ t) 
  
## Sample Output:
##[1] "Cluster labels:"
##[1] 0 1 1 1
##[1] "Number of single-element clusters (labeled with '0'): 1"

## plotting the time series of the cluster obtained
for(i in 1:max(LinTrend$cluster)) {
    plot.ts(X[, LinTrend$cluster == i], 
            main = paste("Cluster", i))
}


## Simulating 7 autoregressive time series, 
## where first 4 time series have a linear trend added 
set.seed(234)
T = 100 #length of time series
a <- sapply(1:4, function(x) -10 + 0.1 * (1:T) + 
            arima.sim(n = T, list(order = c(1, 0, 0), ar = c(0.6))))
b <- sapply(1:3, function(x) arima.sim(n = T, 
            list(order = c(1, 0, 0), ar = c(0.6))))
Y <- cbind(a, b)
plot.ts(Y)

## Clustering based on linear trend with rate of removal = 2 
## and significance level 0.1 for the synchronism test
LinTrend7 <- sync_cluster(Y ~ t, rate = 2, alpha = 0.1, B = 99)
   
## Sample output:
##[1] "Cluster labels:"
##[1] 1 1 1 0 2 0 2
##[1] "Number of single-element clusters (labeled with '0'): 2"

## End(Not run)

vlyubchich/funtimes documentation built on May 6, 2023, 3:21 a.m.