CWindowCluster: Window-Level Time Series Clustering

Description

Cluster time series at a window level, based on Algorithm 2 of Ciampi et al. (2010).

Usage

CWindowCluster(
  X,
  Alpha = NULL,
  Beta = NULL,
  Delta = NULL,
  Theta = 0.8,
  p,
  w,
  s,
  Epsilon = 1
)

Arguments

X

a matrix of time series (time series in columns, observations in rows) spanning one or more windows.

Alpha

lower limit of the time series domain, passed to CSlideCluster.

Beta

upper limit of the time series domain, passed to CSlideCluster.

Delta

closeness parameter, passed to CSlideCluster.

Theta

connectivity parameter, passed to CSlideCluster.

p

number of layers (time series observations) in each slide.

w

number of slides in each window.

s

step to shift a window, calculated in number of slides. The recommended values are 1 (overlapping windows) or equal to w (non-overlapping windows).

Epsilon

a real value in [0,1] used to identify each pair of time series that are clustered together over at least w*Epsilon slides within a window; see Definition 7 in Ciampi et al. (2010). Default is 1.

Details

This is the upper-level function for time series clustering. It exploits the function CSlideCluster to cluster time series within each slide based on closeness and homogeneity measures. Then, it uses slide-level cluster assignments to cluster time series within each window.
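The pairwise condition behind Epsilon can be illustrated in base R. The following is a minimal sketch under stated assumptions, not the package's implementation: the matrix slide_labels is made-up data standing in for slide-level labels such as those CSlideCluster might produce, and linked_pairs is a hypothetical helper. It only shows which pairs of series are clustered together over at least w*Epsilon slides; how those pairs are then merged into window-level clusters is not covered here.

```r
# Hypothetical slide-level labels: w = 4 slides (rows), 5 time series (columns).
slide_labels <- rbind(
  c(1, 1, 2, 2, 3),
  c(1, 1, 2, 3, 3),
  c(1, 1, 1, 2, 2),
  c(1, 1, 2, 2, 3)
)
colnames(slide_labels) <- paste0("TS", 1:5)
w <- nrow(slide_labels)

# For every pair of series, count the slides in which they share a label.
co_count <- Reduce(`+`, lapply(seq_len(w), function(i) {
  outer(slide_labels[i, ], slide_labels[i, ], "==")
}))

# Hypothetical helper: TRUE for pairs clustered together
# over at least w * Epsilon slides.
linked_pairs <- function(co_count, w, Epsilon) {
  co_count >= w * Epsilon
}

strict  <- linked_pairs(co_count, w, Epsilon = 1)    # together in all slides
relaxed <- linked_pairs(co_count, w, Epsilon = 0.5)  # together in >= 2 slides
```

With Epsilon = 1 only TS1 and TS2 are linked, because they share a label in every slide; lowering Epsilon to 0.5 also links TS3 with TS4 and TS4 with TS5.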

The total length of the time series (number of layers, i.e., nrow(X)) should be divisible by p.
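To see how p, w, and s partition the rows of X, the arithmetic can be sketched in base R. The helper window_rows below is hypothetical (it is not part of funtimes), and the rule used for the first slide of each window is an assumption that is consistent with the row indexing used in the Examples section, not a statement of the package internals.

```r
# Hypothetical helper (not part of funtimes): rows of X covered by each window.
window_rows <- function(n_obs, p, w, s) {
  stopifnot(n_obs %% p == 0)   # series length must be divisible by p
  n_slides <- n_obs / p        # number of complete slides
  stopifnot(n_slides >= w)     # at least one full window is needed
  # Assumed rule: windows start every s slides while a full window still fits.
  first_slide <- seq(1, n_slides - w + 1, by = s)
  lapply(first_slide, function(k) {
    start <- (k - 1) * p + 1
    start:(start + p * w - 1)
  })
}

# Two years of weekly data, slides of 4 weeks, windows of 13 slides.
rows_nonoverlap <- window_rows(n_obs = 104, p = 4, w = 13, s = 13)
rows_overlap    <- window_rows(n_obs = 104, p = 4, w = 13, s = 1)
```

Under these assumptions, s = w gives two non-overlapping windows covering rows 1 to 52 and 53 to 104, while s = 1 gives 14 overlapping windows, each shifted by one slide (4 rows).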

Value

A vector (if X contains only one window) or matrix with cluster labels for each time series (columns) and window (rows).
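Because the result is a vector for a single window and a matrix otherwise, downstream code may want to handle both shapes uniformly. The sketch below uses a made-up label matrix (labels) in place of real output, and as_label_matrix is a hypothetical helper, so it should be read as one possible way to summarize the result rather than as part of the package.

```r
# Hypothetical output: 2 windows (rows) by 6 time series (columns).
labels <- rbind(
  c(1, 1, 1, 2, 1, 3),
  c(1, 2, 1, 2, 1, 1)
)
colnames(labels) <- paste0("TS", 1:6)

# Hypothetical helper: coerce a single-window vector to a one-row matrix
# so that the same code works for one or many windows.
as_label_matrix <- function(x) {
  if (is.matrix(x)) x else matrix(x, nrow = 1, dimnames = list(NULL, names(x)))
}
labels <- as_label_matrix(labels)

# Cluster sizes within each window.
sizes <- lapply(seq_len(nrow(labels)), function(i) table(labels[i, ]))

# Names of the series assigned to cluster 1 in each window.
members1 <- lapply(seq_len(nrow(labels)), function(i) {
  colnames(labels)[labels[i, ] == 1]
})
```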

Author(s)

Vyacheslav Lyubchich

References

\insertAllCited

See Also

CSlideCluster and BICC

Examples

#For example, weekly data come in slides of 4 weeks
p <- 4 #number of layers in each slide (data come in a slide)
    
#We want to analyze the trend clusters within a window of 1 year
w <- 13 #number of slides in each window
s <- w  #step to shift a window

#Simulate 26 autoregressive time series with two years of weekly data (52*2 weeks),
#with a 'burn-in' period of 300.
N <- 26
n_obs <- 2*p*w #length of each series; avoids masking T, the shorthand for TRUE

set.seed(123)
phi <- c(0.5) #parameter of autoregression
X <- sapply(1:N, function(x) arima.sim(n = n_obs + 300,
     list(order = c(length(phi), 0, 0), ar = phi)))[301:(n_obs + 300),]
colnames(X) <- paste("TS", c(1:dim(X)[2]), sep = "")
 
tmp <- CWindowCluster(X, Delta = NULL, Theta = 0.8, p = p, w = w, s = s, Epsilon = 1)

#Time series were simulated with the same parameters, but based on the clustering parameters,
#not all time series join the same cluster. We can plot the main cluster for each window, and 
#time series out of the cluster:
par(mfrow = c(2, 2))
ts.plot(X[c(1:(p*w)), tmp[1,] == 1], ylim = c(-4, 4), 
        main = "Time series cluster 1 in window 1")
ts.plot(X[c(1:(p*w)), tmp[1,] != 1], ylim = c(-4, 4), 
        main = "The rest of the time series in window 1")
ts.plot(X[c(1:(p*w)) + s*p, tmp[2,] == 1], ylim = c(-4, 4), 
        main = "Time series cluster 1 in window 2")
ts.plot(X[c(1:(p*w)) + s*p, tmp[2,] != 1], ylim = c(-4, 4), 
        main = "The rest of the time series in window 2")

funtimes documentation built on Nov. 28, 2020, 1:06 a.m.