BICC | R Documentation |
Apply the algorithm of unsupervised spatio-temporal clustering, TRUST
(Ciampi et al. 2010), with automatic selection of its tuning parameters
Delta and Epsilon based on the Bayesian information criterion, BIC
(Schaeffer et al. 2016).
BICC(X, Alpha = NULL, Beta = NULL, Theta = 0.8, p, w, s)
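X | a matrix of time series observed within a slide (time series in columns). |
Alpha | lower limit of the time-series domain, passed on to the underlying clustering functions (see CSlideCluster and CWindowCluster). |
Beta | upper limit of the time-series domain, passed on to the underlying clustering functions (see CSlideCluster and CWindowCluster). |
Theta | connectivity parameter, passed on to the underlying clustering functions (see CSlideCluster and CWindowCluster). |
p | number of layers (time-series observations) in each slide. |
w | number of slides in each window. |
s | step to shift a window, calculated in the number of slides. The recommended values are 1 (overlapping windows) or equal to w (non-overlapping windows). |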
This is the upper-level function for time series clustering.
It uses the functions CWindowCluster and CSlideCluster to cluster time series
based on closeness and homogeneity measures. Clustering is performed multiple
times over a range of equidistant values for the parameters Delta and Epsilon;
the optimal values of Delta and Epsilon are then returned along with the
corresponding clustering results (see Schaeffer et al. 2016, for more details).
The total length of time series (number of levels, i.e., nrow(X))
should be divisible by p.
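The divisibility requirement and the roles of p, w, and s can be checked before calling BICC. The sketch below uses base R only. The helper name `slide_layout` is hypothetical and not part of funtimes, and the rule that windows start at slides 1, 1 + s, 1 + 2s, and so on is an assumption about how the step s is applied.

```r
# Hypothetical helper (not part of funtimes): summarise how a series of
# n_obs time layers splits into slides and windows.
slide_layout <- function(n_obs, p, w, s) {
  # The total length must be divisible by the slide length p
  if (n_obs %% p != 0) {
    stop("nrow(X) = ", n_obs, " is not divisible by p = ", p)
  }
  n_slides <- n_obs %/% p
  if (w > n_slides) {
    stop("window size w exceeds the number of slides")
  }
  # Assumed rule: a window starts every s slides and must fit entirely
  starts <- seq(1, n_slides - w + 1, by = s)
  list(n_slides = n_slides, window_starts = starts)
}

# Settings matching the first example: 36 months, yearly slides, one window
layout <- slide_layout(n_obs = 36, p = 12, w = 3, s = 3)
layout$n_slides       # 3
layout$window_starts  # 1
```

With `s = 1` and ten slides, the same helper would list eight overlapping windows of three slides each.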
A list with the following elements:
delta.opt | optimal value for the clustering parameter Delta. |
epsilon.opt | optimal value for the clustering parameter Epsilon. |
clusters | vector of length |
IC | values of the information criterion (BIC) for each considered combination of Delta and Epsilon. |
delta.all | vector of considered values for Delta. |
epsilon.all | vector of considered values for Epsilon. |
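To illustrate how the returned elements relate to one another, the following base-R sketch picks the best parameter pair from a grid of criterion values. Everything in it is an assumption made for illustration: the grid ranges are invented, a toy quadratic function stands in for the BIC, the matrix is laid out with Delta in rows and Epsilon in columns, and the smallest criterion value is treated as optimal. It does not call BICC and may differ from what the function does internally.

```r
# Invented equidistant grids (the ranges BICC actually uses are not shown here)
delta.all   <- seq(0.1, 0.5, by = 0.1)
epsilon.all <- seq(0.25, 1, by = 0.25)

# Toy stand-in for the criterion; NOT the BIC computed by BICC
toy_ic <- function(d, e) (d - 0.3)^2 + (e - 0.5)^2

# Assumed layout: rows correspond to Delta, columns to Epsilon
IC <- outer(delta.all, epsilon.all, toy_ic)
dimnames(IC) <- list(Delta = delta.all, Epsilon = epsilon.all)

# Locate the smallest value and map it back to the parameter grids
idx <- arrayInd(which.min(IC), dim(IC))
delta.opt   <- delta.all[idx[1]]
epsilon.opt <- epsilon.all[idx[2]]

c(delta.opt = delta.opt, epsilon.opt = epsilon.opt)
```

Inspecting the whole grid, rather than only the selected pair, can show whether the optimum sits on the edge of the considered range.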
Ethan Schaeffer, Vyacheslav Lyubchich
CSlideCluster, CWindowCluster, purity
# Fix seed for reproducible simulations:
set.seed(1)
##### Example 1
# Similar to Schaeffer et al. (2016), simulate 3 years of monthly data
# for 10 locations and apply clustering:
# 1.1 Simulation
T <- 36 #total months
N <- 10 #locations
phi <- c(0.5) #parameter of autoregression
burn <- 300 #burn-in period for simulations
# Simulate N independent AR series, then drop the burn-in rows
X <- sapply(1:N, function(x)
    arima.sim(n = T + burn,
              model = list(order = c(length(phi), 0, 0), ar = phi)))[(burn + 1):(T + burn), ]
colnames(X) <- paste("TS", seq_len(ncol(X)), sep = "")
# 1.2 Clustering
# Assume that information arrives in year-long slides or data chunks
p <- 12 #number of time layers (months) in a slide
# Let the upper level of clustering (window) be the whole period of 3 years, so
w <- 3 #number of slides in a window
# Step to shift a window; it does not matter much here,
# as we have only one window of data
s <- w
tmp <- BICC(X, p = p, w = w, s = s)
# 1.3 Evaluate clustering
# In these simulations, it is known that all time series belong to one class,
# since they were all simulated the same way:
classes <- rep(1, N)
# Use the information on the classes to calculate clustering purity:
purity(classes, tmp$clusters[1,])
##### Example 2
# 2.1 Modify time series and update classes accordingly:
# Add a mean shift to a half of the time series:
X2 <- X
X2[, 1:(N/2)] <- X2[, 1:(N/2)] + 3
classes2 <- rep(c(1, 2), each = N/2)
# 2.2 Re-apply clustering procedure and evaluate clustering purity:
tmp2 <- BICC(X2, p = p, w = w, s = s)
tmp2$clusters
purity(classes2, tmp2$clusters[1,])