| clust_cp | R Documentation |
The clust_cp function cluster observations with common change points. Data can be time series or survival functions.
clust_cp(
data,
n_iterations,
n_burnin = 0,
params = list(),
alpha_SM = 1,
B = 1000,
L = 1,
q = 0.5,
kernel,
print_progress = TRUE,
user_seed = 1234
)
data |
a matrix or an array If a matrix the algorithm for univariate time series is used, where each row is a time series. If an array, the algorithm is run for multivariate time series. Each slice of the array is a matrix where the rows are the dimensions of the time series. |
n_iterations |
number of MCMC iterations. |
n_burnin |
number of iterations that must be excluded when computing the posterior estimate. |
params |
a list of parameters: If the time series is univariate the following must be specified:
If the time series is multivariate the following must be specified:
If data are survival functions:
|
alpha_SM |
|
B |
number of orders for the normalization constant. |
L |
number of split-merge steps for the proposal step. |
q |
probability of a split in the split-merge proposal and acceleration step. |
kernel |
can be "ts" if data are time series or "epi" if data are survival functions. |
print_progress |
If TRUE (default) print the progress bar. |
user_seed |
seed for random distribution generation. |
A ClustCpObj class object containing
$data vector or matrix with the data.
$n_iterations number of iterations.
$n_burnin number of burn-in iterations.
$clust a matrix where each row corresponds to the output cluster of the corresponding iteration.
$orders a multidimensional array where each slice is a matrix and represent an iteration. The row of each matrix correspond the order associated to the corresponding cluster.
$time computational time.
$lkl a matrix where each row is the likelihood of each observation computed at the corresponding iteration.
$norm_vec a vector containing the normalization constant computed at the beginning of the algorithm.
I0_MCMC traceplot for I_0.
I0_MCMC_01 a 0/1 vector, the n-th element is equal to 1 if the proposed I_0 was accepted, 0 otherwise.
$kernel_ts if TRUE data are time series.
$kernel_epi if TRUE data are survival function.
$univariate_ts TRUE if data is an univariate time series, FALSE if it is a multivariate time series.
Corradin, R., Danese, L., KhudaBukhsh, W. R., & Ongaro, A. (2024). Model-based clustering of time-dependent observations with common structural changes. arXiv preprint arXiv:2410.09552.
## Univariate time series
data_mat <- matrix(NA, nrow = 5, ncol = 100)
data_mat[1,] <- as.numeric(c(rnorm(50,0,0.100), rnorm(50,1,0.250)))
data_mat[2,] <- as.numeric(c(rnorm(50,0,0.125), rnorm(50,1,0.225)))
data_mat[3,] <- as.numeric(c(rnorm(50,0,0.175), rnorm(50,1,0.280)))
data_mat[4,] <- as.numeric(c(rnorm(25,0,0.135), rnorm(75,1,0.225)))
data_mat[5,] <- as.numeric(c(rnorm(25,0,0.155), rnorm(75,1,0.280)))
out <- clust_cp(data = data_mat, n_iterations = 5000, n_burnin = 1000,
L = 1, params = list(phi = 0.5), B = 1000, kernel = "ts")
print(out)
## Multivariate time series
data_array <- array(data = NA, dim = c(3,100,5))
data_array[1,,1] <- as.numeric(c(rnorm(50,0,0.100), rnorm(50,1,0.250)))
data_array[2,,1] <- as.numeric(c(rnorm(50,0,0.100), rnorm(50,1,0.250)))
data_array[3,,1] <- as.numeric(c(rnorm(50,0,0.100), rnorm(50,1,0.250)))
data_array[1,,2] <- as.numeric(c(rnorm(50,0,0.100), rnorm(50,1,0.250)))
data_array[2,,2] <- as.numeric(c(rnorm(50,0,0.100), rnorm(50,1,0.250)))
data_array[3,,2] <- as.numeric(c(rnorm(50,0,0.100), rnorm(50,1,0.250)))
data_array[1,,3] <- as.numeric(c(rnorm(50,0,0.175), rnorm(50,1,0.280)))
data_array[2,,3] <- as.numeric(c(rnorm(50,0,0.175), rnorm(50,1,0.280)))
data_array[3,,3] <- as.numeric(c(rnorm(50,0,0.175), rnorm(50,1,0.280)))
data_array[1,,4] <- as.numeric(c(rnorm(25,0,0.135), rnorm(75,1,0.225)))
data_array[2,,4] <- as.numeric(c(rnorm(25,0,0.135), rnorm(75,1,0.225)))
data_array[3,,4] <- as.numeric(c(rnorm(25,0,0.135), rnorm(75,1,0.225)))
data_array[1,,5] <- as.numeric(c(rnorm(25,0,0.155), rnorm(75,1,0.280)))
data_array[2,,5] <- as.numeric(c(rnorm(25,0,0.155), rnorm(75,1,0.280)))
data_array[3,,5] <- as.numeric(c(rnorm(25,0,0.155), rnorm(75,1,0.280)))
out <- clust_cp(data = data_array, n_iterations = 3000, n_burnin = 1000,
params = list(phi = 0.5, k_0 = 0.25,
nu_0 = 5, S_0 = diag(0.1,3,3),
m_0 = rep(0,3)), B = 1000, kernel = "ts")
print(out)
## Epidemiological data
data_mat <- matrix(NA, nrow = 5, ncol = 50)
betas <- list(c(rep(0.45, 25),rep(0.14,25)),
c(rep(0.55, 25),rep(0.11,25)),
c(rep(0.50, 25),rep(0.12,25)),
c(rep(0.52, 10),rep(0.15,40)),
c(rep(0.53, 10),rep(0.13,40)))
inf_times <- list()
for(i in 1:5){
inf_times[[i]] <- sim_epi_data(10000, 10, 50, betas[[i]], 1/8)
vec <- rep(0,50)
names(vec) <- as.character(1:50)
for(j in 1:50){
if(as.character(j) %in% names(table(floor(inf_times[[i]])))){
vec[j] = table(floor(inf_times[[i]]))[which(names(table(floor(inf_times[[i]]))) == j)]
}
}
data_mat[i,] <- vec
}
out <- clust_cp(data = data_mat, n_iterations = 100, n_burnin = 10,
params = list(M = 100, xi = 1/8), B = 1000, kernel = "epi")
print(out)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.