DEEM: Doubly-enhanced EM algorithm
In azuryee/TensorClustering: Model-Based Tensor Clustering

Description Usage Arguments Details Value Author(s) References See Also Examples

Doubly-enhanced EM algorithm for tensor clustering

1
2
3

DEEM(X, nclass, niter = 100, lambda = NULL, dfmax = n, pmax = nvars, pf = rep(1, nvars),
eps = 1e-04, maxit = 1e+05, sml = 1e-06, verbose = FALSE, ceps = 0.1,
initial = TRUE, vec_x = NULL)

`X`	Input tensor (or matrix) list of length n, where n is the number of observations. Each element of the list is a tensor or matrix. The order of tensor can be any positive integer not less than 2.
`nclass`	Number of clusters.
`niter`	Maximum iteration times for EM algorithm. Default value is 100.
`lambda`	A user-specified `lambda` value. `lambda` is the weight of L1 penalty and a smaller `lambda` allows more variables to be nonzero
`dfmax`	The maximum number of selected variables in the model. Default is the number of observations `n`.
`pmax`	The maximum number of potential selected variables during iteration. In middle step, the algorithm can select at most `pmax` variables and then shrink part of them such that the number of final selected variables is less than `dfmax`.
`pf`	Weight of lasso penalty. Default is a vector of value `1` and length `p`, representing L1 penalty of length p. Can be modified to use adaptive lasso penalty.
`eps`	Convergence threshold for coordinate descent difference between iterations. Default value is `1e-04`.
`maxit`	Maximum iteration times for coordinate descent for all lambda. Default value is `1e+05`.
`sml`	Threshold for ratio of loss function change after each iteration to old loss function value. Default value is `1e-06`.
`verbose`	Indicates whether print out lambda during iteration or not. Default value is `FALSE`.
`ceps`	Convergence threshold for cluster mean difference between iterations. Default value is `1`.
`initial`	Whether to initialize algorithm with K-means clustering. Default value is `TRUE`.
`vec_x`	Vectorized tensor data. Default value is `NULL`

The DEEM function implements the Doubly-Enhanced EM algorithm (DEEM) for tensor clustering. The observations \mathbf{X}_i are assumed to be following the tensor normal mixture model (TNMM) with common covariances across different clusters:

\mathbf{X}_i\sim∑_{k=1}^Kπ_k \mathrm{TN}(\bm{μ}_k;\bm{Σ}_1,…,\bm{Σ}_M),\quad i=1,…,n,

where 0<π_k<1 is the prior probability for \mathbf{X} to be in the k-th cluster such that ∑_{k=1}^{K}π_k=1, \bm{μ}_k is the cluster mean of the k-th cluster and \bm{Σ}_1,…,\bm{Σ}_M) are the common covariances across different clusters. Under the TNMM framework, the optimal clustering rule can be showed as

\widehat{Y}^{opt}=\arg\max_k\{\logπ_k+\langle\mathbf{X}-(\bm{μ}_1+\bm{μ}_k)/2,\mathbf{B}_k\rangle\},

where \mathbf{B}_k=[\![\bm{μ}_k-\bm{μ}_1;\bm{Σ}_1^{-1},…,\bm{Σ}_M^{-1}]\!]. In the enhanced E-step, DEEM imposes sparsity directly on the optimal clustering rule as a flexible alternative to popular low-rank assumptions on tensor coefficients \mathbf{B}_k as

\min_{\mathbf{B}_2,…,\mathbf{B}_K}\bigg[∑_{k=2}^K(\langle\mathbf{B}_k,[\![\mathbf{B}_k,\widehat{\bm{Σ}}_1^{(t)},…,\widehat{\bm{Σ}}_M^{(t)}]\!]\rangle-2\langle\mathbf{B}_k,\widehat{\bm{μ}}_k^{(t)}-\widehat{\bm{μ}}_1^{(t)}\rangle) +λ^{(t+1)}∑_{\mathcal{J}}√{∑_{k=2}^Kb_{k,\mathcal{J}}^2}\bigg],

where λ^{(t+1)} is a tuning parameter. In the enhanced M-step, DEEM employs a new estimator for the tensor correlation structure, which facilitates both the computation and the theoretical studies.

`pi`	A vector of estimated prior probabilities for clusters.
`mu`	A list of estimated cluster means.
`sigma`	A list of estimated covariance matrices.
`gamma`	A `n` by `nclass` matrix of estimated membership weights.
`y`	A vector of estimated labels.
`iter`	Number of iterations until convergence.
`df`	Average zero elements in beta over iterations.
`beta`	A matrix of vectorized `B_k`.

Kai Deng, Yuqing Pan, Xin Zhang and Qing Mai

Mai, Q., Zhang, X., Pan, Y. and Deng, K. (2021). A Doubly-Enhanced EM Algorithm for Model-Based Tensor Clustering. Journal of the American Statistical Association.

tune_lamb, tune_K

dimen = c(5,5,5)
nvars = prod(dimen)
K = 2
n = 100
sigma = array(list(),3)

sigma[[1]] = sigma[[2]] = sigma[[3]] = diag(5)

B2=array(0,dim=dimen)
B2[1:3,1,1]=2

y = c(rep(1,50),rep(2,50))
M = array(list(),K)
M[[1]] = array(0,dim=dimen)
M[[2]] = B2

vec_x=matrix(rnorm(n*prod(dimen)),ncol=n)
X=array(list(),n)
for (i in 1:n){
  X[[i]] = array(vec_x[,i],dim=dimen)
  X[[i]] = M[[y[i]]] + X[[i]]
}

myfit = DEEM(X, nclass=2, lambda=0.05)