TEMM: Fit the Tensor Envelope Mixture Model (TEMM)
In azuryee/TensorClustering: Model-Based Tensor Clustering



Fit the Tensor Envelope Mixture Model (TEMM)

TEMM(Xn, u, K, initial = "kmeans", iter.max = 500, stop = 1e-3, trueY = NULL, print = FALSE)

`Xn`	The tensor data to cluster. Must be an array whose last dimension is the sample size `n`.
`u`	A vector of envelope dimensions, one for each mode.
`K`	Number of clusters, greater than or equal to `2`.
`initial`	Initialization method for the regularized EM algorithm. Default value is "kmeans".
`iter.max`	Maximum number of iterations. Default value is `500`.
`stop`	Convergence threshold of relative change in cluster means. Default value is `1e-3`.
`trueY`	A vector of true cluster labels of each observation. Default value is NULL.
`print`	Whether to print information including current iteration number, relative change in cluster means and clustering error (`%`) in each iteration. Default value is FALSE.
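The argument layout above can be illustrated with a small simulated data set. This is a minimal sketch: the array sizes, the mean shift, and the names `dims`, `labels` and `fit` are illustrative assumptions, not taken from the package. The call to `TEMM` is guarded so the block still runs when TensorClustering is not installed.

```r
# Simulate n = 40 samples of 4 x 3 matrices; all values here are illustrative.
set.seed(1)
n <- 40
dims <- c(4, 3)
labels <- rep(1:2, each = n / 2)          # hypothetical true cluster labels

# The sample index must be the last dimension of the array.
Xn <- array(rnorm(prod(dims) * n), dim = c(dims, n))

# Shift the first row of every sample in cluster 2 so the clusters differ.
Xn[1, , labels == 2] <- Xn[1, , labels == 2] + 3

dim(Xn)

# Fit only if the package is available; argument names follow the usage line.
if (requireNamespace("TensorClustering", quietly = TRUE)) {
  fit <- TensorClustering::TEMM(Xn, u = c(2, 2), K = 2,
                                initial = "kmeans", iter.max = 500,
                                stop = 1e-3, trueY = labels, print = FALSE)
  print(table(estimated = fit$id, truth = labels))
}
```

Here `u = c(2, 2)` requests a two-dimensional envelope in each mode, which satisfies the requirement that each envelope dimension not exceed the corresponding mode dimension.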

The TEMM function fits the Tensor Envelope Mixture Model (TEMM) through a subspace-regularized EM algorithm. For mode m, let (\bm{Γ}_m,\bm{Γ}_{0m})\in R^{p_m\times p_m} be an orthogonal matrix where \bm{Γ}_{m}\in R^{p_{m}\times u_{m}}, u_{m}≤ p_{m}, represents the material part. Specifically, the material part \mathbf{X}_{\star,m}=\mathbf{X}\times_{m}\bm{Γ}_{m}^{T} follows a tensor normal mixture distribution, while the immaterial part \mathbf{X}_{\circ,m}=\mathbf{X}\times_{m}\bm{Γ}_{0m}^{T} is unimodal and independent of the material part, and hence can be eliminated without loss of clustering information. Dimension reduction is achieved by focusing on the material part \mathbf{X}_{\star,m}. Collectively, the joint reduction from each mode is

\mathbf{X}_{\star}=[\![\mathbf{X};\bm{Γ}_{1}^{T},…,\bm{Γ}_{M}^{T}]\!]\sim∑_{k=1}^{K}π_{k}\mathrm{TN}(\bm{α}_{k};\bm{Ω}_{1},…,\bm{Ω}_{M}),\quad \mathbf{X}_{\star}\perp\!\!\!\perp\mathbf{X}_{\circ,m},

where \bm{α}_{k}\in R^{u_{1}\times\cdots\times u_{M}} and \bm{Ω}_m\in R^{u_m\times u_m} are the dimension-reduced clustering parameters and \mathbf{X}_{\circ,m} does not vary with cluster index Y. In the E-step, the membership weights are evaluated as

\widehat{η}_{ik}^{(s)}=\frac{\widehat{π}_{k}^{(s-1)}f_{k}(\mathbf{X}_i;\widehat{\bm{θ}}^{(s-1)})}{∑_{k=1}^{K}\widehat{π}_{k}^{(s-1)}f_{k}(\mathbf{X}_i;\widehat{\bm{θ}}^{(s-1)})},

where f_k denotes the conditional probability density function of \mathbf{X}_i within the k-th cluster. In the subspace-regularized M-step, the envelope subspace is iteratively estimated through a Grassmann manifold optimization that minimizes the following log-likelihood-based objective function:

G_m^{(s)}(\bm{Γ}_m) = \log|\bm{Γ}_m^T \mathbf{M}_m^{(s)} \bm{Γ}_m|+\log|\bm{Γ}_m^T (\mathbf{N}_m^{(s)})^{-1} \bm{Γ}_m|,

where \mathbf{M}_{m}^{(s)} and \mathbf{N}_{m}^{(s)} are given by

\mathbf{M}_m^{(s)} = \frac{1}{np_{-m}}∑_{i=1}^{n} ∑_{k=1}^{K}\widehat{η}_{ik}^{(s)} (\bm{ε}_{ik}^{(s)})_{(m)}(\widehat{\bm{Σ}}_{-m}^{(s-1)})^{-1} (\bm{ε}_{ik}^{(s)})_{(m)}^T,

\mathbf{N}_m^{(s)} = \frac{1}{np_{-m}}∑_{i=1}^{n} (\mathbf{X}_i)_{(m)}(\widehat{\bm{Σ}}_{-m}^{(s-1)})^{-1}(\mathbf{X}_i)_{(m)}^T.

The intermediate estimator \mathbf{M}_{m}^{(s)} can be viewed as the mode-m conditional variation estimate of \mathbf{X}\mid Y, and \mathbf{N}_{m}^{(s)} is the mode-m marginal variation estimate of \mathbf{X}.
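The two formulas that drive each iteration, the E-step weights and the M-step objective G_m, can be sketched in base R. This is not the package's internal code: the function names `e_step_weights` and `envelope_objective`, and the toy matrices `M` and `N`, are assumptions made for illustration. The E-step works on log-densities and subtracts the row maximum before exponentiating, a common stabilization that the description above does not mention.

```r
# E-step: membership weights from mixing proportions and log-densities.
# `log_f` is an n x K matrix with entries log f_k(X_i); `pi_k` has length K.
e_step_weights <- function(log_f, pi_k) {
  log_num <- sweep(log_f, 2, log(pi_k), "+")   # log(pi_k * f_k(X_i))
  log_num <- log_num - apply(log_num, 1, max)  # stabilize before exp()
  num <- exp(log_num)
  num / rowSums(num)                           # each row sums to one
}

# M-step objective: log|Gamma' M Gamma| + log|Gamma' N^{-1} Gamma|.
envelope_objective <- function(Gamma, M, N) {
  logdet <- function(A) as.numeric(determinant(A, logarithm = TRUE)$modulus)
  logdet(t(Gamma) %*% M %*% Gamma) + logdet(t(Gamma) %*% solve(N) %*% Gamma)
}

# Toy E-step: two observations, two clusters, equal mixing proportions.
log_f <- log(matrix(c(0.2, 0.1,
                      0.1, 0.4), nrow = 2, byrow = TRUE))
eta <- e_step_weights(log_f, pi_k = c(0.5, 0.5))

# Toy M-step: the marginal variation N exceeds the conditional variation M
# only along the first coordinate, so that direction separates the clusters.
M <- diag(3)
N <- diag(c(5, 1, 1))
e1 <- matrix(c(1, 0, 0), ncol = 1)
e2 <- matrix(c(0, 1, 0), ncol = 1)
g1 <- envelope_objective(e1, M, N)   # equals -log(5)
g2 <- envelope_objective(e2, M, N)   # equals 0
```

In this toy case the objective is smaller at `e1` than at `e2`, so minimizing G_m favors the direction in which the marginal variation exceeds the within-cluster variation.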

`id`	A vector of estimated labels.
`pi`	A vector of estimated prior probabilities for clusters.
`eta`	An `n` by `K` matrix of estimated membership weights.
`Mu.est`	A list of estimated cluster means.
`SIG.est`	A list of estimated covariance matrices.
`Mm`	Estimate of `Mm` as defined in the paper.
`Nm`	Estimate of `Nm` as defined in the paper.
`Gamma.est`	A list of estimated envelope bases.
`PGamma.est`	A list of envelope projection matrices.
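The sketch below shows one way the returned quantities relate to each other, using stand-in objects rather than a real fit. It assumes that each projection matrix has the form Γ Γ^T for a semi-orthogonal basis Γ and that hard labels are the row-wise maximizers of the membership weights; neither is confirmed by this page. The helper `mode_product` and the objects `Gamma`, `PGamma`, `X_star` and `eta` are hypothetical.

```r
# Mode-m product X x_m A: multiply the mode-m unfolding of X by A, then refold.
mode_product <- function(X, A, m) {
  d <- dim(X)
  perm <- c(m, seq_along(d)[-m])
  Xm <- matrix(aperm(X, perm), nrow = d[m])    # mode-m unfolding
  Y <- A %*% Xm
  d_new <- d
  d_new[m] <- nrow(A)
  aperm(array(Y, dim = d_new[perm]), order(perm))
}

set.seed(2)
# Stand-ins for a list of bases: a 4 x 3 tensor with envelope dimensions (2, 1).
Gamma <- list(qr.Q(qr(matrix(rnorm(4 * 2), 4, 2))),
              qr.Q(qr(matrix(rnorm(3 * 1), 3, 1))))

# Projection onto each envelope subspace (assumed form: Gamma %*% t(Gamma)).
PGamma <- lapply(Gamma, tcrossprod)

# Material part of one observation: X x_1 Gamma_1' x_2 Gamma_2'.
X <- array(rnorm(12), dim = c(4, 3))
X_star <- mode_product(mode_product(X, t(Gamma[[1]]), 1), t(Gamma[[2]]), 2)
dim(X_star)

# Hard labels from a membership-weight matrix shaped like `eta`.
eta <- matrix(c(0.9, 0.1,
                0.2, 0.8,
                0.6, 0.4), ncol = 2, byrow = TRUE)
id <- max.col(eta, ties.method = "first")
```

For a matrix-valued observation the two mode products reduce to `t(Gamma[[1]]) %*% X %*% Gamma[[2]]`, which is a quick way to check the helper.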