CP_MTS: Estimating the matrix time series CP-factor model
In HDTSA: High Dimensional Time Series Analysis Tools

CP_MTS

R Documentation

Estimating the matrix time series CP-factor model

Description

CP_MTS() deals with the estimation of the CP-factor model for matrix time series:

{\bf{Y}}_t = {\bf A \bf X}_t{\bf B}' + {\boldsymbol{\epsilon}}_t,

where {\bf X}_t = {\rm diag}(x_{t,1},\ldots,x_{t,d}) is a d \times d unobservable diagonal matrix, {\boldsymbol{\epsilon}}_t is a p \times q matrix white noise, {\bf A} and {\bf B} are, respectively, p \times d and q \times d unknown constant matrices with their columns being unit vectors, and 1\leq d < \min(p,q) is an unknown integer. Let {\rm rank}(\mathbf{A}) = d_1 and {\rm rank}(\mathbf{B}) = d_2 with some unknown d_1,d_2\leq d. This function aims to estimate d, d_1, d_2 and the loading matrices {\bf A} and {\bf B} using the methods proposed in Chang et al. (2023) and Chang et al. (2024).

Usage

CP_MTS(
  Y,
  xi = NULL,
  Rank = NULL,
  lag.k = 20,
  lag.ktilde = 10,
  method = c("CP.Direct", "CP.Refined", "CP.Unified"),
  thresh1 = FALSE,
  thresh2 = FALSE,
  thresh3 = FALSE,
  delta1 = 2 * sqrt(log(dim(Y)[2] * dim(Y)[3])/dim(Y)[1]),
  delta2 = delta1,
  delta3 = delta1
)

Arguments

`Y`	An `n \times p \times q` array, where `n` is the number of observations of the `p \times q` matrix time series `\{{\bf Y}_t\}_{t=1}^n`.
`xi`	An `n \times 1` vector `\boldsymbol{\xi} = (\xi_1,\ldots, \xi_n)'`, where `\xi_t` represents a linear combination of `{\bf Y}_t`. If `xi = NULL` (the default), `\xi_{t}` is determined by the PCA method introduced in Section 5.1 of Chang et al. (2023). Otherwise, `xi` can be given by the users.
`Rank`	A list containing the following components: `d` representing the number of columns of `{\bf A}` and `{\bf B}`, `d1` representing the rank of `{\bf A}`, and `d2` representing the rank of `{\bf B}`. If set to `NULL` (default), `d`, `d_1`, and `d_2` will be estimated. Otherwise, they can be given by the users.
`lag.k`	The time lag `K` used to calculate the nonnegative definite matrices `\hat{\mathbf{M}}_1` and `\hat{\mathbf{M}}_2` when `method = "CP.Refined"` or `method = "CP.Unified"`: `\hat{\mathbf{M}}_1\ =\ \sum_{k=1}^{K} \hat{\bf \Sigma}_{k} \hat{\bf \Sigma}_{k}'\ \ {\rm and} \ \ \hat{\mathbf{M}}_2\ =\ \sum_{k=1}^{K} \hat{\bf \Sigma}_{k}' \hat{\bf \Sigma}_{k}\,,` where `\hat{\bf \Sigma}_{k}` is an estimate of the cross-covariance between `{\bf Y}_t` and `\xi_t` at lag `k`. See 'Details'. The default is 20.
`lag.ktilde`	The time lag `\tilde K` involved in the unified estimation method [See (16) in Chang et al. (2024)], which is used when `method = "CP.Unified"`. The default is 10.
`method`	A string indicating which CP-decomposition method is used. Available options include: `"CP.Direct"` (the default) for the direct estimation method [See Section 3.1 of Chang et al. (2023)], `"CP.Refined"` for the refined estimation method [See Section 3.2 of Chang et al. (2023)], and `"CP.Unified"` for the unified estimation method [See Section 4 of Chang et al. (2024)]. The validity of methods `"CP.Direct"` and `"CP.Refined"` depends on the assumption `d_1=d_2=d`. When `d_1,d_2 \leq d`, the method `"CP.Unified"` can be applied. See Chang et al. (2024) for details.
`thresh1`	Logical. If `FALSE` (the default), no thresholding will be applied in `\hat{\bf \Sigma}_{k}`, which indicates that the threshold level `\delta_1=0`. If `TRUE`, `\delta_1` will be set through `delta1`. `thresh1` is used for all three methods. See 'Details'.
`thresh2`	Logical. If `FALSE` (the default), no thresholding will be applied in `\check{\bf \Sigma}_{k}`, which indicates that the threshold level `\delta_2=0`. If `TRUE`, `\delta_2` will be set through `delta2`. `thresh2` is used only when `method = "CP.Refined"`. See 'Details'.
`thresh3`	Logical. If `FALSE` (the default), no thresholding will be applied in `\vec{\bf \Sigma}_{k}`, which indicates that the threshold level `\delta_3=0`. If `TRUE`, `\delta_3` will be set through `delta3`. `thresh3` is used only when `method = "CP.Unified"`. See 'Details'.
`delta1`	The value of the threshold level `\delta_1`. The default is `\delta_1 = 2 \sqrt{n^{-1}\log (pq)}`.
`delta2`	The value of the threshold level `\delta_2`. The default is `\delta_2 = 2 \sqrt{n^{-1}\log (pq)}`.
`delta3`	The value of the threshold level `\delta_3`. The default is `\delta_3 = 2 \sqrt{n^{-1}\log(pq)}`.

Details

All three CP-decomposition methods involve the estimation of the autocovariance of {\bf Y}_t and \xi_t at lag k, which is defined as follows:

\hat{\bf \Sigma}_{k} = T_{\delta_1}\{\hat{\boldsymbol{\Sigma}}_{\mathbf{Y}, \xi}(k)\}\ \ {\rm with}\ \ \hat{\boldsymbol{\Sigma}}_{\mathbf{Y}, \xi}(k) = \frac{1}{n-k} \sum_{t=k+1}^n(\mathbf{Y}_t-\bar{\mathbf{Y}})(\xi_{t-k}-\bar{\xi})\,,

where \bar{\bf Y} = n^{-1}\sum_{t=1}^n {\bf Y}_t, \bar{\xi}=n^{-1}\sum_{t=1}^n \xi_t and T_{\delta_1}(\cdot) is a threshold operator defined as T_{\delta_1}({\bf W}) = \{w_{i,j}1(|w_{i,j}|\geq \delta_1)\} for any matrix {\bf W}=(w_{i,j}), with the threshold level \delta_1 \geq 0 and 1(\cdot) representing the indicator function. Chang et al. (2023) and Chang et al. (2024) suggest to choose \delta_1 = 0 when p, q are fixed and \delta_1>0 when pq \gg n.

The refined estimation method involves

\check{\bf \Sigma}_{k} = T_{\delta_2}\{\hat{\mathbf{\Sigma}}_{\check{\mathbf{Y}}}(k)\}\ \ {\rm with} \ \ \hat{\mathbf{\Sigma}}_{\check{\mathbf{Y}}}(k)=\frac{1}{n-k} \sum_{t=k+1}^n(\mathbf{Y}_t-\bar{\mathbf{Y}}) \otimes {\rm vec} (\mathbf{Y}_{t-k}-\bar{\mathbf{Y}})\,,

where T_{\delta_2}(\cdot) is a threshold operator with the threshold level \delta_2 \geq 0, and {\rm vec}(\cdot) is a vecterization operator with {\rm vec}({\bf H}) being the (m_1m_2)\times 1 vector obtained by stacking the columns of the m_1 \times m_2 matrix {\bf H}. See Section 3.2.2 of Chang et al. (2023) for details.

The unified estimation method involves

\vec{\bf \Sigma}_{k}= T_{\delta_3}\{\hat{\boldsymbol{\Sigma}}_{\vec{\mathbf{Y}}}(k)\} \ \ {\rm with}\ \ \hat{\boldsymbol{\Sigma}}_{\vec{\mathbf{Y}}}(k)=\frac{1}{n-k} \sum_{t=k+1}^n{\rm vec}({\mathbf{Y}}_t-\bar{\mathbf{Y}})\{{\rm vec} (\mathbf{Y}_{t-k}-\bar{\mathbf{Y}})\}'\,,

where T_{\delta_3}(\cdot) is a threshold operator with the threshold level \delta_3 \geq 0. See Section 4.2 of Chang et al. (2024) for details.

Value

An object of class "mtscp", which contains the following components:

`A`	The estimated `p \times \hat{d}` left loading matrix `\hat{\bf A}`.
`B`	The estimated `q \times \hat{d}` right loading matrix `\hat{\bf B}`.
`f`	The estimated latent process `\hat{x}_{t,1},\ldots,\hat{x}_{t,\hat{d}}`.
`Rank`	The estimated `\hat{d}_1,\hat{d}_2`, and `\hat{d}`.
`method`	A string indicating which CP-decomposition method is used.

References

Chang, J., Du, Y., Huang, G., & Yao, Q. (2024). Identification and estimation for matrix time series CP-factor models. arXiv preprint. \Sexpr[results=rd]{tools:::Rd_expr_doi("doi:10.48550/arXiv.2410.05634")}.

Chang, J., He, J., Yang, L., & Yao, Q. (2023). Modelling matrix time series via a tensor CP-decomposition. Journal of the Royal Statistical Society Series B: Statistical Methodology, 85, 127–148. \Sexpr[results=rd]{tools:::Rd_expr_doi("doi:10.1093/jrsssb/qkac011")}.

Examples

# Example 1.
p <- 10
q <- 10
n <- 400
d = d1 = d2 <- 3
## DGP.CP() generates simulated data for the example in Chang et al. (2024).
data <- DGP.CP(n, p, q, d, d1, d2)
Y <- data$Y

## d is unknown
res1 <- CP_MTS(Y, method = "CP.Direct")
res2 <- CP_MTS(Y, method = "CP.Refined")
res3 <- CP_MTS(Y, method = "CP.Unified")

## d is known
res4 <- CP_MTS(Y, Rank = list(d = 3), method = "CP.Direct")
res5 <- CP_MTS(Y, Rank = list(d = 3), method = "CP.Refined")


# Example 2.
p <- 10
q <- 10
n <- 400
d1 = d2 <- 2
d <-3
data <- DGP.CP(n, p, q, d, d1, d2)
Y1 <- data$Y

## d, d1 and d2 are unknown
res6 <- CP_MTS(Y1, method = "CP.Unified")
## d, d1 and d2 are known
res7 <- CP_MTS(Y1, Rank = list(d = 3, d1 = 2, d2 = 2), method = "CP.Unified")

HDTSA documentation built on April 3, 2025, 11:07 p.m.