In HDTSA: High Dimensional Time Series Analysis Tools

HDSReg

R Documentation

Factor analysis with observed regressors for vector time series

Description

HDSReg() considers a multivariate time series model that represents a high-dimensional vector process as the sum of three terms: a linear regression on some observed regressors, a linear combination of some latent and serially correlated factors, and a vector white noise:

{\bf y}_t = {\bf Dz}_t + {\bf Ax}_t + {\boldsymbol {\epsilon}}_t,

where {\bf y}_t and {\bf z}_t are, respectively, observable p\times 1 and m \times 1 time series, {\bf x}_t is an r \times 1 latent factor process, {\boldsymbol{\epsilon}}_t is a vector white noise process, {\bf D} is an unknown regression coefficient matrix, and {\bf A} is an unknown factor loading matrix. The procedure, proposed in Chang, Guo and Yao (2015), estimates the regression coefficient matrix {\bf D}, the number of factors r and the factor loading matrix {\bf A}.
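
One way to read the model: once the regression term is removed, what remains is a pure factor model. The line below is only a rearrangement of the model equation. The name {\boldsymbol{\eta}}_t is borrowed from the description of `lag.k`, where it is defined with the working coefficient matrix \tilde{\bf D} in place of {\bf D}.

```latex
{\boldsymbol{\eta}}_t \;=\; {\bf y}_t - {\bf Dz}_t \;=\; {\bf Ax}_t + {\boldsymbol{\epsilon}}_t
```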

Usage

HDSReg(
  Y,
  Z,
  D = NULL,
  lag.k = 5,
  thresh = FALSE,
  delta = 2 * sqrt(log(ncol(Y))/nrow(Y)),
  twostep = FALSE
)

Arguments

`Y`	An `n \times p` data matrix `{\bf Y} = ({\bf y}_1, \dots , {\bf y}_n )'`, where `n` is the number of observations of the `p \times 1` time series `\{{\bf y}_t\}_{t=1}^n`.
`Z`	An `n \times m` data matrix `{\bf Z} = ({\bf z}_1, \dots , {\bf z}_n )'` consisting of the observed regressors.
`D`	A `p\times m` regression coefficient matrix `\tilde{\bf D}`. If `D = NULL` (the default), the procedure first estimates `{\bf D}` and sets `\tilde{\bf D}` to that estimate. If `D` is supplied by the user, then `\tilde{\bf D}={\bf D}`.
`lag.k`	The time lag `K` used to calculate the nonnegative definite matrix `\hat{\mathbf{M}}_{\eta}`: `\hat{\mathbf{M}}_{\eta}\ =\ \sum_{k=1}^{K} T_\delta\{\hat{\mathbf{\Sigma}}_{\eta}(k)\} T_\delta\{\hat{\mathbf{\Sigma}}_{\eta}(k)\}',` where `\hat{\bf \Sigma}_{\eta}(k)` is the sample autocovariance of `{\boldsymbol {\eta}}_t = {\bf y}_t - \tilde{\bf D}{\bf z}_t` at lag `k` and `T_\delta(\cdot)` is a threshold operator with the threshold level `\delta \geq 0`. See 'Details'. The default is 5.
`thresh`	Logical. If `thresh = FALSE` (the default), no thresholding will be applied to estimate `\hat{\mathbf{M}}_{\eta}`. If `thresh = TRUE`, `\delta` will be set through `delta`. See 'Details'.
`delta`	The value of the threshold level `\delta`. The default is `\delta = 2 \sqrt{n^{-1}\log p}`.
`twostep`	Logical. The same as the argument `twostep` in `Factors`.
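
The formula given under `lag.k` can be sketched in base R as follows. This is an illustration only, not the package's internal code: the helper names `sample_autocov` and `build_M` are hypothetical, and the centering and the `1/n` normalization of the sample autocovariance are assumptions that the text above does not spell out.

```r
# Hypothetical helper: sample autocovariance of the rows of `eta` at lag k.
# Assumes column-mean centering and division by n.
sample_autocov <- function(eta, k) {
  n <- nrow(eta)
  centered <- sweep(eta, 2, colMeans(eta))
  # (1/n) * sum over t of (eta_{t+k} - mean)(eta_t - mean)'
  crossprod(centered[(k + 1):n, , drop = FALSE],
            centered[1:(n - k), , drop = FALSE]) / n
}

# Hypothetical helper: sum over lags 1..lag.k of T(S_k) T(S_k)',
# where T zeroes entries with absolute value below delta.
build_M <- function(eta, lag.k = 5, delta = 0) {
  p <- ncol(eta)
  M <- matrix(0, p, p)
  for (k in seq_len(lag.k)) {
    S <- sample_autocov(eta, k)
    S_thr <- S * (abs(S) >= delta)   # elementwise thresholding
    M <- M + tcrossprod(S_thr)       # S_thr %*% t(S_thr)
  }
  M
}

# Small synthetic inputs, for illustration only
set.seed(1)
n <- 100; p <- 4; m <- 2
Y <- matrix(rnorm(n * p), n, p)
Z <- matrix(rnorm(n * m), n, m)
D_tilde <- matrix(runif(p * m, -2, 2), p, m)

eta <- Y - Z %*% t(D_tilde)          # rows are eta_t = y_t - D_tilde z_t
M_hat <- build_M(eta, lag.k = 5, delta = 0)

# Same computation with the documented default threshold level
delta_default <- 2 * sqrt(log(ncol(Y)) / nrow(Y))
M_thr <- build_M(eta, lag.k = 5, delta = delta_default)
```

Each summand has the form `S %*% t(S)`, so the resulting matrix is symmetric and nonnegative definite by construction, whatever the threshold level.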

Details

The threshold operator T_\delta(\cdot) is defined as T_\delta({\bf W}) = \{w_{i,j}1(|w_{i,j}|\geq \delta)\} for any matrix {\bf W}=(w_{i,j}), with the threshold level \delta \geq 0 and 1(\cdot) representing the indicator function. We recommend choosing \delta=0 when p is fixed and \delta>0 when p \gg n.
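
As a minimal sketch of this definition, the operator can be written in one line of base R. The function name `threshold` is hypothetical and is not part of the package.

```r
# Hypothetical helper: keep entries with |w_ij| >= delta, set the rest to zero.
threshold <- function(W, delta) W * (abs(W) >= delta)

W <- matrix(c(0.5, -0.05, 0.2, -1.0), 2, 2)

W_thr  <- threshold(W, 0.1)  # only the entry -0.05 falls below 0.1 in absolute value
W_same <- threshold(W, 0)    # delta = 0 leaves W unchanged
```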

Value

An object of class "factors", which contains the following components:

`factor_num`	The estimated number of factors `\hat{r}`.
`reg.coff.mat`	The estimated `p \times m` regression coefficient matrix `\tilde{\bf D}`.
`loading.mat`	The estimated `p \times \hat{r}` factor loading matrix `{\bf \hat{A}}`.
`X`	The `n\times \hat{r}` matrix `\hat{\bf X}=(\hat{\bf x}_1,\dots,\hat{\bf x}_n)'` with `\hat{\mathbf{x}}_t=\hat{\mathbf{A}}'(\mathbf{y}_t-\tilde{\mathbf{D}} \mathbf{z}_t)`.
`lag.k`	The time lag used in the function.
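
The relation between `X`, `loading.mat` and `reg.coff.mat` stated above can be written in matrix form as `\hat{\bf X} = ({\bf Y} - {\bf Z}\tilde{\bf D}')\hat{\bf A}`. The base R sketch below checks this on synthetic inputs. The matrices `D_tilde` and `A_hat` are stand-ins generated at random (with orthonormal columns for `A_hat`, chosen only for illustration), not output from `HDSReg()`.

```r
set.seed(2)
n <- 50; p <- 6; m <- 2; r_hat <- 3

Y <- matrix(rnorm(n * p), n, p)               # n x p observations
Z <- matrix(rnorm(n * m), n, m)               # n x m regressors
D_tilde <- matrix(runif(p * m, -2, 2), p, m)  # stand-in for reg.coff.mat

# Stand-in for loading.mat: a p x r_hat matrix with orthonormal columns
A_hat <- qr.Q(qr(matrix(rnorm(p * r_hat), p, r_hat)))

# Row t of X_hat is x_hat_t' = (y_t - D_tilde z_t)' A_hat
X_hat <- (Y - Z %*% t(D_tilde)) %*% A_hat

# Same quantity for t = 1, computed directly from the per-observation formula
x1_direct <- t(A_hat) %*% (Y[1, ] - D_tilde %*% Z[1, ])
```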

References

Chang, J., Guo, B., & Yao, Q. (2015). High dimensional stochastic regression with latent factors, endogeneity and nonlinearity. Journal of Econometrics, 189, 297–312. doi:10.1016/j.jeconom.2015.03.024.

Examples

# Example 1 (Example 1 in Chang, Guo and Yao (2015)).
library(HDTSA)
## Generate xt
n <- 400
p <- 200
m <- 2
r <- 3
# Three independent AR(1) factors, stored as an r x n matrix
x1 <- arima.sim(model = list(ar = 0.6), n = n)
x2 <- arima.sim(model = list(ar = -0.5), n = n)
x3 <- arima.sim(model = list(ar = 0.3), n = n)
X <- t(cbind(x1, x2, x3))

## Generate zt: a VAR(1) process, stored as an m x n matrix
Z <- matrix(0, m, n)
S1 <- matrix(c(5/8, 1/8, 1/8, 5/8), 2, 2)
Z[, 1] <- rnorm(m)
for (i in 2:n) {
  Z[, i] <- S1 %*% Z[, i - 1] + rnorm(m)
}

## Generate yt
D <- matrix(runif(p * m, -2, 2), ncol = m)
A <- matrix(runif(p * r, -2, 2), ncol = r)
eps <- matrix(rnorm(n * p), p, n)
Y <- D %*% Z + A %*% X + eps
Y <- t(Y)  # n x p, as HDSReg() expects
Z <- t(Z)  # n x m, as HDSReg() expects

## D is known
res1 <- HDSReg(Y, Z, D, lag.k = 2)
## D is unknown
res2 <- HDSReg(Y, Z, lag.k = 2)

HDTSA documentation built on April 3, 2025, 11:07 p.m.