# tPCAaug: Order Determination for Tensorial PCA Using Augmentation In tensorBSS: Blind Source Separation Methods for Tensor-Valued Observations

## Description

In a tensorial PCA context the dimensions of a core tensor are estimated based on augmentation of additional noise components. Information from both eigenvectors and eigenvalues are then used to obtain the dimension estimates.

## Usage

 ```1 2``` ```tPCAaug(x, noise = "median", naug = 1, nrep = 1, sigma2 = NULL, alpha = NULL) ```

## Arguments

 `x` array of an order at least three with the last dimension corresponding to the sampling units. `noise` specifies how to estimate the noise variance. Can be one of `"median"`, `"quantile"`, `"last"`, `"known"`. Default is `"median"`. See details for further information. `naug` number of augmented variables in each mode. Default is 1. `nrep` number of repetitions for the augmentation. Default is 1. `sigma2` if `noise = "known"` the value of the noise variance. `alpha` if `noise = "quantile"` this specifies the quantile to be used.

## Details

For simplicity details are given for matrix valued observations.

Assume having a sample of p_1 x p_2 matrix-valued observations which are realizations of the model X = U_L Z U_R'+ N, where U_L and U_R are matrices with orthonormal columns, Z is the random, zero mean k_1 x k_2 core matrix with k_1 <= p_1 and k_2 <= p_2. N is p_1 x p_2 matrix-variate noise that follows a matrix variate spherical distribution with E(N) = 0 and E(N N') = sigma^2 I_p_1 and is independent from Z. The goal is to estimate k_1 and k_2. For that purpose the eigenvalues and eigenvectors of the left and right covariances are used. To evaluate the variation in the eigenvectors, in each mode the matrix X is augmented with `naug` normally distributed components appropriately scaled by noise standard deviation. The procedure can be repeated `nrep` times to reduce random variation in the estimates.

The procedure needs an estimate of the noise variance and four options are available via the argument `noise`:

1. `noise = "median"`: Assumes that at least half of components are noise and uses thus the median of the pooled and scaled eigenvalues as an estimate.

2. `noise = "quantile"`: Assumes that at least 100 `alpha` % of the components are noise and uses the mean of the lower `alpha` quantile of the pooled and scaled eigenvalues from all modes as an estimate.

3. `noise = "last"`: Uses the pooled information from all modes and then the smallest eigenvalue as estimate.

4. `noise = "known"`: Assumes the error variance is known and needs to be provided via `sigma2`.

## Value

A list of class 'taug' inheriting from class 'tladle' and containing:

 `U` list containing the modewise rotation matrices. `D` list containing the modewise eigenvalues. `S` array of the same size as `x` containing the principal components. `ResMode` a list with the modewise results which are lists containing: modelabel for the mode. kthe order estimated for that mode. fnvector giving the measures of variation of the eigenvectors. phinnormalized eigenvalues. lambdathe unnormalized eigenvalues used to compute phin. gnthe main criterion augmented order estimator. compvector from 0 to the number of dimensions to be evaluated. `xmu` the data location `data.name` string with the name of the input data `method` string `tPCA`. `Sigma2` estimate of standardized sigma2 from the model described above or the standardized provided value. Sigma2 is the estimate for the variance of individual entries of `N`. `AllSigHat2` vector of noise variances used for each mode.

## References

Radojicic, U., Lietzen, N., Nordhausen, K. and Virta, J. (2021), On order determinaton in 2D PCA. Manuscript.

`tPCA`, `tPCAladle`
 ``` 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55``` ```library(ICtest) # matrix-variate example n <- 200 sig <- 0.6 Z <- rbind(sqrt(0.7)*rt(n,df=5)*sqrt(3/5), sqrt(0.3)*runif(n,-sqrt(3),sqrt(3)), sqrt(0.3)*(rchisq(n,df=3)-3)/sqrt(6), sqrt(0.9)*(rexp(n)-1), sqrt(0.1)*rlogis(n,0,sqrt(3)/pi), sqrt(0.5)*(rbeta(n,2,2)-0.5)*sqrt(20) ) dim(Z) <- c(3, 2, n) U1 <- rorth(12)[,1:3] U2 <- rorth(8)[,1:2] U <- list(U1=U1, U2=U2) Y <- tensorTransform2(Z,U,1:2) EPS <- array(rnorm(12*8*n, mean=0, sd=sig), dim=c(12,8,n)) X <- Y + EPS TEST <- tPCAaug(X) # Dimension should be 3 and 2 and (close to) sigma2 0.36 TEST # Noise variance in i-th mode is equal to Sigma2 multiplied by the product # of number of colums of all modes except i-th one TEST\$Sigma2*c(8,12) # This is returned as TEST\$AllSigHat2 # higher order tensor example Z2 <- rnorm(n*3*2*4*10) dim(Z2) <- c(3,2,4,10,n) U2.1 <- rorth(12)[ ,1:3] U2.2 <- rorth(8)[ ,1:2] U2.3 <- rorth(5)[ ,1:4] U2.4 <- rorth(20)[ ,1:10] U2 <- list(U1 = U2.1, U2 = U2.2, U3 = U2.3, U4 = U2.4) Y2 <- tensorTransform2(Z2, U2, 1:4) EPS2 <- array(rnorm(12*8*5*20*n, mean=0, sd=sig), dim=c(12, 8, 5, 20, n)) X2 <- Y2 + EPS2 TEST2 <- tPCAaug(X2) TEST2 ```