```r
knitr::opts_chunk$set(cache=FALSE)
library(tensorEVD)
```
Let the $n\times n$ matrix $\textbf{K}$ be a Hadamard product involving two smaller symmetric positive semi-definite matrices (e.g., covariance structures) $\textbf{K}_1$ and $\textbf{K}_2$ of dimensions $n_1\times n_1$ and $n_2\times n_2$, respectively,
$$ \textbf{K} = (\textbf{Z}_1 \textbf{K}_1 \textbf{Z}'_1) \odot (\textbf{Z}_2 \textbf{K}_2 \textbf{Z}'_2) $$
where $\textbf{Z}_1$ and $\textbf{Z}_2$ are incidence matrices mapping from rows (and columns) of the resulting Hadamard $\textbf{K}$ to rows (and columns) of $\textbf{K}_1$ and $\textbf{K}_2$, respectively.
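As a small numerical illustration (a minimal sketch with hypothetical toy dimensions and names, not part of the package), the incidence matrices can be built by indexing an identity matrix, and the Hadamard product $\textbf{K}$ formed directly:

```r
# Sketch: build incidence matrices Z1, Z2 from integer mappings and
# form the Hadamard product K (toy dimensions, hypothetical names)
n1 <- 3; n2 <- 2; n <- 4
K1 <- crossprod(matrix(rnorm(n1*n1), ncol=n1))   # small PSD matrix
K2 <- crossprod(matrix(rnorm(n2*n2), ncol=n2))   # small PSD matrix

ID1 <- c(1, 1, 2, 3)                 # row of K1 mapped to each row of K
ID2 <- c(2, 1, 1, 2)                 # row of K2 mapped to each row of K
Z1 <- diag(n1)[ID1, , drop=FALSE]    # n x n1 incidence matrix
Z2 <- diag(n2)[ID2, , drop=FALSE]    # n x n2 incidence matrix

K <- (Z1 %*% K1 %*% t(Z1)) * (Z2 %*% K2 %*% t(Z2))   # '*' is element-wise
all.equal(K, K1[ID1,ID1] * K2[ID2,ID2])              # same, via indexing
```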
Let the eigenvalue decomposition (EVD) of $\textbf{K}_1$ and $\textbf{K}_2$ be $\textbf{K}_1 = \textbf{V}_1 \textbf{D}_1 \textbf{V}'_1$ and $\textbf{K}_2 = \textbf{V}_2 \textbf{D}_2 \textbf{V}'_2$. Using properties of the Hadamard and Kronecker products, an EVD of the Hadamard product $\textbf{K}$ can be derived from the EVD of the corresponding Kronecker product of $\textbf{K}_1$ and $\textbf{K}_2$ as
$$ \begin{align} \textbf{K} &= \tilde{\textbf{V}} \textbf{D} \tilde{\textbf{V}}' \\ &= (\textbf{Z}_1\star \textbf{Z}_2)\textbf{V}\textbf{D}\textbf{V}'(\textbf{Z}_1\star \textbf{Z}_2)' \end{align} $$
where $\textbf{V} = \textbf{V}_1\otimes \textbf{V}_2 = [\boldsymbol{v}_1,\dots,\boldsymbol{v}_N]$ and $\textbf{D} = \textbf{D}_1\otimes \textbf{D}_2 = diag(d_1,\dots,d_N)$ are Kronecker products of the eigenvectors and eigenvalues, respectively, of $\textbf{K}_1$ and $\textbf{K}_2$. It can be shown that these correspond to the $N = n_1\times n_2$ eigenvectors and eigenvalues of the Kronecker product $\textbf{K}_1\otimes\textbf{K}_2$; therefore, the columns of $\textbf{V}$ are orthonormal vectors, i.e., $\textbf{V}'\textbf{V} = \textbf{I}_N$, and the elements of $\textbf{D}$ are such that $d_1 \ge \dots \ge d_N \ge 0$. The term $\textbf{Z}_1\star\textbf{Z}_2$ is an $n\times N$ matrix obtained as the "face-splitting product" (aka "transposed Khatri–Rao product") of matrices $\textbf{Z}_1$ and $\textbf{Z}_2$, which is defined as a row-by-row Kronecker product
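This property can be checked numerically. Continuing the toy sketch above, the following verifies that the Kronecker products of the factor-wise eigenvectors and eigenvalues reconstruct $\textbf{K}_1\otimes\textbf{K}_2$ and that the combined eigenvectors are orthonormal:

```r
# Sketch: EVD of a Kronecker product from the EVDs of its factors
EVD1 <- eigen(K1)
EVD2 <- eigen(K2)
V <- kronecker(EVD1$vectors, EVD2$vectors)   # N = n1*n2 eigenvectors
D <- kronecker(EVD1$values,  EVD2$values)    # N eigenvalues (not yet sorted)

all.equal(kronecker(K1, K2), V %*% diag(D) %*% t(V))  # reconstructs K1 (x) K2
all.equal(crossprod(V), diag(length(D)))              # V'V = I_N
```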
$$ \textbf{Z}_1\star\textbf{Z}_2 = \begin{pmatrix} \boldsymbol{z}_{11}\otimes\boldsymbol{z}_{12} \\ \boldsymbol{z}_{21}\otimes\boldsymbol{z}_{22} \\ \vdots \\ \boldsymbol{z}_{n1}\otimes\boldsymbol{z}_{n2} \end{pmatrix} $$
with $\boldsymbol{z}_{i1}$ and $\boldsymbol{z}_{i2}$ being the $i^{th}$ row of $\textbf{Z}_1$ and $\textbf{Z}_2$, respectively.
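The face-splitting product is not built into R, but a small hypothetical helper illustrates the definition (continuing the sketch above); the last line checks the identity $\textbf{K} = (\textbf{Z}_1\star\textbf{Z}_2)(\textbf{K}_1\otimes\textbf{K}_2)(\textbf{Z}_1\star\textbf{Z}_2)'$:

```r
# Sketch: face-splitting (transposed Khatri-Rao) product as a
# row-by-row Kronecker product
face_split <- function(A, B) {
  t(sapply(seq_len(nrow(A)), function(i) kronecker(A[i, ], B[i, ])))
}
ZsZ <- face_split(Z1, Z2)   # n x (n1*n2)

all.equal(K, ZsZ %*% kronecker(K1, K2) %*% t(ZsZ))
```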
The `tensorEVD()` function derives the decomposition of the Hadamard product $\textbf{K} = \tilde{\textbf{V}} \textbf{D} \tilde{\textbf{V}}'$ formed from two matrices $\textbf{K}_1$ and $\textbf{K}_2$. The matrix $\tilde{\textbf{V}} = (\textbf{Z}_1\star\textbf{Z}_2)\textbf{V} = [\tilde{\boldsymbol{v}}_1,\dots,\tilde{\boldsymbol{v}}_N]$ is efficiently computed by deriving each column as a Hadamard product ('$\odot$') using the corresponding $i_k^{th}$ and $j_k^{th}$ eigenvectors $\boldsymbol{v}_{1i_k}$ and $\boldsymbol{v}_{2j_k}$ of $\textbf{V}_1$ and $\textbf{V}_2$, respectively, that form the $k^{th}$ eigenvector $\boldsymbol{v}_k$ of $\textbf{V}$, that is,

$$ \tilde{\boldsymbol{v}}_k = (\textbf{Z}_1 \boldsymbol{v}_{1i_k}) \odot (\textbf{Z}_2 \boldsymbol{v}_{2j_k}) $$
The terms $\textbf{Z}_1 \boldsymbol{v}_{1i_k}$ and $\textbf{Z}_2 \boldsymbol{v}_{2j_k}$ can be obtained by matrix indexing using integer vectors `ID1` and `ID2` of length $n$, for instance, as `v1ik[ID1]` and `v2jk[ID2]`. The entries of `ID1` and `ID2` are the row (and column) numbers of $\textbf{K}_1$ and $\textbf{K}_2$ that are mapped to each row of $\textbf{Z}_1$ and $\textbf{Z}_2$, respectively. Therefore, `tensorEVD()` can be run using as inputs the covariance structure matrices and the ID vectors, e.g.,
```r
tensorEVD(K1, K2, ID1, ID2)
```
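The equivalence between multiplying by an incidence matrix and integer indexing can be checked directly (a sketch using the toy objects defined earlier; `v1ik` and `v2jk` are hypothetical names for one eigenvector of each matrix):

```r
# Sketch: Z1 %*% v equals v[ID1], so each column of V-tilde can be formed
# by indexing followed by an element-wise (Hadamard) product
v1ik <- EVD1$vectors[, 1]   # an eigenvector of K1
v2jk <- EVD2$vectors[, 1]   # an eigenvector of K2

all.equal(as.vector(Z1 %*% v1ik), v1ik[ID1])
all.equal(v1ik[ID1] * v2jk[ID2],
          as.vector((Z1 %*% v1ik) * (Z2 %*% v2jk)))
```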
When $\textbf{Z}_1$ and $\textbf{Z}_2$ are such that the Hadamard product $\textbf{K}$ represents a Kronecker product of $\textbf{K}_1$ and $\textbf{K}_2$ (i.e., $n = n_1\times n_2$ combinations, each element of $\textbf{K}_1$ crossed with one and only one element of $\textbf{K}_2$), then the results of `tensorEVD()` are exactly the same as those obtained by performing the EVD directly on $\textbf{K}$ using the `eigen()` function from the 'base' R-package.
```r
# Simulating covariance matrices K1 and K2
n1 = 10; n2 = 15
K1 <- crossprod(matrix(rnorm(n1*(n1+10), sd=sqrt(1/n1)), ncol=n1))
K2 <- crossprod(matrix(rnorm(n2*(n2+10), sd=sqrt(1/n2)), ncol=n2))

ID1 <- rep(seq(n1), each=n2)
ID2 <- rep(seq(n2), times=n1)

# Direct EVD of the Hadamard product
K <- Hadamard(K1, K2, ID1, ID2, ID1, ID2)  # same as K = K1[ID1,ID1]*K2[ID2,ID2]
EVD0 <- eigen(K)

# Tensor EVD using K1 and K2
EVD <- tensorEVD(K1, K2, ID1, ID2)

# Eigenvalues and (absolute) eigenvectors are numerically equal
all.equal(EVD0$values, EVD$values)
all.equal(abs(EVD0$vectors), abs(EVD$vectors))
```
For unbalanced designs (i.e., combinations appearing at different frequencies), the eigenvectors and eigenvalues are no longer equivalent:
```r
n <- n1*n2                                # size of the Hadamard product
ID1 <- sample(seq(n1), n, replace=TRUE)   # random sample of ID1
ID2 <- sample(seq(n2), n, replace=TRUE)   # random sample of ID2

K <- Hadamard(K1, K2, ID1, ID2, ID1, ID2)
EVD0 <- eigen(K)
EVD <- tensorEVD(K1, K2, ID1, ID2)

all.equal(EVD0$values, EVD$values)
all.equal(abs(EVD0$vectors), abs(EVD$vectors))
```
However, `tensorEVD()` will still produce eigenvalues whose total sum equals $\mathrm{trace}(\textbf{K})$ and will provide the same approximation $\textbf{K} = \tilde{\textbf{V}} \textbf{D} \tilde{\textbf{V}}'$:
```r
# Sum of eigenvalues
c(sum(EVD0$values), sum(EVD$values), sum(diag(K)))

# Approximation for K
K01 <- EVD0$vectors %*% diag(EVD0$values) %*% t(EVD0$vectors)
K02 <- EVD$vectors %*% diag(EVD$values) %*% t(EVD$vectors)
c(all.equal(K, K01), all.equal(K, K02))
```
The eigenvectors with positive eigenvalues are the only ones needed to span the Hadamard matrix $\textbf{K} = (\textbf{Z}_1 \textbf{K}_1 \textbf{Z}'_1) \odot (\textbf{Z}_2 \textbf{K}_2 \textbf{Z}'_2)$. The size of this set (i.e., the rank of $\textbf{K}$) is at most the minimum of $n$ and $n_1\times n_2$. The `tensorEVD()` algorithm produces the complete basis containing the $n_1\times n_2$ eigenvectors of the Kronecker product $\textbf{K}_1\otimes \textbf{K}_2$. As a consequence, `tensorEVD()` can provide more vectors than the ones needed to span $\textbf{K}$ when the size of the Hadamard product ($n$) is considerably smaller than that of the corresponding Kronecker product ($n_1\times n_2$). For example,
```r
n = n1*n2/2   # size of the Hadamard is half of n1 x n2
ID1 <- sample(seq(n1), n, replace=TRUE)
ID2 <- sample(seq(n2), n, replace=TRUE)

K <- Hadamard(K1, K2, ID1, ID2, ID1, ID2)
EVD0 <- eigen(K)
EVD <- tensorEVD(K1, K2, ID1, ID2)

# Number of eigenvectors with positive eigenvalue
c(eigen=sum(EVD0$values > 1E-10), tensorEVD=sum(EVD$values > 1E-10))
```
However, when $n$ is larger than $n_1\times n_2$, both approaches provide a similar number of eigenvectors. For the balanced, replicated case, the number of eigenvectors will be the same:
```r
# Size of the Hadamard is three times n1 x n2
# Balanced and replicated case
ID1 <- rep(rep(seq(n1), each=n2), 3)
ID2 <- rep(rep(seq(n2), times=n1), 3)

K <- Hadamard(K1, K2, ID1, ID2, ID1, ID2)
EVD0 <- eigen(K)
EVD <- tensorEVD(K1, K2, ID1, ID2)

c(eigen=sum(EVD0$values > 1E-10), tensorEVD=sum(EVD$values > 1E-10))
```
Instead of forming all possible eigenvectors, `tensorEVD()` allows specifying a proportion of variance explained ($0\lt\alpha\leq 1$) and builds only the eigenvectors needed to reach that proportion of variance. For example,
```r
alpha <- 0.95
EVD <- tensorEVD(K1, K2, ID1, ID2, alpha=alpha)
ncol(EVD$vectors)

# For the direct EVD
varexp = cumsum(EVD0$values/sum(EVD0$values))
index = 1:which.min(abs(varexp-alpha))
ncol(EVD0$vectors[,index])
```
Row and column names for the eigenvectors of the Hadamard product can be retrieved using the `make.dimnames` argument. The `rownames` attribute of the eigenvectors will be produced by crossing the `rownames` of $\textbf{K}_1$ with those of $\textbf{K}_2$. The `colnames` attribute will contain the cross between the eigenvector positions in the EVD of $\textbf{K}_1$ and those in the EVD of $\textbf{K}_2$ that form each eigenvector of the Hadamard product. For instance,
```r
dimnames(K1) <- list(paste0("i", seq(n1)), paste0("i", seq(n1)))
dimnames(K2) <- list(paste0("j", seq(n2)), paste0("j", seq(n2)))

EVD <- tensorEVD(K1, K2, ID1, ID2, make.dimnames=TRUE)
EVD$vectors[1:6, 1:5]
```
A pre-calculated EVD of $\textbf{K}_1$ or $\textbf{K}_2$ can also be provided to the function, so it will not be calculated again:
```r
EVD2 <- eigen(K2)
EVD0 <- tensorEVD(K1=K1, EVD2=EVD2, ID1=ID1, ID2=ID2)
EVD <- tensorEVD(K1=K1, K2=K2, ID1=ID1, ID2=ID2)

all.equal(EVD0$values, EVD$values)
all.equal(abs(EVD0$vectors), abs(EVD$vectors))
```