correlationSquaredDecomp: Compute SVD of squared correlation matrix


Description

Given the SVD of C, compute the SVD of C^2

Usage


Arguments

V

eigenvectors of the correlation matrix

d

*singular* values of correlation matrix

rank

use the first 'rank' singular vectors from the SVD. Increasing 'rank' increases the accuracy of the estimate, but note that the computational complexity is O(P choose(rank, 2)), where P is the number of features in the dataset

Details

Consider a data matrix X (N x P) of P features and N samples, where N << P. Let the columns of X be scaled so that the P x P correlation matrix is C = X^T X. C is often too big to compute directly, since it has O(P^2) entries and costs O(P^3) to invert. But we can compute the SVD of X in O(PN^2) time. The goal is to compute the SVD of the matrix C^2, given only the SVD of C, in less than O(P^2) time. Here C^2 denotes the elementwise square of C, as computed by C^2 in the example. We compute this SVD of C^2 in O(PN^4) time, which is tractable for small N. Moreover, if we use an SVD X = UDV^T of rank R, we can approximate the SVD of C^2 in O(PR^4) time using only D and V. In practice, this can be reduced to O(P (choose(N,2) + N)^4).
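
The following is a minimal sketch of why V and d are sufficient. It is not necessarily how correlationSquaredDecomp is implemented. Write C = sum_k l_k v_k v_k^T with l_k = d_k^2. The elementwise square is then the sum over all pairs (k, m) of l_k l_m (v_k * v_m)(v_k * v_m)^T, where * is the elementwise product. Collecting the choose(R,2) + R distinct pairs as columns of a matrix W gives C^2 = W W^T, so the left singular vectors and squared singular values of W are the eigenvectors and eigenvalues of C^2. The names W, idx and w are illustrative and do not come from the package.

```r
set.seed(1)
N <- 20   # samples
P <- 60   # features

# Scaled feature matrix, so that crossprod(X) is the correlation matrix
X <- scale(matrix(rnorm(N * P), N, P)) / sqrt(N - 1)
dcmp <- svd(X, nu = 0)

# Drop numerically zero singular values (centering removes one dimension)
keep <- dcmp$d > 1e-8 * dcmp$d[1]
V <- dcmp$v[, keep, drop = FALSE]
lambda <- dcmp$d[keep]^2          # eigenvalues of C
R <- length(lambda)

# All index pairs (k, m) with k <= m: choose(R, 2) + R of them
idx <- which(upper.tri(diag(R), diag = TRUE), arr.ind = TRUE)
k <- idx[, 1]
m <- idx[, 2]

# Off-diagonal pairs occur twice in the double sum, hence the sqrt(2)
w <- sqrt(lambda[k] * lambda[m]) * ifelse(k == m, 1, sqrt(2))

# Columns of W are weighted elementwise products of eigenvector pairs
W <- sweep(V[, k, drop = FALSE] * V[, m, drop = FALSE], 2, w, "*")

# The SVD of W yields the eigendecomposition of the elementwise square
sv <- svd(W, nv = 0)
Csq_hat <- sv$u %*% (sv$d^2 * t(sv$u))

# Check against the direct O(P^2) computation
Csq <- crossprod(X)^2
print(max(abs(Csq_hat - Csq)))
```

W has P rows and choose(R,2) + R columns, so the P x P matrix is formed here only to check the result.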

Value

The SVD of C^2

Examples

N = 50 # samples 
P = 200 # features

# Simulate feature matrix
X = matrix(rnorm(N*P), N, P)

# Scale feature matrix
X = scale(X) / sqrt(N-1)

# Compute SVD of feature matrix
dcmp = svd(X, nu=0)

# Compute correlation and squared correlation matrices
# This is O(P^2)
C = crossprod(X) 
Csq = C^2

# Compute SVD of Csq using only the svd of C
# this is faster than O(PN^4)
# if R is the rank of X
# Time is O(P (choose(N,2) + N)^4)
dcmp_C2 = correlationSquaredDecomp( dcmp$v, dcmp$d )
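
The effect of 'rank' on accuracy can be illustrated without the package. The sketch below assumes that truncation means keeping the first r singular vectors and values of X. The helper csq_from_rank is hypothetical and forms the full P x P matrix, so it shows only the approximation error, not the speed-up.

```r
set.seed(2)
N <- 15   # samples
P <- 40   # features

X <- scale(matrix(rnorm(N * P), N, P)) / sqrt(N - 1)
dcmp <- svd(X, nu = 0)
Csq <- crossprod(X)^2

# Hypothetical helper (not part of the package): elementwise square of
# the rank-r reconstruction of C, built from V and d only
csq_from_rank <- function(V, d, r) {
  Vr <- V[, seq_len(r), drop = FALSE]
  Cr <- Vr %*% (d[seq_len(r)]^2 * t(Vr))
  Cr^2
}

# Frobenius error of the approximation for increasing rank;
# rank N - 1 is the full rank of the centered X
ranks <- c(2, 5, 10, N - 1)
err <- sapply(ranks, function(r) {
  norm(Csq - csq_from_rank(dcmp$v, dcmp$d, r), "F")
})
print(data.frame(rank = ranks, frobenius_error = err))
```

The error does not increase with the rank and is numerically zero once the rank reaches that of X.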

GabrielHoffman/pinnacle documentation built on May 3, 2019, 3:02 p.m.