incRpca.rc: Incremental PCA With Reduced Complexity

Incremental PCA With Reduced Complexity

Description

The incremental PCA is computed without rotating the updated projection space (Brand, 2002; Arora et al., 2012). Specifically, the PCs are specified through a matrix of orthogonal vectors Ut that spans the PC space and a rotation matrix Us such that the PC matrix is UtUs. Given a new data vector, the PCA is updated by adding one column to Ut and recalculating the low-dimensional rotation matrix Us. This reduces complexity and helps preserve orthogonality. Eigenvalues are updated as in the usual incremental PCA algorithm.
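
For example, the individual PCs can be recovered at any time by multiplying the two factors (a minimal sketch, with Ut and Us as above):

U <- Ut %*% Us  # columns of U are the current principal components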

Usage

incRpca.rc(lambda, Ut, Us, x, n, f = 1/n, center, tol = 1e-07)

Arguments

lambda

vector of eigenvalues.

Ut

matrix of orthogonal vectors stored in columns.

Us

rotation matrix.

x

new data vector.

n

sample size before observing x.

f

forgetting factor: a number in (0,1).

center

optional centering vector for x.

tol

numerical tolerance for eigenvalues.
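
For illustration, a minimal single-update sketch (toy sizes, no centering; all object names are illustrative):

library(onlinePCA)
set.seed(1)
d <- 50; n <- 20; q <- 5
x <- matrix(rnorm((n+1)*d), n+1, d)
pca <- batchpca(x[1:n,], q, byrow=TRUE)  # initial PCA on the first n rows
# One reduced-complexity update with the (n+1)-th observation
out <- incRpca.rc(pca$values, pca$vectors, diag(q), x[n+1,], n)
str(out)  # values, Ut, Us; the dimension may grow by one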

Details

For large datasets, this algorithm is considerably faster than its counterpart incRpca, reducing the time complexity of each update from O(qd^2) to O(qd + q^3) flops, where d is the length of x. A consequence of not rotating the PC basis at each update is that the dimension of the PCA decomposition increases whenever a new observation vector is not entirely contained in the current PC space. To keep the number of PCs and eigenvalues from growing too large, the matrices Ut and Us should be multiplied together at regular time intervals so as to recover the individual PCs and retain only the largest ones.
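
A sketch of this periodic rotation and truncation step, assuming pca is the list returned by incRpca.rc and q is the target number of PCs (this mirrors the Examples below):

pca$values <- pca$values[1:q]        # keep the q largest eigenvalues
pca$Ut <- pca$Ut %*% pca$Us[, 1:q]   # recover and truncate the PCs
pca$Us <- diag(q)                    # reset the rotation to the identity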

Value

A list with components

values

updated eigenvalues in decreasing order.

Ut

updated projection space.

Us

updated rotation matrix.

References

Arora et al. (2012). Stochastic Optimization for PCA and PLS. 50th Annual Conference on Communication, Control, and Computing (Allerton).
Brand, M. (2002). Incremental singular value decomposition of uncertain data with missing values. European Conference on Computer Vision (ECCV).

See Also

incRpca, incRpca.block

Examples

## Not run: 
# Data generation
n <- 400 # number of units
d <- 10000 # number of variables
n0 <- 200 # initial sample
q <- 20 # required number of PCs
x <- matrix(rnorm(n*d,sd=1/sqrt(d)),n,d) # data matrix
x <- t(apply(x,1,cumsum)) # standard Brownian motion

# Initial PCA
xbar0 <- colMeans(x[1:n0,])
pca0 <- batchpca(x[1:n0,], q, center=xbar0, byrow=TRUE)

# Incremental PCA with rotation
xbar <- xbar0
pca1 <- pca0
system.time({
for (i in n0:(n-1)) {
	xbar <- updateMean(xbar, x[i+1,], i)
	pca1 <- incRpca(pca1$values, pca1$vectors, x[i+1,], i, center=xbar)
	}
})

# Incremental PCA without rotation
xbar <- xbar0
pca2 <- list(values=pca0$values, Ut=pca0$vectors, Us=diag(q))

system.time({
for (i in n0:(n-1)) {
	xbar <- updateMean(xbar, x[i+1,], i)
	pca2 <- incRpca.rc(pca2$values, pca2$Ut, pca2$Us, x[i+1,], 
		i, center = xbar)
	# Every q observations (and at the last step), rotate the PC basis
	# and truncate it back to q components
	if (i %% q == 0 || i == n-1) {
		pca2$values <- pca2$values[1:q]
		pca2$Ut <- pca2$Ut %*% pca2$Us[,1:q]
		pca2$Us <- diag(q)
	}
}
})

# Check that the two methods give nearly identical results
# Relative differences in eigenvalues
range(pca1$values/pca2$values-1)
# Cosines of angles between eigenvectors 
abs(colSums(pca1$vectors * pca2$Ut))

## End(Not run)
