Generalized Principal Component Analysis
Description
Computes the rank-K Generalized PCA (GPCA) solution.
Usage
Arguments
X 
The 
Q 
The row generalizing operator, an 
R 
The column generalizing operator, an 
K 
The number of GPCA components to compute. The default value is one. 
deflation 
Algorithm used to calculate the solution. Default is

Details
The Generalized PCA solution maximizes the sample variance of the data
in an inner-product space induced by the row and column generalizing
operators, Q and R, and also finds the best low-rank approximation to
the data as measured by a generalization of the Frobenius norm. Note
that the resulting GPCA factors U and V are orthogonal with respect to
the row and column generalizing operators: U^T Q U = I and
V^T R V = I. Generalized PCA can be interpreted as finding major modes
of variation that are independent from the generalizing operators.
Thus, if Q and R encode noise structures (see laplacian) or noise
covariances (see Exp.cov), then GPCA finds patterns separate from the
structure of the noise.
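The optimization problem described in this paragraph can be written out as below. This is a sketch in the notation of the Allen, Grosenick, and Taylor reference; the Q,R-norm symbol and the non-negativity condition on D are taken from that paper and are not stated on this page.

```latex
\[
\min_{U,\,D,\,V} \; \bigl\| X - U D V^{T} \bigr\|_{Q,R}^{2}
\quad \text{subject to} \quad
U^{T} Q U = I_K, \quad V^{T} R V = I_K, \quad \operatorname{diag}(D) \ge 0,
\]
\[
\text{where} \quad \| X \|_{Q,R} = \sqrt{\operatorname{tr}\!\bigl( Q \, X \, R \, X^{T} \bigr)} .
\]
```

When Q and R are identity matrices, the Q,R-norm reduces to the Frobenius norm and the constraints reduce to ordinary orthonormality, which recovers the truncated SVD.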
The generalizing operators, Q and R, must be positive semidefinite and
have operator norm one. Note that if these are the identity matrix,
then GPCA is equivalent to PCA and gpca returns the SVD of X.
Smoothers, such as covariances (see Exp.cov, Exp.simple.cov, Rad.cov,
stationary.cov, cubic.cov, stationary.taper.cov, wendland.cov), and
inverse smoothers (see laplacian) can be used as generalizing operators
for data in which variables are associated with a specific location
(e.g. image data and spatiotemporal data).
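The two requirements on the operators (positive semidefinite, operator norm one) can be checked and enforced with base R alone. The following is a minimal sketch: `normalize_operator` is a hypothetical helper that is not part of this package, and the exponential kernel and its range of 3 are illustrative values only.

```r
# Hypothetical helper (not part of the package): verify that a symmetric
# matrix is positive semidefinite and rescale it to operator norm one.
normalize_operator <- function(M, tol = 1e-8) {
  ev <- eigen(M, symmetric = TRUE, only.values = TRUE)$values
  if (min(ev) < -tol * max(abs(ev))) {
    stop("matrix is not positive semidefinite")
  }
  # For a PSD matrix the operator norm equals the largest eigenvalue.
  M / max(ev)
}

# Toy exponential kernel on 10 equally spaced time points.
tt <- 1:10
K  <- exp(-as.matrix(dist(tt)) / 3)
Q  <- normalize_operator(K)

# Largest eigenvalue of the rescaled operator.
top_eigenvalue <- max(eigen(Q, symmetric = TRUE, only.values = TRUE)$values)
```

Dividing by the largest eigenvalue is the same normalization that the Examples section applies to the output of Exp.cov.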
This function has the option of using one of two algorithms to compute
the solution. The deflation = FALSE option computes the
eigendecomposition of a quadratic form of dimension min(n, p) to find
U or V and finds the other factor by regression. The deflation = TRUE
option finds each factor using the generalized power algorithm and
performs deflation to compute multiple factors. The deflation = FALSE
option is generally faster, especially when one dimension is much
smaller than the other. The deflation = TRUE option is faster only if
both dimensions are large (n, p > 5,000) and K is small.
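The idea behind the deflation = TRUE option can be illustrated in base R. This is a rough sketch of the generalized power iteration as described in the Allen, Grosenick, and Taylor reference, not the package's actual implementation; `gnorm` and `gpca_power` are hypothetical names, and the starting vector, tolerance, and iteration cap are arbitrary choices.

```r
# Norm of a vector induced by a PSD operator M: sqrt(x' M x).
gnorm <- function(x, M) sqrt(drop(crossprod(x, M %*% x)))

# Hypothetical sketch: K GPCA factors by power iterations plus deflation.
gpca_power <- function(X, Q, R, K = 1, maxit = 1000, tol = 1e-12) {
  n <- nrow(X); p <- ncol(X)
  U <- matrix(0, n, K); V <- matrix(0, p, K); D <- numeric(K)
  Xk <- X
  for (k in seq_len(K)) {
    # Start from the ordinary leading right singular vector (arbitrary choice).
    v <- svd(Xk, nu = 0, nv = 1)$v[, 1]
    v <- v / gnorm(v, R)
    for (it in seq_len(maxit)) {
      u <- Xk %*% R %*% v                 # update the left factor
      u <- u / gnorm(u, Q)                # normalize so that u' Q u = 1
      v_new <- crossprod(Xk, Q %*% u)     # update the right factor
      v_new <- v_new / gnorm(v_new, R)    # normalize so that v' R v = 1
      converged <- sum((v_new - v)^2) < tol
      v <- v_new
      if (converged) break
    }
    D[k] <- drop(crossprod(u, Q %*% Xk %*% R %*% v))
    U[, k] <- u
    V[, k] <- v
    Xk <- Xk - D[k] * u %*% t(v)          # deflate before the next factor
  }
  list(U = U, V = V, D = D)
}

# With identity operators the result should agree with the ordinary SVD.
set.seed(1)
X   <- matrix(rnorm(30), 6, 5)
fit <- gpca_power(X, diag(6), diag(5), K = 2)
```

Each pass costs only matrix-vector products, which is consistent with the statement above that this route pays off when both dimensions are large and only a few factors are needed.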
Value
U 
The left GPCA factors, an 
V 
The right GPCA factors, an 
D 
A vector of the 
cumm.prop.var 
Cumulative proportion of variance explained by
the first 
prop.var 
Proportion of variance explained by each component. 
Author(s)
Frederick Campbell
References
Genevera I. Allen, Logan Grosenick, and Jonathan Taylor, "A generalized least squares matrix decomposition", arXiv:1102.3074, 2011.
See Also
laplacian, Exp.cov, sgpca
Examples
data(ozone2)
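# Keep only the stations (columns) with no missing observations
ind = which(apply(is.na(ozone2$y), 2, sum) == 0)
X = ozone2$y[, ind]
n = nrow(X)
p = ncol(X)
# Generalizing operators: spatio-temporal smoothers
# R: spatial covariance across stations, scaled to operator norm one
R = Exp.cov(ozone2$lon.lat[ind, ], theta = 5)
er = eigen(R, only.values = TRUE)
R = R / max(er$values)
# Q: temporal covariance across time points, scaled to operator norm one
Q = Exp.cov(c(1:n), c(1:n), theta = 3)
eq = eigen(Q, only.values = TRUE)
Q = Q / max(eq$values)
# SVD: identity operators reduce GPCA to ordinary PCA
fitsvd = gpca(X, diag(n), diag(p), 1)
# GPCA
fitgpca = gpca(X, Q, R, 1)
fitgpca$prop.var  # proportion of variance explained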
