Computes the rank K Generalized PCA (GPCA) solution.
X |
The |
Q |
The row generalizing operator, an |
R |
The column generalizing operator, an |
K |
The number of GPCA components to compute. The default value is one. |
deflation |
Algorithm used to calculate the solution. Default is
|
The Generalized PCA solution maximizes the sample variance of the data
in an inner-product space induced by the row and column generalizing
operators, Q and R, and also finds the best low-rank
approximation to the data as
measured by a generalization of the Frobenius norm. Note that the
resulting GPCA factors U and V are orthogonal with
respect to the row and column generalizing operators: U^T Q U = I
and V^T R V = I. Generalized PCA can be interpreted as finding
major modes of variation that are independent from the generalizing
operators. Thus, if Q and R encode noise structures
(see laplacian) or noise covariances (see Exp.cov),
then GPCA finds patterns separate from the structure of the noise.
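The orthogonality constraints above can be checked numerically. The following is a minimal base-R sketch, not the package's implementation: it assumes Q and R are positive definite so that symmetric inverse square roots exist, and the helpers `exp_kernel` and `sym_power` are hypothetical names introduced only for this illustration.

```r
set.seed(1)
n <- 6
p <- 4
X <- matrix(rnorm(n * p), n, p)

# Hypothetical helper: exponential kernel on a 1-D grid, scaled so that
# its largest eigenvalue (operator norm) is one.
exp_kernel <- function(m, theta) {
  K <- exp(-abs(outer(seq_len(m), seq_len(m), "-")) / theta)
  K / max(eigen(K, symmetric = TRUE, only.values = TRUE)$values)
}
Q <- exp_kernel(n, 3)
R <- exp_kernel(p, 2)

# Hypothetical helper: symmetric matrix power via the eigen-decomposition.
sym_power <- function(A, pow) {
  e <- eigen(A, symmetric = TRUE)
  e$vectors %*% diag(e$values^pow) %*% t(e$vectors)
}

# SVD of the transformed data Q^{1/2} X R^{1/2}, then map the factors back.
s <- svd(sym_power(Q, 0.5) %*% X %*% sym_power(R, 0.5))
U <- sym_power(Q, -0.5) %*% s$u
V <- sym_power(R, -0.5) %*% s$v
D <- s$d

# Both deviations from the identity should be numerically zero.
max(abs(t(U) %*% Q %*% U - diag(p)))
max(abs(t(V) %*% R %*% V - diag(p)))
```

With all min(n,p) components retained, U diag(D) V^T reproduces X in this sketch; keeping only the first K columns gives the rank-K approximation.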
The generalizing operators, Q and R, must be positive
semi-definite and have operator norm one. Note that if these are the
identity matrix, then GPCA is equivalent to PCA and gpca returns
the SVD of X. Smoothers, such as covariances (see
Exp.cov, Exp.simple.cov, Rad.cov,
stationary.cov, cubic.cov, stationary.taper.cov,
wendland.cov), and inverse smoothers (see laplacian),
can be used as generalizing operators for data in which variables are associated
with a specific location (e.g., image data and spatio-temporal data).
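The two requirements on the operators (positive semi-definiteness and operator norm one) can be checked and enforced before fitting. A minimal sketch in base R follows; `scale_operator` is a hypothetical helper, not a function of the package, and the chain-graph Laplacian is built by hand purely as an illustrative inverse smoother.

```r
# Hypothetical helper (not part of the package): validate an operator and
# rescale it so that its largest eigenvalue is one.
scale_operator <- function(A, tol = 1e-10) {
  stopifnot(is.matrix(A), nrow(A) == ncol(A), isSymmetric(A))
  ev <- eigen(A, symmetric = TRUE, only.values = TRUE)$values
  stopifnot(max(ev) > 0)
  if (min(ev) < -tol * max(abs(ev))) {
    stop("operator is not positive semi-definite")
  }
  A / max(ev)
}

# Illustrative inverse smoother: Laplacian of a chain graph on 5 nodes,
# constructed directly here rather than with the package's laplacian().
m <- 5
A <- matrix(0, m, m)
A[abs(row(A) - col(A)) == 1] <- 1   # adjacency of neighbouring nodes
L <- diag(rowSums(A)) - A
Q <- scale_operator(L)

# Eigenvalues should lie in [0, 1] and the spectral norm should be one.
range(eigen(Q, symmetric = TRUE, only.values = TRUE)$values)
norm(Q, type = "2")
```

The identity matrix already satisfies both conditions, so `scale_operator(diag(n))` returns it unchanged, which corresponds to the ordinary PCA case.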
This function has the option of using one of two algorithms to compute
the solution. The deflation = FALSE option computes the
eigen-decomposition
of a quadratic form of dimension min(n,p) to find U
or V and finds the other factor by regression. The
deflation = TRUE
option finds each factor using the generalized power algorithm and
performs deflation to compute multiple factors. The deflation = FALSE
option is generally faster, especially when one dimension is much
smaller than the other. The deflation = TRUE option is faster only
if both dimensions are large (n, p > 5,000) and K
is small.
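To make the two strategies concrete, here is a rough base-R sketch of each. It is not the package's code: it assumes R is positive definite (so a Cholesky factor exists), uses a fixed iteration count instead of a convergence test, and the names `kern`, `gpower`, and the toy data are hypothetical.

```r
set.seed(2)
n <- 8
p <- 5
K <- 2

# Toy data: two smooth, well-separated components plus a little noise.
gi <- seq_len(n) / (n + 1)
gj <- seq_len(p) / (p + 1)
X <- 10 * outer(sin(pi * gi), sin(pi * gj)) +
  4 * outer(sin(2 * pi * gi), sin(2 * pi * gj)) +
  0.05 * matrix(rnorm(n * p), n, p)

# Hypothetical helper: exponential kernel scaled to operator norm one.
kern <- function(m, theta) {
  S <- exp(-abs(outer(seq_len(m), seq_len(m), "-")) / theta)
  S / max(eigen(S, symmetric = TRUE, only.values = TRUE)$values)
}
Q <- kern(n, 3)
R <- kern(p, 2)

# Strategy 1: eigen-decomposition on the smaller side (p < n here),
# then recover U by regression.
Rc <- chol(R)                               # upper triangular, t(Rc) %*% Rc == R
M <- Rc %*% t(X) %*% Q %*% X %*% t(Rc)      # symmetric p x p quadratic form
e <- eigen((M + t(M)) / 2, symmetric = TRUE)
D_eig <- sqrt(e$values[1:K])
V_eig <- backsolve(Rc, e$vectors[, 1:K, drop = FALSE])
U_eig <- X %*% R %*% V_eig %*% diag(1 / D_eig, K)

# Strategy 2: generalized power iterations with deflation.
gpower <- function(X, Q, R, K, iters = 500) {
  U <- matrix(0, nrow(X), K)
  V <- matrix(0, ncol(X), K)
  D <- numeric(K)
  Xhat <- X
  for (k in seq_len(K)) {
    v <- rnorm(ncol(X))                     # random starting vector
    for (i in seq_len(iters)) {
      u <- Xhat %*% R %*% v
      u <- u / sqrt(drop(t(u) %*% Q %*% u)) # normalize in the Q-norm
      v <- t(Xhat) %*% Q %*% u
      v <- v / sqrt(drop(t(v) %*% R %*% v)) # normalize in the R-norm
    }
    D[k] <- drop(t(u) %*% Q %*% Xhat %*% R %*% v)
    U[, k] <- u
    V[, k] <- v
    Xhat <- Xhat - D[k] * u %*% t(v)        # deflate the fitted component
  }
  list(U = U, V = V, D = D)
}
fit_pow <- gpower(X, Q, R, K)

# The two strategies should agree on the leading K values.
cbind(eigen = D_eig, power = fit_pow$D)
```

The first strategy costs one decomposition of a min(n,p)-dimensional matrix regardless of K, while the second touches the full data matrix on every iteration but never forms that matrix, which is consistent with the guidance above on when each option is faster.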
U |
The left GPCA factors, an |
V |
The right GPCA factors, an |
D |
A vector of the |
cumm.prop.var |
Cumulative proportion of variance explained by
the first |
prop.var |
Proportion of variance explained by each component. |
Frederick Campbell
Genevera I. Allen, Logan Grosenick, and Jonathan Taylor, "A generalized least squares matrix decomposition", arXiv:1102.3074, 2011.
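# Assumes the sGPCA package (gpca) and the fields package
# (ozone2 data, Exp.cov) are installed
library(sGPCA)
library(fields)
data(ozone2)
# Keep only the stations (columns) with no missing observations
ind = which(apply(is.na(ozone2$y),2,sum)==0)
X = ozone2$y[,ind]
n = nrow(X)
p = ncol(X)
#Generalizing Operators - Spatio-Temporal Smoothers
# Column operator: spatial covariance between stations, scaled to operator norm one
R = Exp.cov(ozone2$lon.lat[ind,],theta=5)
er = eigen(R,only.values=TRUE)
R = R/max(er$values)
# Row operator: temporal covariance between observation times, scaled the same way
Q = Exp.cov(c(1:n),c(1:n),theta=3)
eq = eigen(Q,only.values=TRUE)
Q = Q/max(eq$values)
#SVD (identity operators reduce GPCA to ordinary PCA)
fitsvd = gpca(X,diag(n),diag(p),1)
#GPCA
fitgpca = gpca(X,Q,R,1)
fitgpca$prop.var #proportion of variance explained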