Computes the rank-K Generalized PCA (GPCA) solution.
X | The
Q | The row generalizing operator, an
R | The column generalizing operator, an
K | The number of GPCA components to compute. The default value is one.
deflation | Algorithm used to calculate the solution. Default is
The Generalized PCA solution maximizes the sample variance of the data
in an inner-product space induced by the row and column generalizing
operators, Q and R, and also finds the best low-rank approximation to
the data as measured by a generalization of the Frobenius norm. Note
that the resulting GPCA factors U and V are orthogonal with respect to
the row and column generalizing operators: U^T Q U = I and
V^T R V = I. Generalized PCA can be interpreted as finding major modes
of variation that are independent of the generalizing operators. Thus,
if Q and R encode noise structures (see laplacian) or noise covariances
(see Exp.cov), then GPCA finds patterns separate from the structure of
the noise.
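The orthogonality constraints can be checked numerically without the package. The following base-R sketch is not the package's code: it assumes Q and R are positive definite, factors them by Cholesky, takes an ordinary SVD of the transformed data, and maps the singular vectors back. The helper make_operator and all variable names are illustrative.

```r
# Base-R sketch (not the package's implementation): GPCA factors from an
# ordinary SVD of the transformed data. Assumes Q and R are positive definite.
set.seed(1)
n <- 6; p <- 4; K <- 2
X <- matrix(rnorm(n * p), n, p)

# Hypothetical helper: exponential-decay operator scaled to operator norm one
make_operator <- function(m, theta) {
  S <- exp(-abs(outer(1:m, 1:m, "-")) / theta)
  S / max(eigen(S, symmetric = TRUE, only.values = TRUE)$values)
}
Q <- make_operator(n, 3)
R <- make_operator(p, 2)

Cq <- chol(Q)                      # Q = t(Cq) %*% Cq
Cr <- chol(R)                      # R = t(Cr) %*% Cr
s  <- svd(Cq %*% X %*% t(Cr))      # SVD of the transformed data

U <- backsolve(Cq, s$u[, 1:K, drop = FALSE])  # left factors
V <- backsolve(Cr, s$v[, 1:K, drop = FALSE])  # right factors
D <- s$d[1:K]

# Both products are the K x K identity up to rounding error
round(t(U) %*% Q %*% U, 8)
round(t(V) %*% R %*% V, 8)
```

The columns of U and V are generally not orthogonal in the usual Euclidean sense; the identity appears only once Q and R are placed inside the inner product.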
The generalizing operators, Q and R, must be positive semi-definite and
have operator norm one. Note that if both are identity matrices, then
GPCA is equivalent to PCA and gpca returns the SVD of X. Smoothers, such
as covariances (see Exp.cov, Exp.simple.cov, Rad.cov, stationary.cov,
cubic.cov, stationary.taper.cov, wendland.cov), and inverse smoothers
(see laplacian) can be used as generalizing operators for data in which
variables are associated with a specific location (e.g. image data and
spatio-temporal data).
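As a rough illustration of the two requirements, the base-R sketch below builds an exponential-decay smoother by hand, of the form exp(-d/theta), as a stand-in for a covariance function such as Exp.cov. It rescales the matrix by its largest eigenvalue and then checks both conditions. The locations and theta are made-up values.

```r
# Sketch: rescale a smoother to operator norm one and check that it is
# positive semi-definite. Locations and theta are illustrative assumptions.
locs  <- seq_len(5)                       # five locations on a line
theta <- 2
R <- exp(-as.matrix(dist(locs)) / theta)  # exponential-decay smoother

ev <- eigen(R, symmetric = TRUE, only.values = TRUE)$values
R  <- R / max(ev)                         # largest eigenvalue becomes one

ev_scaled <- eigen(R, symmetric = TRUE, only.values = TRUE)$values
min(ev_scaled) >= -1e-10                  # positive semi-definite?
abs(max(ev_scaled) - 1) < 1e-10           # operator norm one?
```

For a symmetric positive semi-definite matrix the operator norm equals the largest eigenvalue, which is why dividing by max(ev) is sufficient.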
This function can compute the solution with one of two algorithms. The
deflation = FALSE option computes the eigen-decomposition of a quadratic
form of dimension min(n,p) to find U or V and finds the other factor by
regression. The deflation = TRUE option finds each factor using the
generalized power algorithm and performs deflation to compute multiple
factors. The deflation = FALSE option is generally faster, and
especially so when one dimension is much smaller than the other. The
deflation = TRUE option is faster only if both dimensions are large
(n, p > 5,000) and K is small.
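The deflation route can be sketched in base R as follows. This is an illustration of a generalized power iteration under the assumption that Q and R are positive definite, not the package's actual code. The function gpca_power, its tolerance, and its iteration cap are hypothetical. The GPCA values are compared against a direct SVD of the Cholesky-transformed data.

```r
# Sketch of a generalized power method with deflation (illustrative only).
set.seed(3)
n <- 8; p <- 5; K <- 2
X <- matrix(rnorm(n * p), n, p)

make_operator <- function(m, theta) {     # hypothetical helper
  S <- exp(-abs(outer(1:m, 1:m, "-")) / theta)
  S / max(eigen(S, symmetric = TRUE, only.values = TRUE)$values)
}
Q <- make_operator(n, 3)
R <- make_operator(p, 2)

# Norm induced by a generalizing operator M
op_norm <- function(x, M) sqrt(drop(crossprod(x, M %*% x)))

gpca_power <- function(X, Q, R, K, tol = 1e-12, max_iter = 5000) {
  U <- matrix(0, nrow(X), K); V <- matrix(0, ncol(X), K); D <- numeric(K)
  Xk <- X
  for (k in seq_len(K)) {
    v <- rnorm(ncol(X))
    d_old <- 0
    for (it in seq_len(max_iter)) {
      u <- Xk %*% R %*% v;         u <- u / op_norm(u, Q)  # update left factor
      v <- crossprod(Xk, Q %*% u); v <- v / op_norm(v, R)  # update right factor
      d <- drop(crossprod(u, Q %*% Xk %*% R %*% v))
      if (abs(d - d_old) < tol) break
      d_old <- d
    }
    U[, k] <- u; V[, k] <- v; D[k] <- d
    Xk <- Xk - d * u %*% t(v)      # deflate before the next factor
  }
  list(U = U, V = V, D = D)
}

fit <- gpca_power(X, Q, R, K)
ref <- svd(chol(Q) %*% X %*% t(chol(R)))$d[1:K]   # direct decomposition
cbind(power = fit$D, direct = ref)
```

Each factor costs only matrix-vector products per iteration, which is consistent with the note above that this route pays off only when both dimensions are large and K is small.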
U | The left GPCA factors, an
V | The right GPCA factors, an
D | A vector of the
cumm.prop.var | Cumulative proportion of variance explained by the first
prop.var | Proportion of variance explained by each component.
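One plausible way the two variance quantities relate is sketched below in base R. It assumes that total variance is measured in the norm induced by Q and R, that is tr(Q X R X^T), which equals the sum of all squared GPCA values. Whether gpca normalizes in exactly this way is an assumption, and the variable names are illustrative.

```r
# Sketch: proportion of variance under the assumption that total variance
# is tr(Q X R X^T). Not taken from the package source.
set.seed(4)
n <- 6; p <- 4
X <- matrix(rnorm(n * p), n, p)

make_operator <- function(m, theta) {     # hypothetical helper
  S <- exp(-abs(outer(1:m, 1:m, "-")) / theta)
  S / max(eigen(S, symmetric = TRUE, only.values = TRUE)$values)
}
Q <- make_operator(n, 3)
R <- make_operator(p, 2)

total_var <- sum(diag(Q %*% X %*% R %*% t(X)))    # tr(Q X R X^T)
d_all     <- svd(chol(Q) %*% X %*% t(chol(R)))$d  # all GPCA values

prop_var      <- d_all^2 / total_var   # share explained by each component
cumm_prop_var <- cumsum(prop_var)      # running total over components
rbind(prop_var, cumm_prop_var)
```

Under this assumption the cumulative proportion reaches one once all min(n,p) components are included.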
Frederick Campbell
Genevera I. Allen, Logan Grosenick, and Jonathan Taylor, "A generalized least squares matrix decomposition", arXiv:1102.3074, 2011.
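library(fields)  # provides ozone2 and Exp.cov
library(sGPCA)   # assumed to be the package that provides gpca()
data(ozone2)
# Keep only the stations (columns) with no missing values
ind = which(apply(is.na(ozone2$y),2,sum)==0)
X = ozone2$y[,ind]
n = nrow(X)
p = ncol(X)
#Generalizing Operators - Spatio-Temporal Smoothers
# Spatial smoother over station locations, scaled to operator norm one
R = Exp.cov(ozone2$lon.lat[ind,],theta=5)
er = eigen(R,only.values=TRUE)
R = R/max(er$values)
# Temporal smoother over the n time points, scaled to operator norm one
Q = Exp.cov(c(1:n),c(1:n),theta=3)
eq = eigen(Q,only.values=TRUE)
Q = Q/max(eq$values)
#SVD
fitsvd = gpca(X,diag(n),diag(p),1)
#GPCA
fitgpca = gpca(X,Q,R,1)
fitgpca$prop.var #proportion of variance explained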