This function conducts supervised sparse and functional principal
component analysis by fitting the SupSVD model
X = UV' + E
U = YB + F
where X is an observed primary data matrix (to be decomposed), U is a latent score
matrix, V is a loading matrix, E is measurement noise, Y is an observed
auxiliary supervision matrix, B is a coefficient matrix, and F is a
random effect matrix.
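To make the roles of the matrices concrete, the two model equations can be used to simulate a small data set. This is a minimal base R sketch; the dimensions, noise levels, and the particular B and V are illustrative assumptions, not values taken from the package or the paper.

```r
# Simulate data from the SupSVD model (all sizes and values are illustrative).
set.seed(1)
n <- 50; p <- 20; q <- 3; r <- 2

# Auxiliary supervision matrix Y (n x q), column centered
Y <- scale(matrix(rnorm(n * q), n, q), center = TRUE, scale = FALSE)

# Coefficient matrix B (q x r): only the first two auxiliary variables matter
B <- matrix(c(2, 0, 0,
              0, -1, 0), q, r)

# Random effect matrix (n x r); named Fm because F is shorthand for FALSE in R
Fm <- matrix(rnorm(n * r, sd = 0.5), n, r)

# Latent scores: U = YB + F
U <- Y %*% B + Fm

# Loading matrix V (p x r) with orthonormal columns, taken from a QR decomposition
V <- qr.Q(qr(matrix(rnorm(p * r), p, r)))

# Measurement noise E (n x p) and observed primary data: X = UV' + E
E <- matrix(rnorm(n * p, sd = 0.1), n, p)
X <- U %*% t(V) + E

dim(X)  # n x p
```

Only X and Y would be observed in practice; U, V, B, and the two noise terms are what the model tries to recover.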
The method decomposes the primary data matrix X into low-rank
components while accounting for three features: 1)
potential supervision from any auxiliary data Y measured on the same
samples; 2) potential smoothness of the loading vectors V (for functional
data); 3) sparsity in the supervision coefficients B and the loadings V (for variable
selection).
It is a general dimension reduction method that subsumes
PCA, sparse PCA, functional PCA, supervised PCA, etc. as special cases.
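As a point of reference for the simplest of these special cases, ordinary PCA of a column-centered matrix is a truncated SVD. The sketch below uses base R only and is not the package's fitting routine; which argument settings reproduce plain PCA exactly is not stated on this page, so treat it as a baseline for comparison rather than as an equivalent call.

```r
# Plain PCA via truncated SVD (baseline only, illustrative sizes).
set.seed(2)
n <- 40; p <- 10; r <- 2

# Column-centered data matrix
X <- scale(matrix(rnorm(n * p), n, p), center = TRUE, scale = FALSE)

# Keep the leading r singular vectors
s <- svd(X, nu = r, nv = r)

V_pca <- s$v                          # p x r loadings, orthonormal columns
U_pca <- s$u %*% diag(s$d[1:r], r)    # n x r scores
X_hat <- U_pca %*% t(V_pca)           # rank-r approximation of X
```

Here the loadings are exactly orthonormal and the scores equal `X %*% V_pca`, which is the behaviour the sparse and smooth variants relax.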
For more details, see the 2016 JCGS paper "Supervised sparse and
functional principal component analysis" by Gen Li, Haipeng Shen, and
Jianhua Z. Huang.
Y |
n*q (column centered) auxiliary data matrix, rows are samples and columns are variables |
X |
n*p (column centered) primary data matrix, which we want to decompose. Rows are samples (matched with Y) and columns are variables |
r |
positive scalar, prespecified rank (r should be smaller than n and p) |
ind_lam |
0 or 1 (default=1, sparse loading), sparsity index for loadings |
ind_alp |
0 or 1 (default=1, smooth loading), smoothness index for loadings |
ind_gam |
0 or 1 (default=1, sparse coefficient), sparsity index for supervision coefficients. Note: if ind_gam is set to 0, Y must have q<n to avoid overfitting; if ind_gam is set to 1, high dimensional supervision Y can be handled |
ind_Omg |
p*p symmetric positive semi-definite matrix for smoothness penalty (default is for evenly spaced data). Note: only change this if you have unevenly spaced functional data X |
Omega |
?? |
max_niter |
scalar (default=1E3), max number of overall iterations |
convg_thres |
positive scalar (default=1E-6), overall convergence threshold |
vmax_niter |
scalar (default=1E2), max number of iterations for estimating each loading vector |
vconvg_thres |
positive scalar (default=1E-4), convergence threshold for the proximal gradient descent algorithm for estimating each loading vector |
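The input requirements above (column centering, matched rows, rank smaller than n and p) can be checked before fitting. The following is a minimal base R sketch with made-up data; `fit_args` is a hypothetical helper list whose element names are taken from the argument table, and the two smoothness-penalty arguments are left out so that they stay at their defaults.

```r
# Prepare inputs that satisfy the documented requirements (illustrative data).
set.seed(3)
n <- 30; p <- 12; q <- 4
X_raw <- matrix(rnorm(n * p, mean = 5), n, p)
Y_raw <- matrix(rnorm(n * q, mean = -2), n, q)

# Column center both matrices; scale = FALSE leaves the variances untouched
X <- scale(X_raw, center = TRUE, scale = FALSE)
Y <- scale(Y_raw, center = TRUE, scale = FALSE)

r <- 2
stopifnot(nrow(X) == nrow(Y),         # rows are matched samples
          r > 0, r < min(dim(X)))     # r smaller than both n and p

# Arguments collected by name, using the defaults listed in the table
fit_args <- list(Y = Y, X = X, r = r,
                 ind_lam = 1, ind_alp = 1, ind_gam = 1,
                 max_niter = 1e3, convg_thres = 1e-6,
                 vmax_niter = 1e2, vconvg_thres = 1e-4)
```

A list like this could be passed to the fitting function with `do.call()`. If ind_gam were set to 0 instead, the note in the table would also require q < n.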
list with components
B: |
q*r coefficient matrix of Y on the scores of X; may be sparse if ind_gam=1 |
V: |
p*r loading matrix of X. Each column has norm 1, but the columns are not strictly orthogonal because of sparsity and smoothness. If ind_lam=1, V is sparse; if ind_alp=1, each column of V is smooth |
U: |
n*r score matrix of X, conditional expectation of random scores, no strict orthogonality |
se2: |
scalar, variance of measurement error in the primary data X |
Sf: |
r*r diagonal covariance matrix, for random effects (see paper) |
Note: For dimension reduction purposes, U and V are the most important outputs, as in PCA or SVD.
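The returned components can be combined in the same way as PCA or SVD output. The sketch below builds a stand-in list named `fit` from a plain SVD and a least squares fit, purely so that the code runs without the package; it is an assumption for illustration and not what the function returns numerically. The se2 and Sf components are omitted from the stand-in.

```r
# Working with U, V and B (stand-in object, illustrative data).
set.seed(4)
n <- 30; p <- 12; q <- 3; r <- 2
X <- scale(matrix(rnorm(n * p), n, p), center = TRUE, scale = FALSE)
Y <- scale(matrix(rnorm(n * q), n, q), center = TRUE, scale = FALSE)

# Stand-in for a fitted object: NOT produced by the function documented here
s  <- svd(X, nu = r, nv = r)
U0 <- s$u %*% diag(s$d[1:r], r)
fit <- list(
  V = s$v,                                      # p x r loadings
  U = U0,                                       # n x r scores
  B = solve(crossprod(Y), crossprod(Y, U0))     # q x r least squares coefficients
)

X_hat     <- fit$U %*% t(fit$V)       # rank-r approximation of X
col_norms <- sqrt(colSums(fit$V^2))   # each column of V should have norm 1
U_sup     <- Y %*% fit$B              # part of the scores explained by Y
VtV       <- crossprod(fit$V)         # off-diagonals measure departure from orthogonality
```

With a real fit, the off-diagonal entries of `VtV` need not be zero, since sparsity and smoothness break strict orthogonality.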