SupPCA: Fit a supervised singular value decomposition (SupSVD) model



Description

This function fits the SupSVD model X = UV' + E, U = YB + F, where X is an observed primary data matrix (to be decomposed), U is a latent score matrix, V is a loading matrix, E is measurement noise, Y is an observed auxiliary supervision matrix, B is a coefficient matrix, and F is a random effect matrix.
The model generalizes principal component analysis (PCA) and singular value decomposition (SVD). It decomposes the primary data matrix X into low-rank components while taking into account potential supervision from auxiliary data Y measured on the same samples.
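
For orientation, the two model equations can be combined into a single expression for X. The display below is only a restatement of the model above, with dimensions taken from the Arguments and Value sections (the n*r size of F is inferred from U = YB + F):

```latex
\[
\begin{aligned}
X &= U V^{\top} + E, \qquad U = Y B + F \\
\Longrightarrow \quad X &= Y B V^{\top} + F V^{\top} + E
\end{aligned}
\]
% X: n x p,  Y: n x q,  B: q x r,  V: p x r,  U and F: n x r,  E: n x p
```

Dropping the supervised term YBV' leaves a PCA-type model, and dropping the random effect term FV' leaves a reduced rank regression (RRR) type model; these correspond to Cases 2 and 3 in the Examples.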

For more details, see the 2016 JMVA paper "Supervised singular value decomposition and its asymptotic properties" by Gen Li, Dan Yang, Andrew B. Nobel, and Haipeng Shen.

Usage

SupPCA(Y, X, r)

Arguments

Y

n*q (column centered) auxiliary data matrix; rows are samples and columns are variables (the columns must be linearly independent to avoid overfitting)

X

n*p (column centered) primary data matrix to be decomposed; rows are samples (matched with Y) and columns are variables

r

positive scalar, prespecified rank (r < min(n,p))
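
The requirements above (column centering, matched rows, linearly independent columns of Y, and r < min(n,p)) can be checked before fitting. The following is a minimal sketch in base R; `prepare_inputs` is a hypothetical helper written for illustration and is not part of SuperPCA.

```r
# Hypothetical helper (not part of SuperPCA): centers Y and X and checks
# the conditions listed in the Arguments section.
prepare_inputs <- function(Y, X, r) {
  Y <- as.matrix(Y)
  X <- as.matrix(X)
  stopifnot(nrow(Y) == nrow(X))                  # rows (samples) must match
  stopifnot(r >= 1, r < min(nrow(X), ncol(X)))   # rank constraint r < min(n, p)
  Yc <- scale(Y, center = TRUE, scale = FALSE)   # column centering only
  Xc <- scale(X, center = TRUE, scale = FALSE)
  # qr()$rank gives the numerical column rank; it must equal q = ncol(Y)
  if (qr(Yc)$rank < ncol(Yc)) {
    stop("columns of Y are linearly dependent after centering")
  }
  list(Y = Yc, X = Xc, r = r)
}

set.seed(1)
Y <- matrix(rnorm(100 * 4), nrow = 100)
X <- matrix(rnorm(100 * 68), nrow = 100)
inp <- prepare_inputs(Y, X, r = 2)
# With SuperPCA installed, the fit would then be:
# fit <- SuperPCA::SupPCA(inp$Y, inp$X, inp$r)
```

Note that a constant column in Y (for example an intercept) becomes all zeros after centering, so this check rejects it.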

Value

list with components

B:

q*r coefficient matrix of Y on the scores of X, may be sparse if gamma=1

V:

p*r loading matrix of X, with orthonormal columns

U:

n*r score matrix of X, conditional expectation of random scores

se2:

scalar, variance of measurement error in the primary data X

Sf:

r*r diagonal covariance matrix, for random effects (see paper)

Note: U and V are the most important outputs for dimension reduction purposes, as in PCA or SVD.
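
As a rough illustration of how U and V might be used downstream, the sketch below forms the rank-r approximation UV' and checks that V has orthonormal columns. To stay runnable without SuperPCA, it uses a plain truncated SVD as a stand-in for the fitted object; that stand-in is an assumption for illustration only and will not equal the SupPCA output.

```r
# Stand-in for a fitted object: a plain rank-r SVD, used only so that this
# sketch runs without the SuperPCA package. With the package installed,
# replace it with: fit <- SuperPCA::SupPCA(Yc, Xc, r)
set.seed(1)
Xc <- scale(matrix(rnorm(100 * 68), 100, 68), center = TRUE, scale = FALSE)
r <- 2
s <- svd(Xc, nu = r, nv = r)
fit <- list(U = s$u %*% diag(s$d[1:r], r, r),  # n x r scores
            V = s$v)                           # p x r loadings

# Loadings should have orthonormal columns: V'V = I_r
orth_err <- max(abs(crossprod(fit$V) - diag(r)))

# Rank-r approximation of X and the share of total variation it captures
Xhat <- tcrossprod(fit$U, fit$V)               # n x p, equals U %*% t(V)
explained <- 1 - sum((Xc - Xhat)^2) / sum(Xc^2)
```

The same two checks apply to any list with n*r `U` and p*r `V` components, which is the shape described in the Value section.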

Examples

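library(SuperPCA)  # provides SupPCA(); MASS must also be installed for mvrnorm()

# Simulate the building blocks of the model (n = 100, q = 4, p = 68, r = 2)
r <- 2
Y <- matrix(rnorm(400, 0, 1), nrow = 100)        # auxiliary data
B <- c(-1, 1, -sqrt(3/2), -1)
B <- cbind(B, c(1, 1, -1, sqrt(3/2)))            # coefficient matrix
V <- matrix(rnorm(68 * 2), 68, 2)                # loading matrix
Fmatrix <- matrix(MASS::mvrnorm(n = 100, rep(0, 2), matrix(c(9, 0, 0, 4), 2, 2)), 100, 2)  # random effects
E <- matrix(rnorm(100 * 68, 0, 3), 100, 68)      # measurement noise
Yc <- scale(Y, center = TRUE, scale = FALSE)     # column-centered Y

# Case 1 (SupSVD): X = YBV^T + FV^T + E
X1 <- Y %*% tcrossprod(B, V) + tcrossprod(Fmatrix, V) + E
X1c <- scale(X1, center = TRUE, scale = FALSE)
SupPCA(Yc, X1c, r)

# Case 2 (PCA): X = FV^T + E
X2 <- tcrossprod(Fmatrix, V) + E
X2c <- scale(X2, center = TRUE, scale = FALSE)
SupPCA(Yc, X2c, r)

# Case 3 (RRR): X = YBV^T + E
X3 <- Y %*% tcrossprod(B, V) + E
X3c <- scale(X3, center = TRUE, scale = FALSE)
SupPCA(Yc, X3c, r)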

SuperPCA documentation built on July 26, 2021, 5:06 p.m.
