iprcomp: Improved Function for Obtaining Principal Components



Description

Calculate principal components when data contains missing values.

Usage

iprcomp(dat, center = TRUE, scale. = FALSE)

Arguments

dat

n by p matrix. Rows are subjects and columns are variables.

center

logical. Indicates whether each variable (column) of dat should be mean-centered, as in prcomp

scale.

logical. Indicates whether each variable (column) of dat should be scaled to have unit variance, as in prcomp

Details

Missing values are first replaced by the median of the corresponding variable, and the function prcomp is then called on the imputed data. This is a very simple solution; users who prefer a different imputation method can apply it themselves before calling prcomp.
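
The two steps described here (median imputation per variable, then prcomp) can be sketched as follows. This is only an illustration under stated assumptions, not the package source: the helper name impute_median_prcomp is hypothetical, and it assumes dat is a numeric matrix in which every column has at least one observed value.

```r
# Hypothetical helper illustrating median imputation followed by prcomp();
# this is a sketch, not the statVisual implementation.
impute_median_prcomp <- function(dat, center = TRUE, scale. = FALSE) {
  dat <- as.matrix(dat)
  for (j in seq_len(ncol(dat))) {
    miss <- is.na(dat[, j])
    if (any(miss)) {
      # replace missing entries with the median of the observed values in column j
      dat[miss, j] <- median(dat[, j], na.rm = TRUE)
    }
  }
  stats::prcomp(dat, center = center, scale. = scale.)
}

# small usage example on simulated data with one missing value
set.seed(1)
m <- matrix(rnorm(40), nrow = 10, ncol = 4)
m[2, 3] <- NA
fit <- impute_median_prcomp(m)
dim(fit$x)
```

Any other imputation method can be substituted for the median step by filling in the missing entries before the call to prcomp.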

Value

A list of 3 elements

sdev

square roots of the eigenvalues

rotation

a matrix whose columns are the eigenvectors, i.e., the projection directions

x

a matrix whose columns are the principal components
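
How the three elements relate to each other can be checked numerically. The sketch below calls prcomp directly on complete data, on the assumption that sdev, rotation, and x returned by iprcomp carry the same meaning as the corresponding prcomp elements; the object names m, fit, ev, and proj are illustrative only.

```r
set.seed(1)
m <- matrix(rnorm(60), nrow = 20, ncol = 3)
fit <- prcomp(m, center = TRUE, scale. = FALSE)

# sdev squared should match the eigenvalues of the sample covariance matrix
ev <- eigen(cov(m))$values
all.equal(fit$sdev^2, ev)

# x should match the centered data projected onto the columns of rotation
proj <- scale(m, center = TRUE, scale = FALSE) %*% fit$rotation
all.equal(unname(proj), unname(fit$x))
```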

Author(s)

Wenfei Zhang <Wenfei.Zhang@sanofi.com>, Weiliang Qiu <Weiliang.Qiu@sanofi.com>, Xuan Lin <Xuan.Lin@sanofi.com>, Donghui Zhang <Donghui.Zhang@sanofi.com>

Examples

library(statVisual)

# generate simulated data
set.seed(1234567)
dat.x = matrix(rnorm(500), nrow = 100, ncol = 5)
dat.y = matrix(rnorm(500, mean = 2), nrow = 100, ncol = 5)
dat = rbind(dat.x, dat.y)
grp = c(rep(0, 100), rep(1, 100))
print(dim(dat))

res = iprcomp(dat, center = TRUE, scale. = FALSE)

# for each row, set one artificial missing value
dat.na=dat
nr=nrow(dat.na)
nc=ncol(dat.na)
for(i in 1:nr)
{
  posi=sample(x=1:nc, size=1)
  dat.na[i,posi]=NA
}

res.na = iprcomp(dat.na, center = TRUE, scale. = FALSE)

##
# pca plot
##
# two panels, one per PCA plot
par(mfrow = c(2, 1))
# original data without missing values
plot(x = res$x[,1], y = res$x[,2], xlab = "PC1", ylab = "PC2", main = "without missing values")
# perturbed data with one NA per row
# the pattern of original data is captured
plot(x = res.na$x[,1], y = res.na$x[,2], xlab = "PC1", ylab = "PC2", main = "with missing values")
par(mfrow = c(1,1))

statVisual documentation built on Feb. 21, 2020, 1:08 a.m.