# iprcomp: Improved Function for Obtaining Principal Components

In statVisual: Statistical Visualization Tools

## Description

Calculates principal components when the data contain missing values.

## Usage

```
iprcomp(dat, center = TRUE, scale. = FALSE)
```

## Arguments

| Argument | Description |
| --- | --- |
| `dat` | n by p matrix. Rows are subjects and columns are variables. |
| `center` | logical. Indicates whether each variable (column) of `dat` should be mean-centered. |
| `scale.` | logical. Indicates whether each variable (column) of `dat` should be scaled to have variance one. |

## Details

Missing values are first replaced by the median of the corresponding variable, and the function `prcomp` is then called on the imputed data. This is a very simple solution. Users who prefer a different imputation method can apply it themselves and then call `prcomp` directly.
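The two steps described above can be sketched in a few lines of base R. This is only an illustration of the idea, not the package's actual source: the helper name `median_impute_prcomp` is hypothetical, and it assumes that `center` and `scale.` are passed straight through to `prcomp`.

```r
# Hypothetical helper (not part of statVisual): replace each missing value
# by the median of its column, then run prcomp on the completed matrix.
median_impute_prcomp <- function(dat, center = TRUE, scale. = FALSE) {
  dat <- as.matrix(dat)
  for (j in seq_len(ncol(dat))) {
    miss <- is.na(dat[, j])
    if (any(miss)) {
      # median of the observed values of variable j
      dat[miss, j] <- median(dat[, j], na.rm = TRUE)
    }
  }
  # Assumption: center and scale. are forwarded unchanged to prcomp
  prcomp(dat, center = center, scale. = scale.)
}

# Small usage example with two artificial missing values
set.seed(1)
m <- matrix(rnorm(40), nrow = 10, ncol = 4)
m[2, 3] <- NA
m[7, 1] <- NA
fit <- median_impute_prcomp(m)
dim(fit$x)
```

A column made up entirely of `NA` would still contain `NA` after this step, so such variables would need to be dropped or handled separately before calling `prcomp`.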

## Value

A list with 3 elements:

| Element | Description |
| --- | --- |
| `sdev` | square roots of the eigenvalues |
| `rotation` | a matrix whose columns are the eigenvectors, i.e., the projection directions |
| `x` | a matrix whose columns are the principal components |
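How the three elements relate to each other can be checked numerically. The sketch below uses base R's `prcomp` on complete data and assumes that the elements returned by `iprcomp` carry the same meaning as those of `prcomp`. The variable names are illustrative.

```r
set.seed(42)
m <- matrix(rnorm(60), nrow = 20, ncol = 3)
fit <- prcomp(m, center = TRUE, scale. = FALSE)

# sdev^2 matches the eigenvalues of the sample covariance matrix
ev <- eigen(cov(m), symmetric = TRUE)$values
all.equal(fit$sdev^2, ev)

# x is the column-centered data projected onto the columns of rotation
scores <- scale(m, center = TRUE, scale = FALSE) %*% fit$rotation
all.equal(unname(scores), unname(fit$x))
```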

## Author(s)

Wenfei Zhang <Wenfei.Zhang@sanofi.com>, Weiliang Qiu <Weiliang.Qiu@sanofi.com>, Xuan Lin <Xuan.Lin@sanofi.com>, Donghui Zhang <Donghui.Zhang@sanofi.com>

## Examples

```
library(statVisual)

# generate simulated data
set.seed(1234567)
dat.x = matrix(rnorm(500), nrow = 100, ncol = 5)
dat.y = matrix(rnorm(500, mean = 2), nrow = 100, ncol = 5)
dat = rbind(dat.x, dat.y)
grp = c(rep(0, 100), rep(1, 100))
print(dim(dat))

res = iprcomp(dat, center = TRUE, scale. = FALSE)

# for each row, set one artificial missing value
dat.na = dat
nr = nrow(dat.na)
nc = ncol(dat.na)
for (i in 1:nr) {
  posi = sample(x = 1:nc, size = 1)
  dat.na[i, posi] = NA
}

res.na = iprcomp(dat.na, center = TRUE, scale. = FALSE)

##
# pca plot
##
par(mfrow = c(3, 1))

# original data without missing values
plot(x = res$x[, 1], y = res$x[, 2], xlab = "PC1", ylab = "PC2")

# perturbed data with one NA per row
# the pattern of original data is captured
plot(x = res.na$x[, 1], y = res.na$x[, 2], xlab = "PC1", ylab = "PC2",
     main = "with missing values")

par(mfrow = c(1, 1))
```

statVisual documentation built on Feb. 21, 2020, 1:08 a.m.