This is a collection of **R** functions I've used to plot singular vectors, principal components, and related quantities.
Often, I explore multiple ways of visualizing outputs of dimension reduction techniques; it may aid in confirming correct applications, validating against known labels, or generating new hypotheses.
Please feel free to add your favorite ways of visualizing dimension reduction techniques or latent variable models.

To use a stable version from CRAN:

```
install.packages("svdvis")
```

To use a development version from GitHub:

```
install.packages("devtools")
library("devtools")
install_github("ncchung/svdvis")
```

This package provides several visualization functions for singular value decomposition (SVD), principal component analysis (PCA), factor analysis (FA), logistic factor analysis (LFA), and other related methods.

To use in this vignette, we create a simulated dataset, with `m=500`

variables (rows) and `n=20`

samples (columns). Particularly, it contains a latent variable that resembles a case-control study. After applying SVD to the datasets, we also name the rows and the columns of the right singular vectors `svd.obj$v`

for labels in visualization.

```{r, fig.show='hold'}
set.seed(1234)
library(svdvis)
B = c(runif(100, min=0, max=1), rep(0,400))
L = c(rep(1, 10), rep(-1, 10))
L = L / sd(L)
E = matrix(rnorm(500*20), nrow=500)
Y = B %*% t(L) + E

svd.obj = svd(Y) colnames(svd.obj$v) = paste0("V",1:20) rownames(svd.obj$v) = paste0("Sample",1:20)

```
In this setup, a few right singular vectors contained in `svd.obj$v` may capture systematic variation in the observed data `Y`. Since the right singular vectors are ordered according to the singular values in a descending order, the top (or first) `r` right singular vectors refers to `svd.obj$v[,1:r]`. Note that principal components (PCs) can be obtained by multiplying singular values `svd.obj$d` and right singular vectors `svd.obj$v`. All examples in this vignette and all functions in `svdvis` can utilize `weights="sv"` to quickly visualize PCs.
### Scree plot
A scree plot visualizes percentages of variance explained by singular vectors in a descending order. `svd.scree` is simply a wrapper function using `ggplot2`. In high-dimensional datasets, the number of points in a scree plot may be too large. It may be good to look at a subset of singular values. You can specify `subr` in `svd.scree` function, which "zooms in" to the top `subr` singular values.
```{r}
svd.scree(svd.obj, subr=5,
axis.title.x="Full scree plot", axis.title.y="% Var Explained")
```

Note that if `subr`

is not specified, one full-sized scree plot is returned.

Scatter plots are often utilized to look at the top 2 right singular vectors. `svd.scatter`

produces a matrix of scatterplots of all pairs among `r`

right singular vectors.

```
svd.scatter(svd.obj)
```

The above plot crams in too many pairs. We can specify `r`

to visualize only the top `r`

right singular vectors. In this example, additional arguments such as `group`

and `alpha`

are included:

```
svd.scatter(svd.obj, r=3, alpha=.5,
group=c(rep("Group 1", 10), rep("Group 2", 10)))
```

Let's create a heat map of the top `r=5`

right singular vectors:

```
svd.heatmap(svd.obj, r=5)
```

A parallel coordinates plot shows `r`

dimensions in `r`

parallel lines, which are equally spaced. All data points are rescaled to (0,1) and the top `r`

singular vectors are visualized from left to right. Different groups are colored accordingly:

```
svd.parallel(svd.obj, r=5, alpha=.5,
group=c(rep("Group 1", 10), rep("Group 2", 10)))
```

A radial coordinates plot visualize `r`

dimensions in a circle, around where `r`

anchors are placed. Each of `n`

vectors is mapped onto a circle, using its data as spring constants. Prior to mapping, each column is rescaled to have numeric values between 0 and 1.

```
svd.radial(svd.obj, r=3,
group=c(rep("Group 1", 10), rep("Group 2", 10)))
```

All functions in `svdvis`

use `ggplot2`

. Therefore, the visual output can be saved and modified in a conventional manner. Feel free to experiment the source codes for more complex or interesting cases.

While this vignette focused on using the results of SVD, an optional argument `weights="sv"`

can be used for visualizing PCs. Note that `weights="sv"`

is simply calling `weights = svd.obj$d[1:r]`

.

Outputs from other dimension reduction methods can be used. Provide the `r`

vectors to `svd.obj`

in any function. Note that the input must be a `n * r`

matrix that contains `r`

vectors as columns. An optional argument `group`

can be used to differentially indicate `n`

samples (points, lines, etc).

For example, logistic factor analysis captures population structure from a large and diverse set of genome sequences and is related to SVD and PCA. A R package `lfa`

computes `r`

logistic factors, as columns. You can easily make a parallel coordinates plot (and others) by `svd.parallel(svd.obj=lfa(genotypes, 10))`

.

ncchung/svdvis documentation built on May 23, 2019, 1:05 p.m.

Embedding an R snippet on your website

Add the following code to your website.

For more information on customizing the embed code, read Embedding Snippets.