In taryue/GMDecomp: Visualization tools for the Generalized Matrix Decomposition (GMD).

Introduction

In this vignette, we illustrate how to construct the GMD-biplot and screeplot using the tobacco data set from [@Satten2017]. This data set includes 15 smokeless tobacco products: 6 dry snuffs, 7 moist snuffs, and 2 toombak samples from Sudan. Three separate (replicate) observations (starting with sample preparation) were made of each product, so that in total 45 observations are available. Each observation has a 271 × 1 vector of taxon counts. To make the measurements comparable, we consider the centered log ratio (CLR) transformation of the data set. Additionally, the squared weighted UniFrac distance, denoted $\Delta$, is used to measure the distance between samples. The corresponding similarity kernel $H$ is derived from $\Delta$ using the Gower's centering matrix.

Step 1: Loading the tobacco data set

We first load our R package GMDecomp.

library(GMDecomp)

The data object tobacco_clr in the package include

data: the CLR transformed out table, with rows for samples and columns for OTU.
$H$: the similarity kernel derived from the squared weighted UniFrac distance.
sample.col: the color for plotting each sample point.
sample.pch: the shape for plotting each sample point.
out.names: the taxonomic name of each OTU.

Step 2: Generalized Matrix Decomposition

To construct the GMD-biplot and screeplot, we need to first perform the generalized matrix decomposition [@Allen2014] of the data with respect to $H$. This can be easily achieved using the following line:

tobacco.gmd <- GMD(X = tobacco_clr$data, H = tobacco_clr$H, Q = diag(1, dim(tobacco_clr$data)[2]), K = 10)

Note that here we don't have a similarity kernel for the OTUs, so $Q$ is set to be an identity matrix. One can set $Q$ to be any informative positive semi-definite matrix, if such information is available. Also, here we set $K = 10$, since we want to display the screeplot the top 10 GMD components. If only the GMD-biplot is needed, one can set $K = 2$, which may save computational time.

tabacco.gmd is a list of class gmd, which consists of the following variables.

U: the left GMD components with 45 rows and 10 columns.
S: the top 10 GMD values.
V: the right GMD components with 271 rows and 10 columns.
H: the similarity matrix for samples.
Q: the similarity matrix for OTUs.

Step 3: The GMD-biplot and screeplot

Once the GMD outputs are obtained, the screeplot can by easily constructed as follows.

screeplot(tobacco.gmd) #the screeplot of the top 10 GMD components

Note that one can select specific OTUs to display in the GMD-biplot. For this analysis, we display the top 3 OTUs that have the longest arrows.

gmd.order = order(rowSums(tobacco.gmd$V[,1:2]^2), decreasing = T)
plot.index = gmd.order[1:3]
plot.names = tobacco_clr$otu.names[plot.index]
biplot(fit = tobacco.gmd, index = plot.index, names = plot.names, sample.col = tobacco_clr$sample.color, sample.pch = tobacco_clr$sample.pch, arrow.col = 'grey50')

References

taryue/GMDecomp documentation built on Nov. 5, 2019, 10 a.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

taryue/GMDecomp
Visualization tools for the Generalized Matrix Decomposition (GMD).

In taryue/GMDecomp: Visualization tools for the Generalized Matrix Decomposition (GMD).

Introduction

Step 1: Loading the tobacco data set

Step 2: Generalized Matrix Decomposition

Step 3: The GMD-biplot and screeplot

References

R Package Documentation

Browse R Packages

We want your feedback!

taryue/GMDecomp Visualization tools for the Generalized Matrix Decomposition (GMD).

In taryue/GMDecomp: Visualization tools for the Generalized Matrix Decomposition (GMD).

Introduction

Step 1: Loading the tobacco data set

Step 2: Generalized Matrix Decomposition

Step 3: The GMD-biplot and screeplot

References

R Package Documentation

Browse R Packages

We want your feedback!

taryue/GMDecomp
Visualization tools for the Generalized Matrix Decomposition (GMD).