plot_mds: MDS Plot

plot_mds    R Documentation

MDS Plot

Description

This function plots a low-dimensional projection of an omic data matrix using multidimensional scaling (MDS).

Usage

plot_mds(
  dat,
  group = NULL,
  covar = NULL,
  metric = TRUE,
  dist = "euclidean",
  p = 2L,
  top = 500L,
  filter_method = "pairwise",
  center = FALSE,
  pcs = c(1L, 2L),
  label = FALSE,
  pal_group = "npg",
  pal_covar = "Blues",
  size = NULL,
  alpha = NULL,
  title = "MDS",
  legend = "right",
  hover = FALSE,
  D3 = FALSE
)

Arguments

dat

Omic data matrix or matrix-like object with rows corresponding to probes and columns to samples. It is strongly recommended that data be filtered and normalized prior to plotting. Raw counts stored in DGEList or DESeqDataSet objects are automatically extracted and transformed to the log2-CPM scale, with a warning. Alternatively, an object of class dist, which is passed directly to the MDS algorithm.

group

Optional character or factor vector of length equal to sample size, or up to two such vectors organized into a list or data frame. Supply legend title(s) by passing a named list or data frame.

covar

Optional continuous covariate. If non-NULL, then plot can render at most one group variable. Supply legend title by passing a named list or data frame.

metric

Logical. If TRUE, perform classical (i.e. metric) MDS; if FALSE, perform nonmetric MDS. See Details.

dist

Distance measure to be used. Supports all methods available in dist, Rfast::Dist, and vegdist, as well as those implemented in the bioDist package. See Details.

p

Power of the Minkowski distance. Relevant when dist = "minkowski".

top

Optional number (if > 1) or proportion (if < 1) of top probes to be used for MDS.

filter_method

String specifying whether to apply a "pairwise" or "common" filter if top is non-NULL. See Details.

center

Center each probe prior to computing distances?

pcs

Vector specifying which principal coordinates to plot. Must be of length two unless D3 = TRUE.

label

Label data points by sample name? Defaults to FALSE unless group and covar are both NULL. If TRUE, then plot can render at most one phenotypic feature.

pal_group

String specifying the color palette to use if group is non-NULL, or a vector of such strings with length equal to the number of vectors passed to group. Options include "ggplot", all qualitative color schemes available in RColorBrewer, and the complete collection of ggsci palettes. Alternatively, a character vector of colors with length equal to the cumulative number of levels in group.

pal_covar

String specifying the color palette to use if covar is non-NULL, or a vector of such strings with length equal to the number of vectors passed to covar. Options include the complete collection of viridis palettes, as well as all sequential color schemes available in RColorBrewer. Alternatively, a character vector of colors representing a smooth gradient, or a list of such vectors with length equal to the number of continuous variables to visualize.

size

Point size.

alpha

Point transparency.

title

Optional plot title.

legend

Legend position. Must be one of "bottom", "left", "top", "right", "bottomright", "bottomleft", "topleft", or "topright".

hover

Show sample name by hovering mouse over data point? If TRUE, the plot is rendered in HTML and will either open in your browser's graphic display or appear in the RStudio viewer.

D3

Render plot in three dimensions?

Details

MDS is a family of methods for embedding high-dimensional data in a low-dimensional space, here two or three dimensions. Classical MDS is implemented by the cmdscale function, which finds the projection of a distance matrix that minimizes the strain of the coordinate mapping (Torgerson, 1958). This solution is obtained in closed form by eigendecomposition rather than by iteration. Nonmetric MDS (NMDS) is implemented by the monoMDS function from the vegan package, which iteratively uses isotonic regression to find the monotonic transformation that minimizes the stress of the embedding (Kruskal, 1964).
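As a rough illustration of the two variants described above, the following sketch runs classical MDS with stats::cmdscale and reproduces it by hand through double centering and eigendecomposition. The toy matrix and all variable names are hypothetical, and the NMDS step only runs if the vegan package happens to be installed. This is not the internal code of plot_mds.

```r
set.seed(1)
# Hypothetical toy data: 100 probes (rows) by 6 samples (columns)
mat <- matrix(rnorm(100 * 6), nrow = 100, ncol = 6)
d <- dist(t(mat))  # dist() works on rows, so transpose to compare samples

# Classical (metric) MDS via the base R implementation
mds <- cmdscale(d, k = 2)

# The same embedding by hand: double-center the squared distances,
# then scale the leading eigenvectors by the root of their eigenvalues
n <- attr(d, "Size")
D2 <- as.matrix(d)^2
J <- diag(n) - matrix(1 / n, n, n)
B <- -0.5 * J %*% D2 %*% J
eig <- eigen(B, symmetric = TRUE)
manual <- eig$vectors[, 1:2] %*% diag(sqrt(eig$values[1:2]))

# Coordinates agree up to the sign of each axis
max_diff <- max(abs(abs(mds) - abs(manual)))

# Nonmetric MDS, only if vegan is available (an assumption about the setup)
if (requireNamespace("vegan", quietly = TRUE)) {
  nmds <- vegan::monoMDS(d, k = 2)
  nmds$stress  # stress of the fitted nonmetric embedding
}
```

Because eigenvectors are only defined up to sign, the comparison uses absolute values of the coordinates.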

MDS requires a distance matrix as input. Available distance measures are: "euclidean", "maximum", "manhattan", "canberra", "minkowski", "cosine", "pearson", "kendall", "spearman", "bray", "kulczynski", "jaccard", "gower", "altGower", "morisita", "horn", "mountford", "raup", "binomial", "chao", "cao", "mahalanobis", "MI", and "KLD". Some distance measures are unsuitable for certain types of data. See dist_mat for more details on these methods and links to documentation for each. Users may also directly input a distance matrix calculated using some custom method.
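For the custom-distance route mentioned above, one possible sketch builds a correlation-based distance in base R and converts it to class dist. The toy data, the sample names, and the choice of one minus the Pearson correlation are illustrative assumptions. The final call only runs if bioplotr is installed.

```r
set.seed(1)
# Hypothetical toy data: 200 probes (rows) by 8 samples (columns)
mat <- matrix(rnorm(200 * 8), nrow = 200, ncol = 8,
              dimnames = list(NULL, paste0("S", 1:8)))

# cor() correlates columns, so samples need no transposition here.
# One minus the correlation serves as an illustrative custom distance.
d_custom <- as.dist(1 - cor(mat, method = "pearson"))

# Pass the dist object in place of the data matrix
if (requireNamespace("bioplotr", quietly = TRUE)) {
  bioplotr::plot_mds(d_custom)
}
```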

If top is non-NULL, then data can be filtered either by probewise variance (filter_method = "common") or using the leading fold change method implemented in limma (filter_method = "pairwise"; Ritchie et al., 2015). In the latter case, the distance between each pair of samples is calculated using only the top most differentially expressed probes between those two samples. This method is appropriate when different molecular pathways are relevant for distinguishing different pairs of samples. To run MDS on the complete data, set top = NULL. With metric = TRUE and dist = "euclidean", this is functionally equivalent to running PCA on the full matrix. See plot_pca.
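The two filters and the PCA equivalence can be sketched in base R as follows. This is a minimal illustration, not the package's internal code: the pairwise distance is taken as the root mean square of the largest differences between two samples, following the approach used by limma's plotMDS, and all data and variable names are hypothetical.

```r
set.seed(1)
# Hypothetical toy data: 1000 probes (rows) by 5 samples (columns)
mat <- matrix(rnorm(1000 * 5), nrow = 1000, ncol = 5)
top <- 100L

# "common" filter: keep the most variable probes, one set for all pairs
vars <- apply(mat, 1, var)
keep <- order(vars, decreasing = TRUE)[seq_len(top)]
d_common <- dist(t(mat[keep, ]))

# "pairwise" filter: for each pair of samples, use only the probes
# with the largest squared differences between those two samples
n <- ncol(mat)
dm <- matrix(0, n, n)
for (i in seq_len(n - 1L)) {
  for (j in (i + 1L):n) {
    sq <- sort((mat[, i] - mat[, j])^2, decreasing = TRUE)[seq_len(top)]
    dm[i, j] <- dm[j, i] <- sqrt(mean(sq))
  }
}
d_pairwise <- as.dist(dm)

# No filter: classical MDS on Euclidean distances matches PCA scores
# up to the sign of each axis
mds_full <- cmdscale(dist(t(mat)), k = 2)
pca <- prcomp(t(mat))  # centers each probe by default, no scaling
pca_diff <- max(abs(abs(unname(pca$x[, 1:2])) - abs(mds_full)))
```

With the pairwise filter, each entry of the distance matrix can draw on a different probe set, which is why no single filtered data matrix exists in that case.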

References

Cox, T.F. & Cox, M.A.A. (2001). Multidimensional Scaling. Second edition. Chapman and Hall.

Kruskal, J.B. (1964). Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis. Psychometrika, 29(1): 1-27.

Ritchie, M.E., Phipson, B., Wu, D., Hu, Y., Law, C.W., Shi, W., & Smyth, G.K. (2015). limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res., 43(7): e47.

Torgerson, W.S. (1958). Theory and Methods of Scaling. New York: Wiley.

See Also

plotMDS, plot_pca

Examples

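# Simulated data: 1000 probes by 5 samples
mat <- matrix(rnorm(1000 * 5), nrow = 1000, ncol = 5)
plot_mds(mat)

# Example count data, transformed with rlog before plotting
library(DESeq2)
dds <- makeExampleDESeqDataSet()
dds <- rlog(dds)
# Color points by experimental condition
plot_mds(dds, group = colData(dds)$condition)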
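A further call, sketched here under the assumption that bioplotr is installed, combines several of the documented arguments: a named list for group to set the legend title, top given as a proportion, the "common" filter, and a correlation-based distance. The data, sample names, and group labels are made up for illustration.

```r
set.seed(1)
# Hypothetical toy data: 1000 probes (rows) by 6 samples (columns)
mat <- matrix(rnorm(1000 * 6), nrow = 1000, ncol = 6,
              dimnames = list(NULL, paste0("S", 1:6)))

# A named list supplies the legend title ("Condition")
grp <- list(Condition = rep(c("A", "B"), each = 3))

if (requireNamespace("bioplotr", quietly = TRUE)) {
  # Keep the top 10% of probes by variance and use Pearson distance
  bioplotr::plot_mds(mat, group = grp, top = 0.1,
                     filter_method = "common", dist = "pearson")
}
```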


dswatson/bioplotr documentation built on March 3, 2023, 9:43 p.m.