plot_similarity: Similarity Matrix Heatmap

Description

This function displays pairwise distances between samples as a heatmap.

Usage

plot_similarity(
  dat,
  group = NULL,
  covar = NULL,
  dist = "euclidean",
  p = 2L,
  top = NULL,
  filter_method = "pairwise",
  center = FALSE,
  hclustfun = "complete",
  pal_group = "npg",
  pal_covar = "Blues",
  pal_tiles = "RdBu",
  title = "Sample Similarity Matrix"
)

Arguments

dat

Omic data matrix or matrix-like object with rows corresponding to probes and columns to samples. It is strongly recommended that data be filtered and normalized prior to plotting. Raw counts stored in DGEList or DESeqDataSet objects are automatically extracted and transformed to the log2-CPM scale, with a warning. Alternatively, an object of class dist.

group

Optional character or factor vector of length equal to sample size. Alternatively, a data frame or list of such vectors, optionally named.

covar

Optional continuous covariate of length equal to sample size. Alternatively, a data frame or list of such vectors, optionally named.

dist

Distance measure to be used. Supports all methods available in dist and vegdist, as well as those implemented in the bioDist package. See Details.

p

Power of the Minkowski distance.

top

Optional number (if > 1) or proportion (if < 1) of top probes to be used for distance calculations.

filter_method

String specifying whether to apply a "pairwise" or "common" filter if top is non-NULL. See Details.

center

Center each probe prior to computing distances?

hclustfun

The agglomeration method to be used for hierarchical clustering. Supports any method available in hclust.

pal_group

String specifying the color palette to use if group is non-NULL, or a vector of such strings with length equal to the number of vectors passed to group. Options include "ggplot", all qualitative color schemes available in RColorBrewer, and the complete collection of ggsci palettes. Alternatively, a character vector of colors with length equal to the cumulative number of levels in group.

pal_covar

String specifying the color palette to use if covar is non-NULL, or a vector of such strings with length equal to the number of vectors passed to covar. Options include the complete collection of viridis palettes, as well as all sequential color schemes available in RColorBrewer. Alternatively, a character vector of colors representing a smooth gradient, or a list of such vectors with length equal to the number of continuous variables to visualize.

pal_tiles

String specifying the color palette to use for heatmap tiles. Options include the complete collection of viridis palettes, as well as all sequential and divergent color schemes available in RColorBrewer. Alternatively, a character vector of at least two colors.

title

Optional plot title.

Details

Similarity matrices are a valuable tool for exploratory data analysis. A hierarchical clustering dendrogram atop the figure helps identify potential clusters and/or outliers in the data. Annotation tracks can help investigate associations with phenotypic features.

Different distance metrics and agglomeration methods can lead to different results. The default settings, which perform complete linkage hierarchical clustering on a Euclidean distance matrix, are mathematically straightforward and commonly used for omic EDA. Average linkage is also fairly common.
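To make the effect of these choices concrete, the following is a minimal base R sketch of the computation that underlies a similarity heatmap: a sample-by-sample distance matrix, reordered by a hierarchical clustering. It is an illustration only and makes no claim about how plot_similarity is implemented internally; the object names (mat, d, hc_complete, hc_average, dm_ordered) are chosen for this example.

```r
# Illustrative sketch in base R; not bioplotr's internal implementation.
set.seed(1)
mat <- matrix(rnorm(5000), nrow = 1000, ncol = 5,
              dimnames = list(NULL, paste0("S", 1:5)))

# dist() measures distances between rows, so transpose the
# probes-by-samples matrix to get sample-by-sample distances
d <- dist(t(mat), method = "euclidean")

# Cluster the same distance matrix with two agglomeration methods
hc_complete <- hclust(d, method = "complete")
hc_average  <- hclust(d, method = "average")

# Reorder the symmetric distance matrix by the dendrogram's leaf order,
# which is the arrangement a clustered heatmap displays
dm <- as.matrix(d)
dm_ordered <- dm[hc_complete$order, hc_complete$order]

# Later merge heights can differ between linkage methods
hc_complete$height
hc_average$height
```

Both methods merge the two closest samples first, so the first merge height is the same; the methods diverge in how they score distances between larger clusters.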

Other available distance measures include "maximum", "manhattan", "canberra", "minkowski", "cosine", "pearson", "kendall", "spearman", "bray", "kulczynski", "jaccard", "gower", "altGower", "morisita", "horn", "mountford", "raup", "binomial", "chao", "cao", "mahalanobis", "MI", and "KLD". See dist_mat for more details on these methods and links to documentation on each.
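As a rough guide to how some of these measures relate, the base R sketch below compares Minkowski distances at two values of p with their Euclidean and Manhattan counterparts, and builds correlation-based distances as one minus the correlation coefficient. The 1 - r convention is an assumption made for illustration; dist_mat may define "pearson" and "spearman" differently.

```r
# Illustrative sketch in base R; conventions here are assumptions,
# not a description of bioplotr's dist_mat.
set.seed(1)
mat <- matrix(rnorm(500), nrow = 100, ncol = 5)

# Minkowski distance with p = 2 coincides with Euclidean distance,
# and with p = 1 it coincides with Manhattan distance
d_mink2 <- dist(t(mat), method = "minkowski", p = 2)
d_eucl  <- dist(t(mat), method = "euclidean")
d_mink1 <- dist(t(mat), method = "minkowski", p = 1)
d_manh  <- dist(t(mat), method = "manhattan")

# One common convention for correlation-based distance: 1 - r.
# cor() correlates columns, so no transpose is needed here.
d_pearson  <- as.dist(1 - cor(mat, method = "pearson"))
d_spearman <- as.dist(1 - cor(mat, method = "spearman"))
```

Under the 1 - r convention, distances lie between 0 (perfect positive correlation) and 2 (perfect negative correlation).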

The top argument optionally filters data using either probewise variance (if filter_method = "common") or the leading fold change method of Smyth et al. (if filter_method = "pairwise"). See plot_mds for more details.
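The difference between the two filters can be sketched in base R as follows. The "common" branch keeps one set of high-variance probes for all samples; the "pairwise" branch follows the leading fold change idea, in which each pair of samples is compared using the probes that differ most between those two samples, summarized as a root mean square. This is a sketch under those assumptions, not bioplotr's code, and the value of top and all object names are hypothetical.

```r
# Illustrative sketch in base R; not bioplotr's internal implementation.
set.seed(1)
mat <- matrix(rnorm(5000), nrow = 1000, ncol = 5,
              dimnames = list(NULL, paste0("S", 1:5)))
top <- 100  # hypothetical number of probes to keep

# "common": rank probes by variance across all samples and use the
# same top probes for every pair of samples
vars <- apply(mat, 1, var)
keep <- order(vars, decreasing = TRUE)[seq_len(top)]
d_common <- dist(t(mat[keep, ]))

# "pairwise": for each pair of samples, take the probes with the
# largest squared differences between those two samples and use the
# root mean square of those differences as the distance
n <- ncol(mat)
dm <- matrix(0, n, n, dimnames = list(colnames(mat), colnames(mat)))
for (i in seq_len(n - 1)) {
  for (j in (i + 1):n) {
    sq_diff <- (mat[, i] - mat[, j])^2
    top_sq <- sort(sq_diff, decreasing = TRUE)[seq_len(top)]
    dm[i, j] <- dm[j, i] <- sqrt(mean(top_sq))
  }
}
d_pairwise <- as.dist(dm)
```

Because the pairwise filter picks a different probe set for each pair, it can highlight differences that involve only a few samples, whereas the common filter favors probes that vary across the whole data set.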

Examples

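library(bioplotr)

# Random Gaussian data: no cluster structure is expected
mat <- matrix(rnorm(5000), nrow = 1000, ncol = 5)
plot_similarity(mat, title = "Nothin' Doin'")

# Simulated count data, rlog-transformed before plotting,
# with samples annotated by experimental condition
library(DESeq2)
dds <- makeExampleDESeqDataSet()
dds <- rlog(dds)
plot_similarity(dds, group = colData(dds)$condition,
                title = "Somethin' Cookin'")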


dswatson/bioplotr documentation built on March 3, 2023, 9:43 p.m.