View source: R/plot_similarity.R
plot_similarity | R Documentation |
This function displays pairwise distances between samples as a heatmap.
plot_similarity(
  dat,
  group = NULL,
  covar = NULL,
  dist = "euclidean",
  p = 2L,
  top = NULL,
  filter_method = "pairwise",
  center = FALSE,
  hclustfun = "average",
  pal_group = "npg",
  pal_covar = "Blues",
  pal_tiles = "RdBu",
  title = "Sample Similarity Matrix"
)
dat |
Omic data matrix or matrix-like object with rows corresponding to
probes and columns to samples. It is strongly recommended that data be
filtered and normalized prior to plotting. Raw counts stored in |
group |
Optional character or factor vector of length equal to sample size. Alternatively, a data frame or list of such vectors, optionally named. |
covar |
Optional continuous covariate of length equal to sample size. Alternatively, a data frame or list of such vectors, optionally named. |
dist |
Distance measure to be used. Supports all methods available in
|
p |
Power of the Minkowski distance. |
top |
Optional number (if > 1) or proportion (if < 1) of top probes to be used for distance calculations. |
filter_method |
String specifying whether to apply a |
center |
Center each probe prior to computing distances? |
hclustfun |
The agglomeration method to be used for hierarchical
clustering. Supports any method available in |
pal_group |
String specifying the color palette to use if |
pal_covar |
String specifying the color palette to use if |
pal_tiles |
String specifying the color palette to use for heatmap
tiles. Options include the complete collection of
|
title |
Optional plot title. |
Similarity matrices are a valuable tool for exploratory data analysis. A hierarchical clustering dendrogram atop the figure helps identify potential clusters and/or outliers in the data. Annotation tracks can help investigate associations with phenotypic features.
Different distance metrics and agglomeration methods can lead to different results. The default settings, which perform average linkage hierarchical clustering on a Euclidean distance matrix, are mathematically straightforward and commonly used for omic EDA. Complete linkage is also fairly common.
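To illustrate how the agglomeration method can change the outcome, here is a minimal sketch that uses only base R (`stats::dist` and `stats::hclust`). It does not reproduce the internals of `plot_similarity`; the simulated matrix, the shifted block of probes, and the object names are assumptions made for illustration.

```r
# Simulated probes-by-samples matrix: 200 probes, 6 samples (illustrative data)
set.seed(1)
mat <- matrix(rnorm(200 * 6), nrow = 200, ncol = 6,
              dimnames = list(NULL, paste0("S", 1:6)))

# Shift samples 4-6 on the first 50 probes to create some group structure
mat[1:50, 4:6] <- mat[1:50, 4:6] + 3

# dist() computes distances between rows, so transpose to compare samples
d <- dist(t(mat), method = "euclidean")

# Same distance matrix, two agglomeration methods
hc_avg <- hclust(d, method = "average")
hc_cmp <- hclust(d, method = "complete")

# Cross-tabulate the two-cluster solutions from each tree
cl_avg <- cutree(hc_avg, k = 2)
cl_cmp <- cutree(hc_cmp, k = 2)
print(table(average = cl_avg, complete = cl_cmp))

# Merge heights differ between methods even when the clusters agree
print(hc_avg$height)
print(hc_cmp$height)
```

Comparing the cross-tabulation and the merge heights is a quick way to check whether a cluster or outlier seen in the heatmap is robust to the choice of linkage.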
Other available distance measures include: "maximum", "manhattan", "canberra", "minkowski", "cosine", "pearson", "kendall", "spearman", "bray", "kulczynski", "jaccard", "gower", "altGower", "morisita", "horn", "mountford", "raup", "binomial", "chao", "cao", "mahalanobis", "MI", or "KLD". See dist_mat for more details on these methods and links to documentation on each.
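As a rough illustration of how some of these measures relate to one another, the sketch below uses base R only. The Minkowski examples rely on `stats::dist`, where `p = 2` recovers the Euclidean distance and `p = 1` the Manhattan distance. The correlation-based distance is computed here as one minus the correlation, which is a common convention and an assumption on our part; consult dist_mat for the definition the package actually uses.

```r
# Simulated probes-by-samples matrix: 100 probes, 4 samples (illustrative data)
set.seed(2)
mat <- matrix(rnorm(100 * 4), nrow = 100, ncol = 4)

# Minkowski distance with power p; samples are columns, so transpose first
d_mink2 <- dist(t(mat), method = "minkowski", p = 2)
d_mink1 <- dist(t(mat), method = "minkowski", p = 1)

# p = 2 matches Euclidean distance, p = 1 matches Manhattan distance
d_eucl <- dist(t(mat), method = "euclidean")
d_manh <- dist(t(mat), method = "manhattan")
print(all.equal(as.vector(d_mink2), as.vector(d_eucl)))
print(all.equal(as.vector(d_mink1), as.vector(d_manh)))

# Assumed convention for a correlation-based distance: 1 - correlation
# cor() correlates columns, so no transpose is needed here
d_pear <- as.dist(1 - cor(mat, method = "pearson"))
print(d_pear)
```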
The top argument optionally filters data using either probewise variance (if filter_method = "common") or the leading fold change method of Smyth et al. (if filter_method = "pairwise"). See plot_mds for more details.
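The difference between the two filters can be sketched in base R as follows. This is an approximation under stated assumptions, not the package's implementation: the "common" filter is taken to mean keeping the highest-variance probes across all samples, and the "pairwise" filter is taken to mean the root-mean-square of the largest differences for each pair of samples, in the spirit of the leading fold change approach. The object names and the value of `top` are hypothetical.

```r
# Simulated probes-by-samples matrix: 500 probes, 5 samples (illustrative data)
set.seed(3)
mat <- matrix(rnorm(500 * 5), nrow = 500, ncol = 5)
top <- 100  # hypothetical number of probes to keep

# "common" (assumed): one probe set for all samples, chosen by variance
vars <- apply(mat, 1, var)
keep <- order(vars, decreasing = TRUE)[seq_len(top)]
d_common <- dist(t(mat[keep, ]), method = "euclidean")

# "pairwise" (assumed): a separate probe set for each pair of samples,
# chosen as the `top` largest squared differences between the two samples
n <- ncol(mat)
d_pair <- matrix(0, nrow = n, ncol = n)
for (i in 2:n) {
  for (j in 1:(i - 1)) {
    sq_diff <- sort((mat[, i] - mat[, j])^2, decreasing = TRUE)[seq_len(top)]
    # Root-mean-square of the leading differences for this pair
    d_pair[i, j] <- d_pair[j, i] <- sqrt(mean(sq_diff))
  }
}
d_pair <- as.dist(d_pair)

print(d_common)
print(d_pair)
```

Note that the two results are on different scales in this sketch: `d_common` is a root sum of squares, whereas `d_pair` is a root mean of squares, so they differ by a factor related to `sqrt(top)` in addition to the choice of probes.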
# Random data: no structure expected
mat <- matrix(rnorm(5000), nrow = 1000, ncol = 5)
plot_similarity(mat, title = "Nothin' Doin'")

# Simulated RNA-seq data, annotated by experimental condition
library(DESeq2)
dds <- makeExampleDESeqDataSet()
dds <- rlog(dds)
plot_similarity(dds, group = colData(dds)$condition, title = "Somethin' Cookin'")