plot_tsne | R Documentation |
This function plots a low-dimensional projection of an omic data matrix using t-distributed stochastic neighbor embedding.
plot_tsne(
  dat,
  group = NULL,
  covar = NULL,
  dist = "euclidean",
  p = 2L,
  top = NULL,
  filter_method = "pairwise",
  center = FALSE,
  dims = c(1L, 2L),
  perplexity = ncol(dat)/4L,
  theta = 0.1,
  max_iter = 1000L,
  label = FALSE,
  pal_group = "npg",
  pal_covar = "Blues",
  size = NULL,
  alpha = NULL,
  title = "t-SNE",
  legend = "right",
  hover = FALSE,
  D3 = FALSE,
  ...
)
dat |
Omic data matrix or matrix-like object with rows corresponding to
probes and columns to samples. It is strongly recommended that data be
filtered and normalized prior to plotting. Raw counts stored in |
group |
Optional character or factor vector of length equal to sample size, or up to two such vectors organized into a list or data frame. Supply legend title(s) by passing a named list or data frame. |
covar |
Optional continuous covariate. If non- |
dist |
Distance measure to be used. Supports all methods available in
|
p |
Power of the Minkowski distance. |
top |
Optional number (if > 1) or proportion (if < 1) of top probes to be used for t-SNE. |
filter_method |
String specifying whether to apply a |
center |
Center each probe prior to computing distances? |
dims |
Vector specifying which dimensions to plot. Must be of length
two unless |
perplexity |
How many nearest neighbors should the algorithm consider when building projections? |
theta |
Speed/accuracy tradeoff of the Barnes-Hut algorithm. See Details. |
max_iter |
Maximum number of iterations over which to minimize the loss function. |
label |
Label data points by sample name? Defaults to |
pal_group |
String specifying the color palette to use if |
pal_covar |
String specifying the color palette to use if |
size |
Point size. |
alpha |
Point transparency. |
title |
Optional plot title. |
legend |
Legend position. Must be one of |
hover |
Show sample name by hovering mouse over data point? If |
D3 |
Render plot in three dimensions? |
... |
Additional arguments to be passed to |
t-SNE is a popular machine learning method for visualizing high-dimensional datasets. It is designed to preserve local structure and aids in revealing unsupervised clusters.
plot_tsne relies on a C++ implementation of the Barnes-Hut algorithm, which vastly accelerates the original t-SNE projection method. An exact t-SNE plot may be rendered by setting theta = 0. Briefly, the algorithm computes samplewise similarities based on distances in the original p-dimensional space (where p = the number of probes); generates a low-dimensional embedding of the samples based on the user-defined perplexity parameter; and iteratively minimizes the Kullback-Leibler divergence between these two distributions using an efficient tree search. See Rtsne for more details. A thorough introduction to and explication of the original t-SNE method and the Barnes-Hut approximation may be found in the references below.
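The loss described above can be illustrated with a minimal base R sketch. It is not the Barnes-Hut implementation that plot_tsne calls: it uses a single fixed Gaussian bandwidth instead of calibrating one per sample to the perplexity, and it only evaluates the Kullback-Leibler divergence for one random candidate embedding rather than minimizing it. All object names are illustrative.

```r
set.seed(1)
# Toy data in plot_tsne's layout: 50 probes (rows) by 12 samples (columns)
mat <- matrix(rnorm(50 * 12), nrow = 50, ncol = 12)

# High-dimensional affinities P from samplewise squared Euclidean distances.
# Simplification (assumption): one fixed Gaussian bandwidth for all samples,
# whereas t-SNE tunes a bandwidth per sample to match the perplexity.
d2 <- as.matrix(dist(t(mat)))^2
P <- exp(-d2 / (2 * median(d2[upper.tri(d2)])))
diag(P) <- 0
P <- P / sum(P)

# Low-dimensional affinities Q from a candidate 2-D embedding Y,
# using the heavy-tailed Student-t kernel
Y <- matrix(rnorm(12 * 2), nrow = 12, ncol = 2)
Q <- 1 / (1 + as.matrix(dist(Y))^2)
diag(Q) <- 0
Q <- Q / sum(Q)

# Kullback-Leibler divergence KL(P || Q) over all pairs of distinct samples;
# t-SNE moves the rows of Y to make this value as small as possible
off <- row(P) != col(P)
kl <- sum(P[off] * log(P[off] / Q[off]))
kl
```

A random Y gives a comparatively large divergence; an optimizer such as the one in Rtsne would update Y iteratively, up to max_iter times, to reduce it.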
The Rtsne function can operate directly on a distance matrix.
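As a rough sketch of what passing a distance matrix to Rtsne might look like (the internals of plot_tsne are not shown here, so this is an assumption about usage rather than the package's actual code), samplewise distances can be computed with base R and handed over with is_distance = TRUE. Rtsne stops with an error when the perplexity is too large for the number of samples; the check below assumes the bound 3 * perplexity <= n - 1.

```r
set.seed(1)
# Toy data in plot_tsne's layout: 100 probes (rows) by 20 samples (columns)
mat <- matrix(rnorm(100 * 20), nrow = 100, ncol = 20)

# Samplewise distances: dist() compares rows, so transpose first
d <- dist(t(mat), method = "euclidean")

# Default perplexity from the usage section; the bound checked here is an
# assumption about how Rtsne validates perplexity against sample size
perplexity <- ncol(mat) / 4
stopifnot(3 * perplexity <= ncol(mat) - 1)

# Guarded so the sketch still runs where Rtsne is not installed
if (requireNamespace("Rtsne", quietly = TRUE)) {
  fit <- Rtsne::Rtsne(d, is_distance = TRUE, dims = 2,
                      perplexity = perplexity, theta = 0.1, max_iter = 1000)
  head(fit$Y)  # one row of 2-D coordinates per sample
}
```

With very few samples the default perplexity of ncol(dat)/4 becomes small, so checking the bound before plotting can save a confusing error.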
Available distance measures include: "euclidean", "maximum", "manhattan", "canberra", "minkowski", "cosine", "pearson", "kendall", "spearman", "bray", "kulczynski", "jaccard", "gower", "altGower", "morisita", "horn", "mountford", "raup", "binomial", "chao", "cao", "mahalanobis", "MI", or "KLD". Some distance measures are unsuitable for certain types of data. See dist_mat for more details on these methods and links to documentation on each.
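To give a feel for how a few of these measures differ, the following base R sketch computes samplewise distances without calling dist_mat. The 1 - correlation form used for the correlation-based measures is an assumption and may not match the definition dist_mat applies.

```r
set.seed(1)
# Toy data: 100 probes (rows) by 8 samples (columns)
mat <- matrix(rnorm(100 * 8), nrow = 100, ncol = 8,
              dimnames = list(NULL, paste0("S", 1:8)))

# dist() compares rows, so transpose to get sample-by-sample distances
d_euc  <- dist(t(mat), method = "euclidean")
d_mink <- dist(t(mat), method = "minkowski", p = 3)  # p = Minkowski power

# Correlation-based dissimilarities; cor() already works column by column.
# The 1 - r form is an assumption, not necessarily dist_mat's definition.
d_pear  <- as.dist(1 - cor(mat, method = "pearson"))
d_spear <- as.dist(1 - cor(mat, method = "spearman"))

# Minkowski distance with p = 2 reduces to Euclidean distance
all.equal(c(dist(t(mat), method = "minkowski", p = 2)), c(d_euc))
```

Correlation-based measures ignore differences in overall scale between samples, which is one reason a measure that suits normalized expression data may be a poor fit for raw counts.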
The top argument optionally filters data using either probewise variance (if filter_method = "common") or the leading fold change method of Smyth et al. (if filter_method = "pairwise"). See plot_mds for more details.
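The two filtering strategies can be sketched in base R as follows. The helper names (n_top, filter_common, lfc_dist) are hypothetical and are not part of the package. The pairwise version follows the leading fold change idea as commonly described: the distance for each pair of samples is the root mean square of the largest differences between them, so each pair may use a different set of probes.

```r
set.seed(1)
# Toy data: 200 probes (rows) by 6 samples (columns)
mat <- matrix(rnorm(200 * 6), nrow = 200, ncol = 6)

# Hypothetical helper: turn `top` into a probe count
# (a count if > 1, a proportion of all probes otherwise)
n_top <- function(top, n_probes) {
  if (top > 1) min(as.integer(top), n_probes) else max(1L, round(top * n_probes))
}

# "common": keep one set of top probes, ranked by probewise variance
filter_common <- function(x, top) {
  k <- n_top(top, nrow(x))
  vars <- apply(x, 1, var)
  x[order(vars, decreasing = TRUE)[seq_len(k)], , drop = FALSE]
}

# "pairwise": leading fold change distance, computed separately for each
# sample pair from the k probes with the largest squared differences
lfc_dist <- function(x, top) {
  k <- n_top(top, nrow(x))
  n <- ncol(x)
  d <- matrix(0, n, n)
  for (i in seq_len(n - 1)) {
    for (j in (i + 1):n) {
      sq <- sort((x[, i] - x[, j])^2, decreasing = TRUE)[seq_len(k)]
      d[i, j] <- d[j, i] <- sqrt(mean(sq))
    }
  }
  as.dist(d)
}

sub   <- filter_common(mat, top = 0.25)  # keeps a quarter of the probes
d_lfc <- lfc_dist(mat, top = 50)         # distances from 50 probes per pair
dim(sub)
```

The common filter returns a reduced matrix that any distance measure can then use, whereas the pairwise approach yields a distance matrix directly.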
van der Maaten, L.J.P. (2014). Accelerating t-SNE using Tree-Based Algorithms. Journal of Machine Learning Research, 15: 3221-3245.
van der Maaten, L.J.P. & Hinton, G.E. (2008). Visualizing High-Dimensional Data Using t-SNE. Journal of Machine Learning Research, 9: 2579-2605.
Rtsne, plot_pca, plot_mds
mat <- matrix(rnorm(1000 * 5), nrow = 1000, ncol = 5)
plot_tsne(mat)
library(DESeq2)
dds <- makeExampleDESeqDataSet()
dds <- rlog(dds)
plot_tsne(dds, group = colData(dds)$condition)