subset_clusters_data | R Documentation |
A tiny gaussian cluster nested inside a much larger one, from "How to Use t-SNE Effectively".
subset_clusters_data(n, dim = 2, big_sdev = 50)
n | Number of points per gaussian. |
dim | Dimension of the gaussians. |
big_sdev | Standard deviation of the bigger cluster, default 50. The smaller cluster has a standard deviation of 1. |
Creates a dataset consisting of two gaussians with the same center: the
first cluster has a standard deviation of 1, and the second has a
standard deviation of big_sdev (default 50). Points are colored
according to the cluster they belong to (the small cluster is dark
powder blue, the large one is light orange).
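The construction described above can be sketched in base R. This is a minimal illustration, not the package's actual source: the function name sketch_subset_clusters is hypothetical, the shared center is assumed to be the origin, the smaller cluster is assumed to come first in the rows, and the color names are placeholders for whatever values the package really uses.

```r
# Hypothetical re-implementation for illustration only.
sketch_subset_clusters <- function(n, dim = 2, big_sdev = 50) {
  # Small cluster: n points from an isotropic gaussian with sd 1.
  # Centering at the origin is an assumption; the docs only say "same center".
  small <- matrix(stats::rnorm(n * dim, mean = 0, sd = 1), nrow = n, ncol = dim)
  # Big cluster: same center, standard deviation big_sdev.
  big <- matrix(stats::rnorm(n * dim, mean = 0, sd = big_sdev), nrow = n, ncol = dim)

  # Stack the clusters (small first, by assumption) and name columns X1 ... Xdim.
  df <- as.data.frame(rbind(small, big))
  colnames(df) <- paste0("X", seq_len(dim))

  # Placeholder colors standing in for the package's blue and orange.
  df$color <- rep(c("steelblue", "orange"), each = n)
  df
}

set.seed(42)
sketch <- sketch_subset_clusters(n = 50, dim = 3, big_sdev = 50)
str(sketch)
```

Because n is the number of points per gaussian, the sketch returns 2 * n rows in total, with dim coordinate columns plus the color column.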
Data frame with coordinates in the X1, X2 ... Xdim columns, and color in
the color column.
http://distill.pub/2016/misread-tsne/
Other distill functions: circle_data(), cube_data(), gaussian_data(),
grid_data(), link_data(), long_cluster_data(), long_gaussian_data(),
ortho_curve(), random_circle_cluster_data(), random_circle_data(),
random_jump(), random_walk(), simplex_data(), three_clusters_data(),
trefoil_data(), two_clusters_data(), two_different_clusters_data(),
unlink_data()
df <- subset_clusters_data(n = 50, dim = 2)
# 10D example where the big cluster is only twice the standard deviation of
# the small cluster
df <- subset_clusters_data(n = 50, dim = 10, big_sdev = 2)