two_clusters_data: Two Equal Size Clusters

View source: R/misread-tsne.R

two_clusters_data R Documentation

Two Equal Size Clusters

Description

Two Gaussians of equal size and bandwidth, from "How to Use t-SNE Effectively".

Usage

two_clusters_data(n, dim = 50)

Arguments

n

Number of points per gaussian.

dim

Dimension of the Gaussians. You may pass a vector of length 2 to create clusters of different dimensionalities, with the lower-dimensional cluster having zeros in the extra dimensions.
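One way to picture the length-2 case: each cluster is sampled in its own dimensionality, and the lower-dimensional one is padded with zero columns so that both fit in the same table. The base R sketch below illustrates that idea only. The helper pad_with_zeros is hypothetical and is not a snedata function, and the package's internal approach may differ.

```r
# Hypothetical helper (not part of snedata): pad a matrix with zero
# columns until it has `target_dim` columns.
pad_with_zeros <- function(m, target_dim) {
  extra <- target_dim - ncol(m)
  if (extra <= 0) return(m)
  cbind(m, matrix(0, nrow = nrow(m), ncol = extra))
}

small <- matrix(rnorm(10 * 3), nrow = 10)  # 10 points from a 3D Gaussian
large <- matrix(rnorm(10 * 4), nrow = 10)  # 10 points from a 4D Gaussian

# Stack both clusters; the 3D points get a fourth coordinate of zero
both <- rbind(pad_with_zeros(small, 4), large)
```

Here the first 10 rows of both have 0 in the fourth column, while the last 10 rows vary in all four columns.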

Details

Creates a dataset consisting of two symmetric Gaussian distributions, each with the same number of points and standard deviation 1, separated by a distance of 10 units. Points are colored according to the cluster they belong to.
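As a rough illustration of the construction described above, the following base R sketch draws two standard-normal clouds and shifts the second by 10 units. The helper name sketch_two_clusters, the choice to shift along the first axis, and the color values are assumptions made for illustration; they are not taken from the snedata source.

```r
# Hypothetical helper (not part of snedata): two unit-variance Gaussian
# clouds of n points each in `dim` dimensions.
sketch_two_clusters <- function(n, dim = 50, separation = 10) {
  # First cluster: n points centered at the origin
  a <- matrix(rnorm(n * dim), nrow = n, ncol = dim)
  # Second cluster: same distribution, shifted along the first axis
  # (the shift direction is an assumption)
  b <- matrix(rnorm(n * dim), nrow = n, ncol = dim)
  b[, 1] <- b[, 1] + separation
  coords <- rbind(a, b)
  colnames(coords) <- paste0("X", seq_len(dim))
  # Placeholder colors, one per cluster
  data.frame(coords,
             color = rep(c("blue", "orange"), each = n),
             stringsAsFactors = FALSE)
}

set.seed(42)
sketch <- sketch_two_clusters(n = 100, dim = 5)
head(sketch)
```

The result mirrors the layout described under Value: one row per point, coordinate columns named X1 through X5, and a color column identifying the cluster.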

Value

Data frame with coordinates in the X1, X2, ..., Xdim columns, and color in the color column.

References

http://distill.pub/2016/misread-tsne/

See Also

Other distill functions: circle_data(), cube_data(), gaussian_data(), grid_data(), link_data(), long_cluster_data(), long_gaussian_data(), ortho_curve(), random_circle_cluster_data(), random_circle_data(), random_jump(), random_walk(), simplex_data(), subset_clusters_data(), three_clusters_data(), trefoil_data(), two_different_clusters_data(), unlink_data()

Examples

# two clusters with 50 members each, both sampled from a 2D gaussian
df <- two_clusters_data(n = 50, dim = 2)
# two clusters with 10 members each: the first 10 are sampled from a 3D gaussian,
# the second 10 from a 4D gaussian
df <- two_clusters_data(n = 10, dim = c(3, 4))

jlmelville/snedata documentation built on March 5, 2025, 12:22 p.m.