two_different_clusters_data: Two Gaussian Clusters With Unequal Standard Deviations

View source: R/misread-tsne.R

two_different_clusters_data    R Documentation

Two Gaussian Clusters With Unequal Standard Deviations

Description

Two Gaussian clusters with equal numbers of points but unequal standard deviations, from "How to Use t-SNE Effectively".

Usage

two_different_clusters_data(n, dim = 50, scale = 10)

Arguments

n

Number of points per Gaussian.

dim

Dimension of the Gaussians.

scale

Factor by which the standard deviation of the second cluster is reduced relative to the first.

Details

Creates a dataset consisting of two symmetric Gaussian distributions with equal numbers of points but different standard deviations: the standard deviation of the second cluster is 1/scale times that of the first. Clusters are separated by 20 units. Points are colored according to the cluster they belong to.
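
The construction described above can be sketched in base R. This is an illustrative sketch only, not the package source: the function name sketch_two_different_clusters is hypothetical, and it assumes that the first cluster has unit standard deviation, that the 20-unit separation is applied along the first coordinate, and that the two color values are arbitrary placeholders.

```r
# Hypothetical sketch of the construction; not the snedata implementation.
sketch_two_different_clusters <- function(n, dim = 50, scale = 10) {
  # First cluster: assumed to be standard normal in every dimension.
  first <- matrix(rnorm(n * dim), nrow = n, ncol = dim)

  # Second cluster: standard deviation reduced to 1 / scale of the first.
  second <- matrix(rnorm(n * dim, sd = 1 / scale), nrow = n, ncol = dim)

  # Assumption: the clusters are separated by 20 units along the first axis.
  second[, 1] <- second[, 1] + 20

  coords <- rbind(first, second)
  colnames(coords) <- paste0("X", seq_len(dim))
  df <- data.frame(coords)

  # Assumption: placeholder colors, one per cluster.
  df$color <- rep(c("#003399", "#FF9900"), each = n)
  df
}

set.seed(42)
sketch <- sketch_two_different_clusters(n = 200, dim = 2, scale = 5)

# Per-cluster standard deviation of the first coordinate.
sd_first <- sd(sketch$X1[sketch$color == "#003399"])
sd_second <- sd(sketch$X1[sketch$color == "#FF9900"])
```

With scale = 5, sd_second should come out close to one fifth of sd_first, which is the contrast in cluster spread that the dataset is meant to exhibit.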

Value

Data frame with coordinates in the columns X1, X2, ..., Xdim, and the cluster color in the color column.

References

http://distill.pub/2016/misread-tsne/

See Also

Other distill functions: circle_data(), cube_data(), gaussian_data(), grid_data(), link_data(), long_cluster_data(), long_gaussian_data(), ortho_curve(), random_circle_cluster_data(), random_circle_data(), random_jump(), random_walk(), simplex_data(), subset_clusters_data(), three_clusters_data(), trefoil_data(), two_clusters_data(), unlink_data()

Examples

df <- two_different_clusters_data(n = 50, dim = 2, scale = 5)

jlmelville/snedata documentation built on Jan. 13, 2024, 2:06 a.m.