tSNE_df | R Documentation |
tSNE_df
makes use of Rtsne::Rtsne
, which is a wrapper for the C++ implementation of Barnes-Hut t-Distributed
Stochastic Neighbor Embedding. tSNE is a method for constructing a low dimensional embedding of high-dimensional data, distances, or
similarities. Exact t-SNE can be computed by setting theta = 0.0
.
tSNE_df( data, dims = 2, initial_dims = 50, perplexity = 3, theta = 0.5, check_duplicates = TRUE, pca = TRUE, partial_pca = FALSE, max_iter = 1000, verbose = FALSE, is_distance = FALSE, Y_init = NULL, pca_center = TRUE, pca_scale = FALSE, normalize = TRUE, stop_lying_iter = ifelse(is.null(Y_init), 250L, 0L), mom_switch_iter = ifelse(is.null(Y_init), 250L, 0L), momentum = 0.5, final_momentum = 0.8, eta = 200, exaggeration_factor = 12, num_threads = 1 )
data |
A data frame object or matrix. |
dims |
integer; Output dimensionality (default: 2) |
initial_dims |
integer; the number of dimensions that should be retained in the initial PCA step (default: 50) |
perplexity |
numeric; Perplexity parameter (should not be bigger than 3 * perplexity < nrow(X) - 1, see details for interpretation) |
theta |
numeric; Speed/accuracy trade-off (increase for less accuracy), set to 0.0 for exact TSNE (default: 0.5) |
check_duplicates |
logical; Checks whether duplicates are present. It is best to make sure there are no duplicates present and set this option to FALSE, especially for large datasets (default: TRUE) |
pca |
logical; Whether an initial PCA step should be performed (default: TRUE) |
partial_pca |
logical; Whether truncated PCA should be used to calculate principal components (requires the irlba package). This is faster for large input matrices (default: FALSE) |
max_iter |
integer; Number of iterations (default: 1000) |
verbose |
logical; Whether progress updates should be printed (default: global "verbose" option, or FALSE if that is not set) |
is_distance |
logical; Indicate whether X is a distance matrix (default: FALSE) |
Y_init |
matrix; Initial locations of the objects. If NULL, random initialization will be used (default: NULL). Note that when using this, the initial stage with exaggerated perplexity values and a larger momentum term will be skipped. |
pca_center |
logical; Should data be centered before pca is applied? (default: TRUE) |
pca_scale |
logical; Should data be scaled before pca is applied? (default: FALSE) |
normalize |
logical; Should data be normalized internally prior to distance calculations with |
stop_lying_iter |
integer; Iteration after which the perplexities are no longer exaggerated (default: 250, except when Y_init is used, then 0) |
mom_switch_iter |
integer; Iteration after which the final momentum is used (default: 250, except when Y_init is used, then 0) |
momentum |
numeric; Momentum used in the first part of the optimization (default: 0.5) |
final_momentum |
numeric; Momentum used in the final part of the optimization (default: 0.8) |
eta |
numeric; Learning rate (default: 200.0) |
exaggeration_factor |
numeric; Exaggeration factor used to multiply the P matrix in the first part of the optimization (default: 12.0) |
num_threads |
integer; Number of threads to use when using OpenMP, default is 1. Setting to 0 corresponds to detecting and using all available cores |
index |
integer matrix; Each row contains the identity of the nearest neighbors for each observation |
distance |
numeric matrix; Each row contains the distance to the nearest neighbors in |
D. Schmitz
Krijthe, J. H. (2015). Rtsne: T-Distributed Stochastic Neighbor Embedding using a Barnes-Hut Implementation, URL: https://github.com/jkrijthe/Rtsne
tSNE_df(gdsm_df)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.