embedder: Create an Embedding Method

View source: R/sneer.R

Create an Embedding Method

Description

Creates an embedding method to be used in the sneer function, allowing arbitrary combinations of cost function, kernel and normalization schemes. Several embedding methods from the literature (e.g. SNE, t-SNE, JSE, NeRV) can be created.

Usage

embedder(cost, kernel, transform = "square", kappa = 0.5, lambda = 0.5,
  beta = 1, alpha = 0, dof = 10, norm = "joint",
  importance_weight = FALSE, verbose = TRUE)

Arguments

cost

The cost function to optimize. See 'Details'. Can be abbreviated.

kernel

The function used to convert squared distances to weights. See 'Details'. Can be abbreviated.

transform

Transformation to apply to distances before applying kernel. See 'Details'. Can be abbreviated.

kappa

Controls the weighting of the "js" cost function. Must take a value between 0 (where it behaves like "KL") and 1 (where it behaves like "revKL").

lambda

Controls the weighting of the "nerv" cost function. Must take a value between 0 (where it behaves like "revKL") and 1 (where it behaves like "KL").

beta

Precision (narrowness) of the "exponential" and "heavy-tailed" kernels.

alpha

Heavy-tailedness of the "heavy-tailed" kernel. A value of 0 makes the kernel behave like "exponential", and a value of 1 makes it behave like "t-distributed".

dof

Degrees of freedom of the "inhomogeneous" kernel. A value of 1 makes the kernel behave like "t-distributed", and a value approaching infinity behaves like "exponential".

norm

Weight normalization to carry out. See 'Details'. Can be abbreviated.

importance_weight

If TRUE, modify the embedder to use the importance weighting method (Yang et al. 2014).

verbose

If TRUE, log information about the embedding method to the console.

Details

The cost parameter is the cost function to minimize, one of the following (see the sketch after this list):

  • "KL". Kullback-Leibler divergence, as used in the asymmetric Stochastic Neighbor Embedding (SNE) method (Hinton and Roweis, 2002) and Symmetric Stochastic Neighbor Embedding (SSNE) method (Cook et al., 2007), and t-distributed SNE (van der Maaten and Hinton,, 2008).

  • "revKL". Kullback-Leibler divergence, with the output probability as the reference distribution. Part of the cost function used in the Neighbor Retrieval Visualizer (NeRV) method (Venna et al., 2010).

  • "nerv". Cost function used in the (NeRV) method (Venna et al., 2010).

  • "JS". Jensen-Shannon divergence, as used in the Jensen-Shannon Embedding (JSE) method (Lee et al., 2013).

The transform parameter specifies a transformation applied to the output distances before the kernel is applied. One of:

  • "none". No transformation. As used in distance-based embeddings such as metric MDS.

  • "square". Square the distances. As used in probablity-based embeddings (e.g. t-SNE).

The kernel is the function used to convert the transformed output distances into weights. It must be one of the following (see the sketch after this list):

  • "exponential". Exponential function as used in the asymmetric Stochastic Neighbor Embedding (SNE) method (Hinton and Roweis, 2002) and Symmetric Stochastic Neighbor Embedding (SSNE) method (Cook et al., 2007).

  • "t-distributed". The t-distribution with one degree of freedom, as used in t-distributed SNE (van der Maaten and Hinton,, 2008).

  • "heavy-tailed". Heavy-tailedness function used in Heavy-tailed SSNE (Zhang et al. 2009).

  • "inhomogeneous". The function used in inhomogeneous t-SNE (Kitazono et al. 2016).

The norm determines how weights are converted to probabilities. It must be one of the following (a sketch of these schemes appears below):

  • "none". No normalization, as used in metric MDS.

  • "point". Point-wise normalization, as used in asymmetric SNE, NeRV and JSE.

  • "pair". Pair-wise normalization.

  • "joint". Pair-wise normalization, plus enforcing the probabilities to be joint by averaging, as used in symmetric SNE and t-distributed SNE. Output probabilities will only be averaged if the kernel has non-uniform parameters.

You may also specify a vector of length 2, where the first element is the input normalization and the second is the output normalization. This should only be used to mix the "pair" and "joint" normalization schemes.
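
The following minimal R sketch (illustrative only, not sneer's internal code) shows what the "point", "pair" and "joint" schemes compute for a weight matrix W:

set.seed(42)
W <- matrix(runif(16), nrow = 4)
diag(W) <- 0                          # self-weights are ignored

P_point <- W / rowSums(W)             # "point": each row sums to 1
P_pair  <- W / sum(W)                 # "pair": the whole matrix sums to 1
P_joint <- (P_pair + t(P_pair)) / 2   # "joint": symmetrize by averaging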

Value

An embedding method, to be passed as an argument to the method parameter of sneer.

References

Cook, J., Sutskever, I., Mnih, A., & Hinton, G. E. (2007). Visualizing similarity data with a mixture of maps. In International Conference on Artificial Intelligence and Statistics (pp. 67-74).

Hinton, G. E., & Roweis, S. T. (2002). Stochastic neighbor embedding. In Advances in neural information processing systems (pp. 833-840).

Kitazono, J., Grozavu, N., Rogovschi, N., Omori, T., & Ozawa, S. (2016, October). t-Distributed Stochastic Neighbor Embedding with Inhomogeneous Degrees of Freedom. In International Conference on Neural Information Processing (ICONIP 2016) (pp. 119-128). Springer International Publishing.

Lee, J. A., Renard, E., Bernard, G., Dupont, P., & Verleysen, M. (2013). Type 1 and 2 mixtures of Kullback-Leibler divergences as cost functions in dimensionality reduction based on similarity preservation. Neurocomputing, 112, 92-108.

Van der Maaten, L., & Hinton, G. (2008). Visualizing data using t-SNE. Journal of Machine Learning Research, 9, 2579-2605.

Venna, J., Peltonen, J., Nybo, K., Aidos, H., & Kaski, S. (2010). Information retrieval perspective to nonlinear dimensionality reduction for data visualization. Journal of Machine Learning Research, 11, 451-490.

Yang, Z., King, I., Xu, Z., & Oja, E. (2009). Heavy-tailed symmetric stochastic neighbor embedding. In Advances in neural information processing systems (pp. 2169-2177).

Yang, Z., Peltonen, J., & Kaski, S. (2014). Optimization equivalence of divergences improves neighbor embedding. In Proceedings of the 31st International Conference on Machine Learning (ICML-14) (pp. 460-468).

See Also

For embedding methods from the literature, sneer will generate the method for you if you pass its name (e.g. method = "tsne"). This function is only needed when experimenting with non-standard combinations.

Examples

# t-SNE
embedder(cost = "kl", kernel = "t-dist", norm = "joint")

# NeRV
embedder(cost = "nerv", kernel = "exp", norm = "point")

# JSE
embedder(cost = "JS", kernel = "exp", norm = "point")

# weighted SSNE
embedder(cost = "kl", kernel = "exp", norm = "joint", importance_weight = TRUE)

# SSNE where the input probabilities are averaged, but output probabilities
# are not. This only has an effect if the kernel parameters are set to be
# non-uniform.
embedder(cost = "kl", kernel = "exp", norm = c("joint", "pair"))

# MDS
embedder(cost = "square", transform = "none", kernel = "none", norm = "none")

# un-normalized version of t-SNE
embedder(cost = "kl", kernel = "t-dist", norm = "none")

## Not run: 
# Pass result of calling embedder to the sneer function's method parameter
sneer(iris, method = embedder(cost = "kl", kernel = "t-dist", norm = "joint"))

## End(Not run)
