optimize_graph_layout | R Documentation |
Carry out dimensionality reduction on an input graph, where the distances in
the low dimensional space attempt to reproduce the neighbor relations in the
input data. By default, the cost function used to optimize the output
coordinates use the Uniform Manifold Approximation and Projection (UMAP)
method (McInnes et al., 2018), but the approach from LargeVis (Tang et al.,
2016) can also be used. This function can be used to produce a low
dimensional representation of the graph produced by
similarity_graph
.
optimize_graph_layout(
graph,
X = NULL,
n_components = 2,
n_epochs = NULL,
learning_rate = 1,
init = "spectral",
init_sdev = NULL,
spread = 1,
min_dist = 0.01,
repulsion_strength = 1,
negative_sample_rate = 5,
a = NULL,
b = NULL,
method = "umap",
approx_pow = FALSE,
pcg_rand = TRUE,
fast_sgd = FALSE,
n_sgd_threads = 0,
grain_size = 1,
verbose = getOption("verbose", TRUE),
batch = FALSE,
opt_args = NULL,
epoch_callback = NULL,
pca_method = NULL,
binary_edge_weights = FALSE
)
graph |
A sparse, symmetric N x N weighted adjacency matrix representing a graph. Non-zero entries indicate an edge between two nodes with a given edge weight. There can be a varying number of non-zero entries in each row/column. |
X |
Optional input data. Used only for PCA-based initialization. |
n_components |
The dimension of the space to embed into. This defaults
to |
n_epochs |
Number of epochs to use during the optimization of the
embedded coordinates. By default, this value is set to |
learning_rate |
Initial learning rate used in optimization of the coordinates. |
init |
Type of initialization for the coordinates. Options are:
For spectral initializations, ( |
init_sdev |
If non- |
spread |
The effective scale of embedded points. In combination with
|
min_dist |
The effective minimum distance between embedded points.
Smaller values will result in a more clustered/clumped embedding where
nearby points on the manifold are drawn closer together, while larger
values will result on a more even dispersal of points. The value should be
set relative to the |
repulsion_strength |
Weighting applied to negative samples in low dimensional embedding optimization. Values higher than one will result in greater weight being given to negative samples. |
negative_sample_rate |
The number of negative edge/1-simplex samples to use per positive edge/1-simplex sample in optimizing the low dimensional embedding. |
a |
More specific parameters controlling the embedding. If |
b |
More specific parameters controlling the embedding. If |
method |
Cost function to optimize. One of:
|
approx_pow |
If |
pcg_rand |
If |
fast_sgd |
If |
n_sgd_threads |
Number of threads to use during stochastic gradient
descent. If set to > 1, then be aware that if |
grain_size |
The minimum amount of work to do on each thread. If this
value is set high enough, then less than |
verbose |
If |
batch |
If |
opt_args |
A list of optimizer parameters, used when
|
epoch_callback |
A function which will be invoked at the end of every
epoch. Its signature should be:
|
pca_method |
Method to carry out any PCA dimensionality reduction when
the
|
binary_edge_weights |
If |
A matrix of optimized coordinates.
Kingma, D. P., & Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980. https://arxiv.org/abs/1412.6980
McInnes, L., Healy, J., & Melville, J. (2018). UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction arXiv preprint arXiv:1802.03426. https://arxiv.org/abs/1802.03426
O’Neill, M. E. (2014). PCG: A family of simple fast space-efficient statistically good algorithms for random number generation (Report No. HMC-CS-2014-0905). Harvey Mudd College.
Tang, J., Liu, J., Zhang, M., & Mei, Q. (2016, April). Visualizing large-scale and high-dimensional data. In Proceedings of the 25th International Conference on World Wide Web (pp. 287-297). International World Wide Web Conferences Steering Committee. https://arxiv.org/abs/1602.00370
iris30 <- iris[c(1:10, 51:60, 101:110), ]
# return a 30 x 30 sparse matrix with similarity data based on 10 nearest
# neighbors per item
iris30_sim_graph <- similarity_graph(iris30, n_neighbors = 10)
# produce 2D coordinates replicating the neighbor relations in the similarity
# graph
set.seed(42)
iris30_opt <- optimize_graph_layout(iris30_sim_graph, X = iris30)
# the above two steps are the same as:
# set.seed(42); iris_umap <- umap(iris30, n_neighbors = 10)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.