hnsw_build: Build an hnswlib nearest neighbor index


View source: R/hnsw.R

Description

Build an hnswlib nearest neighbor index

Usage

hnsw_build(
  X,
  distance = "euclidean",
  M = 16,
  ef = 200,
  verbose = FALSE,
  progress = "bar",
  n_threads = 0,
  grain_size = 1
)

Arguments

X

A numeric matrix of data to add. Each of the n rows is an item in the index.

distance

Type of distance to calculate. One of:

  • "l2" Squared L2, i.e. squared Euclidean.

  • "euclidean" Euclidean.

  • "cosine" Cosine.

  • "ip" Inner product: 1 - sum(ai * bi), i.e. the cosine distance where the vectors are not normalized. This can lead to negative distances and other non-metric behavior.
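To make the four options concrete, the following base R sketch computes each distance for a single pair of vectors. It assumes the usual textbook definitions and does not call hnswlib; the vectors a and b and the d_* variable names are made up for illustration.

```r
# Two small example vectors (made up for illustration).
a <- c(1, 2, 3)
b <- c(2, 0, 1)

# "l2": squared Euclidean distance.
d_l2 <- sum((a - b)^2)

# "euclidean": square root of the squared L2 distance.
d_euclidean <- sqrt(d_l2)

# "cosine": 1 minus the cosine similarity of the two vectors.
d_cosine <- 1 - sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))

# "ip": 1 minus the raw inner product; this can be negative.
d_ip <- 1 - sum(a * b)

print(c(l2 = d_l2, euclidean = d_euclidean, cosine = d_cosine, ip = d_ip))
```

For these vectors the "ip" value is -4, which shows the negative distances mentioned above.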

M

Controls the number of bi-directional links created for each element during index construction. Higher values lead to better results at the expense of memory consumption. Typical values are 2 - 100, but for most datasets a range of 12 - 48 is suitable. Can't be smaller than 2.

ef

Size of the dynamic list used during construction. A larger value means a better quality index, but increases build time. Should be an integer value between 1 and the size of the dataset.
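As a rough sketch of the trade-off between M and ef, the example below builds two indexes on the iris data and compares their search results. The specific values (M = 8, ef = 50 versus M = 32, ef = 150) are arbitrary choices for illustration, not recommendations, and on a dataset this small the two indexes may well return identical neighbors.

```r
library(RcppHNSW)

X <- as.matrix(iris[, -5])

# Smaller M and ef: faster build, lower memory, possibly a lower quality index.
ann_fast <- hnsw_build(X, distance = "euclidean", M = 8, ef = 50)

# Larger M and ef: slower build, more memory, usually a better index.
ann_accurate <- hnsw_build(X, distance = "euclidean", M = 32, ef = 150)

# Query both indexes for the 5 nearest neighbors of every row.
nn_fast <- hnsw_search(X, ann_fast, k = 5)
nn_accurate <- hnsw_search(X, ann_accurate, k = 5)

# Fraction of neighbor indices on which the two indexes agree.
agreement <- mean(nn_fast$idx == nn_accurate$idx)
print(agreement)
```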

verbose

If TRUE, log messages to the console.

progress

If "bar" (the default), also log a progress bar when verbose = TRUE. There is a small but noticeable overhead (a few percent of run time) to tracking progress. Set progress = NULL to turn this off. Has no effect if verbose = FALSE.

n_threads

Maximum number of threads to use. The number actually used also depends on grain_size.

grain_size

Minimum amount of work to do (rows in X to add) per thread. If the number of rows in X isn't sufficient, then fewer than n_threads will be used. This is useful in cases where the overhead of context switching with too many threads outweighs the gains due to parallelism.
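The interaction between n_threads and grain_size can be illustrated with a small base R helper. This is a simplified model of the rule described above, not the scheduling code RcppHNSW actually uses; threads_used is a hypothetical function written for this example, and it assumes n_threads is at least 1.

```r
# Hypothetical helper, not part of RcppHNSW: estimate how many threads
# receive work when each thread must get at least grain_size rows.
threads_used <- function(n_rows, n_threads, grain_size = 1) {
  stopifnot(n_rows >= 1, n_threads >= 1, grain_size >= 1)
  # Most threads that can each be given a full grain_size rows (at least 1).
  max_by_grain <- max(1, n_rows %/% grain_size)
  # Never exceed the requested maximum.
  min(n_threads, max_by_grain)
}

threads_used(150, n_threads = 4, grain_size = 1)     # 4: enough rows for every thread
threads_used(150, n_threads = 4, grain_size = 50)    # 3: only three full grains of 50 rows
threads_used(150, n_threads = 4, grain_size = 1000)  # 1: too few rows to split at all
```

Under this model, raising grain_size on a small X reduces the number of threads, which is the situation where threading overhead would otherwise dominate.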

Value

An instance of the HnswL2, HnswCosine or HnswIp class.

Examples

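library(RcppHNSW)

# Build an index from the four numeric iris columns
irism <- as.matrix(iris[, -5])
ann <- hnsw_build(irism)
# Find the 5 approximate nearest neighbors of each row
iris_nn <- hnsw_search(irism, ann, k = 5)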

RcppHNSW documentation built on Sept. 6, 2020, 9:06 a.m.