HnswParam: The HnswParam class

View source: R/HnswParam.R

HnswParamR Documentation

The HnswParam class

Description

A class to hold parameters for the HNSW algorithm for approximate nearest neighbor identification.

Usage

HnswParam(
  nlinks = 16,
  ef.construction = 200,
  ef.search = 10,
  distance = c("Euclidean", "Manhattan", "Cosine")
)

## S4 method for signature 'HnswParam'
defineBuilder(BNPARAM)

Arguments

nlinks

Integer scalar, number of bi-directional links per element for index generation.

ef.construction

Integer scalar, size of the dynamic list for index generation.

ef.search

Integer scalar, size of the dynamic list for neighbor searching.

distance

String specifying the distance metric to use. Cosine distances are implemented as Euclidean distances on L2-normalized coordinates.

BNPARAM

A HsnwParam instance.

Details

In the HNSW algorithm (Malkov and Yashunin, 2016), each point is a node in a “nagivable small world” graph. The nearest neighbor search proceeds by starting at a node and walking through the graph to obtain closer neighbors to a given query point. Nagivable small world graphs are used to maintain connectivity across the data set by creating links between distant points. This speeds up the search by ensuring that the algorithm does not need to take many small steps to move from one cluster to another. The HNSW algorithm extends this idea by using a hierarchy of such graphs containing links of different lengths, which avoids wasting time on small steps in the early stages of the search where the current node position is far from the query.

Larger values of nlinks improve accuracy at the expense of speed and memory usage. Larger values of ef.construction improve index quality at the expense of indexing time. The value of ef.search controls the accuracy of the neighbor search at run time, where larger values improve accuracy at the expense of a slower search.

Technically, the index construction algorithm is stochastic but, for various logistical reasons, the seed is hard-coded into the C++ code. This means that the results of the HNSW neighbor searches will be fully deterministic for the same inputs, even though the theory provides no such guarantees.

Value

The HnswParam constructor returns an instance of the HnswParam class.

The defineBuilder method returns an external pointer that can be used in buildIndex to construct a HNSW index.

Author(s)

Aaron Lun

See Also

BiocNeighborParam, for the parent class and its available methods.

https://github.com/nmslib/hnswlib, for details on the underlying algorithm.

Examples

(out <- HnswParam())
out[['nlinks']]

out[['nlinks']] <- 20L
out


LTLA/kmknn documentation built on Oct. 18, 2024, 9:01 p.m.