Build an HNSW index and save it to file in preparation for a nearest-neighbors search.
X | A numeric matrix where rows correspond to data points and columns correspond to variables (i.e., dimensions).
transposed | Logical scalar indicating whether X is transposed, i.e., rows are variables and columns are data points.
nlinks | Integer scalar specifying the number of bi-directional links for each element.
ef.construction | Integer scalar specifying the size of the dynamic list during index construction.
directory | String containing the path to the directory in which to save the index file.
ef.search | Integer scalar specifying the size of the dynamic list to use during neighbor searching.
fname | String containing the path to the index file.
distance | String specifying the type of distance to use.
This function is automatically called by findHnsw and related functions.
However, it can be called directly by the user to save time if multiple queries are to be performed on the same X.
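As a minimal sketch of this reuse pattern, the index can be built once and then handed to several searches. This assumes a BiocNeighbors release that still exports buildHnsw, and that findHnsw and queryHnsw accept the prebuilt index through a `precomputed` argument (an assumption, as that argument is not described on this page). The data are simulated purely for illustration.

```r
library(BiocNeighbors)

# Simulated data: 5000 points and 100 query points in 20 dimensions.
set.seed(100)
Y <- matrix(rnorm(100000), ncol=20)
Z <- matrix(rnorm(2000), ncol=20)

# Build the index once.
idx <- buildHnsw(Y)

# Reuse it for several searches instead of rebuilding each time.
# 'precomputed' is assumed to be the argument that accepts the index.
self.nn <- findHnsw(Y, k=5, precomputed=idx)
query.nn <- queryHnsw(Y, Z, k=5, precomputed=idx)

dim(self.nn$index)
dim(query.nn$index)
```

Both searches read the same index file, so the cost of index construction is paid only once.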
It is advisable to change directory to a location that is amenable to parallel read operations on HPC file systems.
Of course, if index files are manually constructed, the user is also responsible for their clean-up after all calculations are completed.
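The following sketch shows one way to place the index in a chosen directory and remove it afterwards. The directory name is hypothetical; on a cluster it would more likely be a scratch location on a parallel file system than a subdirectory of tempdir(). It assumes a BiocNeighbors release that still exports buildHnsw.

```r
library(BiocNeighbors)

set.seed(101)
Y <- matrix(rnorm(20000), ncol=20)

# Hypothetical directory for index files (created if it does not exist).
index.dir <- file.path(tempdir(), "hnsw-indices")
dir.create(index.dir, showWarnings=FALSE)

# The index file is written somewhere inside 'index.dir'.
idx <- buildHnsw(Y, directory=index.dir)
n.files <- length(list.files(index.dir))
n.files

# ... run neighbor searches that use 'idx' here ...

# Clean up the manually constructed index once all calculations are done.
unlink(index.dir, recursive=TRUE)
```

Removing the whole directory avoids having to track individual index file names, at the cost of deleting anything else stored there.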
Larger values of nlinks improve accuracy at the expense of speed and memory usage.
Larger values of ef.construction improve index quality at the expense of indexing time.
The value of ef.search controls the accuracy of the neighbor search at run time (i.e., not during the indexing itself).
Larger values improve accuracy at the expense of a slower search.
Note that this is always lower-bounded at k, the number of nearest neighbors to identify.
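To illustrate these trade-offs, the sketch below builds an index with larger-than-default settings and compares the approximate neighbors against an exact search. The specific values (32, 400, 50) are arbitrary choices for illustration, not recommendations. It assumes a BiocNeighbors release that exports buildHnsw, findHnsw and findKmknn, and that findHnsw accepts the index via a `precomputed` argument.

```r
library(BiocNeighbors)

set.seed(102)
Y <- matrix(rnorm(20000), ncol=20)

# More links and larger dynamic lists: slower and more memory-hungry,
# but expected to give a more accurate approximate search.
idx <- buildHnsw(Y, nlinks=32, ef.construction=400, ef.search=50)
approx.nn <- findHnsw(Y, k=10, precomputed=idx)

# Exact neighbors for comparison.
exact.nn <- findKmknn(Y, k=10)

# Proportion of neighbor identities that agree position by position.
agreement <- mean(approx.nn$index == exact.nn$index)
agreement
```

Repeating this with smaller values of nlinks or ef.search gives a rough sense of how much accuracy is traded for speed on a given data set.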
Technically, the index construction algorithm is stochastic but, for various logistical reasons, the seed is hard-coded into the C++ code. This means that the results of the HNSW neighbor searches will be fully deterministic for the same inputs, even though the theory provides no such guarantees.
A HnswIndex object containing:
path, a string containing the path to the index file.
data, a numeric matrix equivalent to t(X).
NAMES, a character vector or NULL equal to rownames(X).
distance, a string specifying the distance metric used.
Aaron Lun
See HnswIndex for details on the output class.
See findHnsw and queryHnsw for dependent functions.