NMSlib: Non metric space library

Description

Non metric space library

Usage

# init <- NMSlib$new(input_data, Index_Params = NULL, Time_Params = NULL,
#                           space='l1', space_params = NULL, method = 'hnsw',
#                           data_type = 'DENSE_VECTOR', dtype = 'FLOAT',
#                           index_filepath = NULL, print_progress = FALSE)

Details

input_data parameter: for numeric data, input_data should be either an R matrix object or a scipy sparse matrix. It can also be a list of several matrices / sparse matrices that all have the same number of columns (useful, for instance, if the user wants to include both a train and a test dataset in the created index).
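The list form of input_data described above can be sketched as follows. This is a minimal illustration, not taken from the package documentation: the names x_train and x_test and their dimensions are made up for the example, and the guard simply skips the nmslibR calls when the package or the Python module is not installed.

```r
# Hypothetical train / test matrices (names and sizes are illustrative only)
set.seed(1)
x_train <- matrix(runif(800), nrow = 80, ncol = 10)
x_test  <- matrix(runif(200), nrow = 20, ncol = 10)

# All list elements must have the same number of columns
stopifnot(ncol(x_train) == ncol(x_test))

if (requireNamespace("nmslibR", quietly = TRUE) &&
    reticulate::py_available() &&
    reticulate::py_module_available("nmslib")) {

  # Build a single index over both datasets by passing them as a list
  init_nms <- nmslibR::NMSlib$new(input_data = list(x_train, x_test))

  # Query the combined index with the test rows
  res <- init_nms$knn_Query_Batch(x_test, k = 5, num_threads = 1)
}
```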

The Knn_Query function finds the approximate k nearest neighbours of a vector in the index.

The knn_Query_Batch function performs multiple queries on the index, distributing the work over a thread pool.

The save_Index function saves the index to disk.

If the index_filepath parameter is not NULL, an existing index is loaded from that file.
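A possible save-and-reload round trip is sketched below. It relies only on save_Index() and the index_filepath argument documented here. Passing the same input_data again when reloading is an assumption of this sketch (the constructor has no default for input_data), and the temporary file name is arbitrary.

```r
set.seed(1)
x <- matrix(runif(1000), nrow = 100, ncol = 10)

# Arbitrary temporary location for the saved index (illustrative only)
idx_file <- tempfile(fileext = ".bin")

if (requireNamespace("nmslibR", quietly = TRUE) &&
    reticulate::py_available() &&
    reticulate::py_module_available("nmslib")) {

  # Create the index and write it to disk
  init_nms <- nmslibR::NMSlib$new(input_data = x)
  init_nms$save_Index(idx_file)

  # Assumption: the same data is supplied again when loading a saved index
  reloaded <- nmslibR::NMSlib$new(input_data = x, index_filepath = idx_file)
  reloaded$Knn_Query(query_data_row = x[1, ], k = 5)
}
```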

Methods

NMSlib$new(input_data, Index_Params = NULL, Time_Params = NULL, space='l1', space_params = NULL, method = 'hnsw', data_type = 'DENSE_VECTOR', dtype = 'FLOAT', index_filepath = NULL, print_progress = FALSE)
--------------
Knn_Query(query_data_row, k = 5)
--------------
knn_Query_Batch(query_data, k = 5, num_threads = 1)
--------------
save_Index(filename)

Methods

Public methods


Method new()

Usage
NMSlib$new(
  input_data,
  Index_Params = NULL,
  Time_Params = NULL,
  space = "l1",
  space_params = NULL,
  method = "hnsw",
  data_type = "DENSE_VECTOR",
  dtype = "FLOAT",
  index_filepath = NULL,
  print_progress = FALSE
)
Arguments
input_data

the input data. See details for more information

Index_Params

a list of (optional) parameters to use in indexing (when creating the index)

Time_Params

a list of (optional) parameters to use in querying. Setting Time_Params to NULL will reset the query-time parameters

space

a character string (optional). The metric space to create for this index. Page 31 of the manual (see references) explains all available inputs

space_params

a list of (optional) parameters for configuring the space. See the references manual for more details.

method

a character string specifying the index method to use

data_type

a character string. One of 'DENSE_UINT8_VECTOR', 'DENSE_VECTOR', 'OBJECT_AS_STRING' or 'SPARSE_VECTOR'

dtype

a character string. Either 'FLOAT' or 'INT'

index_filepath

a character string specifying the path to a file, where an existing index is saved

print_progress

a boolean (either TRUE or FALSE). Whether or not to display a progress bar
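As a rough illustration of how the list arguments might be filled in, the sketch below passes index-time and query-time parameters for the 'hnsw' method. The parameter names M, efConstruction and efSearch come from the NMSLIB hnsw method rather than from this page, and the values are arbitrary. Treat both as assumptions and check the manual in the references before relying on them.

```r
set.seed(1)
x <- matrix(runif(1000), nrow = 100, ncol = 10)

# Assumed hnsw parameter names (from the NMSLIB manual, not from this page);
# integer literals (L suffix) keep the values integral when passed to Python
index_params <- list(M = 16L, efConstruction = 200L)  # used when building
time_params  <- list(efSearch = 100L)                 # used when querying

if (requireNamespace("nmslibR", quietly = TRUE) &&
    reticulate::py_available() &&
    reticulate::py_module_available("nmslib")) {

  init_nms <- nmslibR::NMSlib$new(input_data = x,
                                  Index_Params = index_params,
                                  Time_Params = time_params,
                                  space = "l1",
                                  method = "hnsw")

  init_nms$Knn_Query(query_data_row = x[1, ], k = 5)
}
```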


Method Knn_Query()

Usage
NMSlib$Knn_Query(query_data_row, k = 5)
Arguments
query_data_row

a vector to query for

k

an integer. The number of neighbours to return
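Because Knn_Query() returns approximate neighbours, it can help to compare its output against an exact search on small data. The helper below, exact_knn_l1, is a hypothetical base R function written for this illustration (it is not part of nmslibR). It computes exact nearest neighbours under the L1 distance, which matches the default space = 'l1'. It returns 1-based row indices, and the index base used by Knn_Query() may differ.

```r
# Hypothetical helper (not part of nmslibR): exact k nearest neighbours
# of 'query' among the rows of 'data' under the L1 (Manhattan) distance
exact_knn_l1 <- function(data, query, k = 5) {
  # subtract the query from every row, then sum absolute differences per row
  d <- rowSums(abs(sweep(data, 2, query)))
  ord <- order(d)[seq_len(k)]
  list(index = ord, distance = d[ord])
}

set.seed(1)
x <- matrix(runif(1000), nrow = 100, ncol = 10)

# The query row itself comes first, at distance 0
exact <- exact_knn_l1(x, x[1, ], k = 5)
```

With the index from the Examples section available, the result of init_nms$Knn_Query(query_data_row = x[1, ], k = 5) can be compared with exact to get a feel for the recall of the approximate search.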


Method knn_Query_Batch()

Usage
NMSlib$knn_Query_Batch(query_data, k = 5, num_threads = 1)
Arguments
query_data

the queries to run against the index. query_data should be of the same type as the input_data parameter

k

an integer. The number of neighbours to return

num_threads

an integer. The number of threads to use


Method save_Index()

Usage
NMSlib$save_Index(filename)
Arguments
filename

a character string specifying the path of the file to save the index to. A saved index can later be loaded through the index_filepath parameter of the new() method


Method clone()

The objects of this class are cloneable with this method.

Usage
NMSlib$clone(deep = FALSE)
Arguments
deep

Whether to make a deep clone.

References

https://github.com/nmslib/nmslib/blob/master/manual/latex/manual.pdf

Examples

if (reticulate::py_available() && reticulate::py_module_available("nmslib")) {

  library(nmslibR)

  set.seed(1)
  x = matrix(runif(1000), nrow = 100, ncol = 10)

  init_nms = NMSlib$new(input_data = x)


  # returns a 1-dimensional vector (index, distance)
  #--------------------------------------------------

  init_nms$Knn_Query(query_data_row = x[1, ], k = 5)


  # returns knn's for all data
  #---------------------------

  all_dat = init_nms$knn_Query_Batch(x, k = 5, num_threads = 1)

}

nmslibR documentation built on March 13, 2020, 2 a.m.