lsh | R Documentation |

An implementation of approximate k-nearest-neighbor search with locality-sensitive hashing (LSH). Given a set of reference points and a set of query points, this will compute the k approximate nearest neighbors of each query point in the reference set; models can be saved for future use.

lsh( bucket_size = NA, hash_width = NA, input_model = NA, k = NA, num_probes = NA, projections = NA, query = NA, reference = NA, second_hash_size = NA, seed = NA, tables = NA, true_neighbors = NA, verbose = FALSE )

`bucket_size` |
The size of a bucket in the second level hash. Default value "500" (integer). |

`hash_width` |
The hash width for the first-level hashing in the LSH preprocessing. By default, the LSH class automatically estimates a hash width for its use. Default value "0" (numeric). |

`input_model` |
Input LSH model (LSHSearch). |

`k` |
Number of nearest neighbors to find. Default value "0" (integer). |

`num_probes` |
Number of additional probes for multiprobe LSH; if 0, traditional LSH is used. Default value "0" (integer). |

`projections` |
The number of hash functions for each tabl. Default value "10" (integer). |

`query` |
Matrix containing query points (optional) (numeric matrix). |

`reference` |
Matrix containing the reference dataset (numeric matrix). |

`second_hash_size` |
The size of the second level hash table. Default value "99901" (integer). |

`seed` |
Random seed. If 0, 'std::time(NULL)' is used. Default value "0" (integer). |

`tables` |
The number of hash tables to be used. Default value "30" (integer). |

`true_neighbors` |
Matrix of true neighbors to compute recall with (the recall is printed when -v is specified) (integer matrix). |

`verbose` |
Display informational messages and the full list of parameters and timers at the end of execution. Default value "FALSE" (logical). |

This program will calculate the k approximate-nearest-neighbors of a set of points using locality-sensitive hashing. You may specify a separate set of reference points and query points, or just a reference set which will be used as both the reference and query set.

A list with several components:

`distances` |
Matrix to output distances into (numeric matrix). |

`neighbors` |
Matrix to output neighbors into (integer matrix). |

`output_model` |
Output for trained LSH model (LSHSearch). |

mlpack developers

# For example, the following will return 5 neighbors from the data for each # point in "input" and store the distances in "distances" and the neighbors # in "neighbors": ## Not run: output <- lsh(k=5, reference=input) distances <- output$distances neighbors <- output$neighbors ## End(Not run) # The output is organized such that row i and column j in the neighbors # output corresponds to the index of the point in the reference set which is # the j'th nearest neighbor from the point in the query set with index i. # Row j and column i in the distances output file corresponds to the distance # between those two points. # # Because this is approximate-nearest-neighbors search, results may be # different from run to run. Thus, the "seed" parameter can be specified to # set the random seed. # # This program also has many other parameters to control its functionality; # see the parameter-specific documentation for more information.

Embedding an R snippet on your website

Add the following code to your website.

For more information on customizing the embed code, read Embedding Snippets.