knitr::opts_chunk$set(error=FALSE, warning=FALSE, message=FALSE, eval=FALSE)
The \pkg{Annoy} \proglang{C++} library
\citep{Github:annoy} implements a quick and simple method for approximate nearest neighbor
(oh yeah) searching. The \pkg{RcppAnnoy} package \citep{CRAN:RcppAnnoy} provides a
centralized resource for developers to use this code in their own \proglang{R} packages by
relying on \pkg{Rcpp} \citep{TAS:Rcpp,CRAN:Rcpp}. To use \pkg{Annoy} in \proglang{C++}
code, simply put the line `LinkingTo: RcppAnnoy` in your `DESCRIPTION` file
and the header files will be available for inclusion into your package's source files. Note that \pkg{Annoy} is a header-only library so no additional commands are necessary for the linker.
Obviously, the header files need to be included in any \proglang{C++} source file that
uses \pkg{Annoy}. A few macros also need to be added to handle Windows-specific behaviour
and to ensure that error messages are printed through R. Version number
comparison macros help in conditioning changes on a particular version.
Since release 0.0.17 all this is now expressed centrally in a header in the
package so users can just use this one-liner:
```{Rcpp, eval=FALSE}
#include "RcppAnnoy.h"
```

# Defining the search type

The `AnnoyIndex` template class can accommodate different data types, distance metrics, random number generators, and threading policies (the latter being a choice between sequential and multithreaded). Here, we will consider the most common application of a nearest-neighbor search on floating-point data with Euclidean distance. We `typedef` the type and realized template for convenience:

```{Rcpp, eval=FALSE}
typedef float ANNOYTYPE;
typedef Annoy::AnnoyIndex<int, ANNOYTYPE, Annoy::Euclidean, Kiss64Random, RcppAnnoyIndexThreadPolicy> MyAnnoyIndex;
```
Note that we use `float` by default, rather than the more conventional `double`.
This is chosen for speed and to be consistent with the original Python implementation.
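
One practical consequence is that double-precision input, such as the contents of an R numeric matrix, has to be narrowed to `float` before it reaches the index. The following is a minimal standalone sketch of that conversion, not code taken from the package; the helper name `to_annoy_row` is hypothetical, and only `ANNOYTYPE` comes from the typedef above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

typedef float ANNOYTYPE;  // same element type as in the typedef above

// Hypothetical helper (not part of RcppAnnoy): copy one observation of
// length ndim from a double-precision buffer into a float buffer.
std::vector<ANNOYTYPE> to_annoy_row(const double* src, std::size_t ndim) {
    assert(src != nullptr || ndim == 0);  // a non-empty row needs a valid source
    std::vector<ANNOYTYPE> row(ndim);
    // Narrow each double to float explicitly; precision beyond float is lost.
    std::transform(src, src + ndim, row.begin(),
                   [](double x) { return static_cast<ANNOYTYPE>(x); });
    return row;
}
```

In an \pkg{Rcpp} function, the resulting buffer would typically be handed to the index through a pointer such as `row.data()`, for example when calling the `add_item` method of `AnnoyIndex`.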
The \pkg{Annoy} library uses random number generation during index creation
(via the `Kiss64Random` class), with a seed that is separate from R's RNG seed.
By default, the seed is fixed and results will be "deterministic" in the sense
that repeated runs on the same data will yield the same result. They will also be
unresponsive to the state of R's RNG seed. The seed used by `AnnoyIndex` can be
specified by the `set_seed` method, which should be called before adding items
to the index.
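
The behaviour described above can be illustrated with a toy class that owns a privately seeded generator. This is only a sketch of the general idea under stated assumptions: `std::mt19937_64` stands in for `Kiss64Random`, and the class `ToySeededIndex`, its `draw` method, and the default seed value are all hypothetical and not part of \pkg{Annoy}.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical fixed default seed, chosen only for this illustration.
const std::uint64_t kToyDefaultSeed = 42;

// Toy stand-in for an index that owns its own generator.
// std::mt19937_64 replaces Kiss64Random purely for illustration.
class ToySeededIndex {
public:
    ToySeededIndex() : rng_(kToyDefaultSeed) {}

    // Reseed the private generator; meant to be called before any draws.
    void set_seed(std::uint64_t seed) { rng_.seed(seed); }

    // Draw n values, standing in for the random choices made while an
    // index is being constructed.
    std::vector<std::uint64_t> draw(std::size_t n) {
        std::vector<std::uint64_t> out(n);
        for (std::size_t i = 0; i < n; ++i) out[i] = rng_();
        return out;
    }

private:
    std::mt19937_64 rng_;  // private state, unaffected by any other generator
};
```

Two default-constructed objects produce identical draws, and the draws change only when `set_seed` is called with a different value. This mirrors why the seed should be set before any items are added.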