benchmark_annoy_vs_rcppannoy: Benchmark bigANNOY against direct RcppAnnoy

View source: R/benchmark_interface.R

benchmark_annoy_vs_rcppannoy    R Documentation

Description

Run the same Annoy build and search task through bigANNOY and through a direct dense RcppAnnoy baseline. The comparison reports both speed metrics and data-volume metrics such as reference bytes, query bytes, and generated index size.
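For orientation, a direct dense RcppAnnoy build and self-search can be sketched as follows. This is an illustration of the kind of baseline described above, not bigANNOY's actual baseline code. The Gaussian synthetic data, the problem sizes, and the variable names are assumptions made for the example.

```r
# Illustrative direct RcppAnnoy baseline; not the code bigANNOY runs internally.
if (requireNamespace("RcppAnnoy", quietly = TRUE)) {
  library(RcppAnnoy)

  set.seed(42L)
  n_ref <- 200L
  n_dim <- 20L
  k <- 10L
  # Assumed synthetic reference data: standard normal entries.
  ref <- matrix(rnorm(n_ref * n_dim), nrow = n_ref, ncol = n_dim)

  index <- new(AnnoyEuclidean, n_dim)
  index$setSeed(42L)
  for (i in seq_len(n_ref)) {
    index$addItem(i - 1L, ref[i, ])  # Annoy item ids are 0-based
  }

  # Time the build and the search separately, as a benchmark would.
  build_time <- system.time(index$build(50L))[["elapsed"]]
  search_time <- system.time(
    neighbours <- lapply(
      seq_len(n_ref),
      function(i) index$getNNsByVector(ref[i, ], k) + 1L  # back to 1-based rows
    )
  )[["elapsed"]]

  cat("build:", build_time, "s; search:", search_time, "s\n")
}
```

The dense baseline holds the whole reference matrix in memory, which is why the comparison also reports data-volume metrics alongside timings.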

Usage

benchmark_annoy_vs_rcppannoy(
  x = NULL,
  query = NULL,
  n_ref = 2000L,
  n_query = 200L,
  n_dim = 20L,
  k = 10L,
  n_trees = 50L,
  metric = "euclidean",
  search_k = -1L,
  seed = 42L,
  build_seed = seed,
  build_threads = -1L,
  block_size = annoy_default_block_size(),
  backend = getOption("bigANNOY.backend", "cpp"),
  exact = TRUE,
  filebacked = FALSE,
  path_dir = tempdir(),
  keep_files = FALSE,
  output_path = NULL,
  load_mode = "eager"
)

Arguments

x

Optional benchmark reference input. Supply NULL to generate a synthetic reference matrix, or provide a numeric matrix, big.matrix, descriptor, descriptor path, or external pointer.

query

Optional benchmark query input. Supply NULL for self-search, or provide a numeric matrix, big.matrix, descriptor, descriptor path, or external pointer.

n_ref

Number of synthetic reference rows to generate when x = NULL.

n_query

Number of synthetic query rows to generate when x = NULL and query is not NULL.

n_dim

Number of synthetic columns to generate when x = NULL.

k

Number of neighbours to return.

n_trees

Number of Annoy trees to build.

metric

Annoy metric. One of "euclidean", "angular", "manhattan", or "dot".

search_k

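Annoy search budget: the number of tree nodes inspected per query. Larger values generally improve recall at the cost of search time. In upstream Annoy, -1 selects the library default of n_trees * k.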

seed

Random seed used for synthetic data generation and, by default, for the Annoy build seed.

build_seed

Optional Annoy build seed. Defaults to seed.

build_threads

Native Annoy build-thread setting. In upstream Annoy, -1 requests all available CPU cores.

block_size

Build/search block size.

backend

Requested bigANNOY backend.

exact

Logical flag controlling whether to benchmark the exact Euclidean baseline with bigKNN when available.

filebacked

Logical flag; if TRUE, synthetic or dense reference inputs are converted into file-backed big.matrix objects before build.

path_dir

Directory where temporary Annoy and optional file-backed benchmark files should be written.

keep_files

Logical flag; if TRUE, leave the generated Annoy index on disk after the benchmark finishes.

output_path

Optional CSV path where the benchmark summary should be written.

load_mode

Index loading mode. With "lazy", the benchmarked index is returned as metadata only and loaded on first search; with "eager", it is loaded as soon as it is built.
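As a sketch of how the arguments above combine, the call below supplies user-provided dense matrices for x and query, picks a non-default metric, and writes the summary to a CSV file. The data, sizes, and variable names (ref, qry, csv_path, res_custom) are illustrative assumptions; only the argument names come from the Usage section.

```r
# Hypothetical call with user-supplied inputs; guarded so it is skipped
# when bigANNOY is not installed.
if (requireNamespace("bigANNOY", quietly = TRUE)) {
  set.seed(1L)
  ref <- matrix(rnorm(300 * 8), nrow = 300, ncol = 8)  # reference rows
  qry <- matrix(rnorm(30 * 8), nrow = 30, ncol = 8)    # query rows
  csv_path <- tempfile(fileext = ".csv")

  res_custom <- bigANNOY::benchmark_annoy_vs_rcppannoy(
    x = ref,
    query = qry,
    k = 5L,
    n_trees = 10L,
    metric = "angular",
    exact = FALSE,          # skip the exact Euclidean baseline
    output_path = csv_path  # benchmark summary written here as CSV
  )

  # Read the summary back if the file was written.
  if (file.exists(csv_path)) print(read.csv(csv_path))
}
```

Setting exact = FALSE here keeps the run independent of bigKNN, since the exact baseline is described only for the Euclidean case.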

Value

A list with a two-row summary data frame, one row for bigANNOY and one for direct RcppAnnoy, plus benchmark metadata and any validation report produced for the bigANNOY index.
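A minimal way to explore the returned list is to run a small synthetic benchmark and inspect its top-level structure. The element names of the list are not documented above, so this sketch uses str() rather than assuming any particular name. The sizes chosen are arbitrary.

```r
# Small synthetic run (x = NULL, query = NULL gives self-search on generated data).
if (requireNamespace("bigANNOY", quietly = TRUE)) {
  res <- bigANNOY::benchmark_annoy_vs_rcppannoy(
    n_ref = 500L,
    n_dim = 16L,
    k = 5L,
    n_trees = 20L,
    exact = FALSE,
    keep_files = FALSE
  )

  # Show the top-level components without assuming their names.
  str(res, max.level = 1)
}
```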


bigANNOY documentation built on April 1, 2026, 9:07 a.m.