knn: k-Nearest-Neighbors Search

knn    R Documentation

Description

An implementation of k-nearest-neighbor search using single-tree and dual-tree algorithms. Given a set of reference points and a set of query points, it finds, for each query point, the k nearest neighbors in the reference set using trees; trees that have been built can be saved for future use.

Usage

knn(
  algorithm = NA,
  epsilon = NA,
  input_model = NA,
  k = NA,
  leaf_size = NA,
  query = NA,
  random_basis = FALSE,
  reference = NA,
  rho = NA,
  seed = NA,
  tau = NA,
  tree_type = NA,
  true_distances = NA,
  true_neighbors = NA,
  verbose = FALSE
)

Arguments

algorithm

Type of neighbor search: 'naive', 'single_tree', 'dual_tree', 'greedy'. Default value "dual_tree" (character).

epsilon

If specified, performs approximate nearest neighbor search with the given relative error. Default value "0" (numeric).

input_model

Pre-trained kNN model (KNNModel).

k

Number of nearest neighbors to find. Default value "0" (integer).

leaf_size

Leaf size for tree building (used for kd-trees, vp trees, random projection trees, UB trees, R trees, R* trees, X trees, Hilbert R trees, R+ trees, R++ trees, spill trees, and octrees). Default value "20" (integer).

query

Matrix containing query points (optional) (numeric matrix).

random_basis

Before tree-building, project the data onto a random orthogonal basis. Default value "FALSE" (logical).

reference

Matrix containing the reference dataset (numeric matrix).

rho

Balance threshold (only valid for spill trees). Default value "0.7" (numeric).

seed

Random seed (if 0, std::time(NULL) is used). Default value "0" (integer).

tau

Overlapping size (only valid for spill trees). Default value "0" (numeric).

tree_type

Type of tree to use: 'kd', 'vp', 'rp', 'max-rp', 'ub', 'cover', 'r', 'r-star', 'x', 'ball', 'hilbert-r', 'r-plus', 'r-plus-plus', 'spill', 'oct'. Default value "kd" (character).

true_distances

Matrix of true distances, used to compute the effective error (average relative error), which is printed when verbose = TRUE (numeric matrix).

true_neighbors

Matrix of true neighbors, used to compute the recall, which is printed when verbose = TRUE (integer matrix).

verbose

Display informational messages and the full list of parameters and timers at the end of execution. Default value "FALSE" (logical).
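The true_distances and true_neighbors arguments refer to two quality measures for approximate search: effective error (average relative error) and recall. The base R sketch below shows one plausible way to compute them by hand. The helper names recall and avg_rel_error are hypothetical, and the exact formulas knn() uses internally are an assumption here, not taken from this page.

```r
# Hypothetical helpers (not part of mlpack): assumed definitions of the
# two measures named in the true_neighbors / true_distances arguments.

# Recall: fraction of the true neighbors that also appear in the found
# neighbors, counted row by row (one row per query point).
recall <- function(found, truth) {
  hits <- vapply(
    seq_len(nrow(truth)),
    function(i) length(intersect(found[i, ], truth[i, ])),
    integer(1)
  )
  sum(hits) / length(truth)
}

# Average relative error: mean of |d_found - d_true| / d_true over all
# entries whose true distance is non-zero (to avoid division by zero).
avg_rel_error <- function(d_found, d_true) {
  ok <- d_true > 0
  mean(abs(d_found[ok] - d_true[ok]) / d_true[ok])
}

# Tiny hand-made example: 2 query points, k = 2.
truth   <- matrix(c(1L, 2L,
                    3L, 4L), nrow = 2, byrow = TRUE)
found   <- matrix(c(1L, 2L,
                    3L, 5L), nrow = 2, byrow = TRUE)
d_true  <- matrix(c(1, 2,
                    1, 4), nrow = 2, byrow = TRUE)
d_found <- matrix(c(1, 2,
                    1, 5), nrow = 2, byrow = TRUE)

r <- recall(found, truth)
e <- avg_rel_error(d_found, d_true)
```

Under these assumed definitions, exact search gives a recall of 1 and an average relative error of 0, so the two numbers indicate how much accuracy a non-zero epsilon gives up.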

Details

This function calculates the k nearest neighbors of a set of points using kd-trees or cover trees (cover tree support is experimental and may be slow). You may specify separate sets of reference points and query points, or only a reference set, which is then used as both the reference and the query set.
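The two calling modes described above can be illustrated with a brute-force search in base R. This is only a sketch of the semantics, not mlpack's tree-based implementation. The function brute_knn is hypothetical, it uses Euclidean distance and R's 1-based indices, and it assumes that a point is not reported as its own neighbor when the reference set doubles as the query set; none of these details is stated on this page.

```r
# Hypothetical brute-force k-nearest-neighbor search (illustration only).
brute_knn <- function(reference, query = NULL, k = 1) {
  # With no query set, the reference set is searched against itself.
  self_search <- is.null(query)
  if (self_search) query <- reference

  n_query <- nrow(query)
  neighbors <- matrix(NA_integer_, n_query, k)
  distances <- matrix(NA_real_, n_query, k)

  for (i in seq_len(n_query)) {
    # Euclidean distance from query point i to every reference point.
    diffs <- sweep(reference, 2, query[i, ])
    d <- sqrt(rowSums(diffs^2))
    # Assumption: in a self-search a point is not its own neighbor.
    if (self_search) d[i] <- Inf
    nearest <- order(d)[seq_len(k)]
    neighbors[i, ] <- nearest
    distances[i, ] <- d[nearest]
  }
  list(neighbors = neighbors, distances = distances)
}

reference <- matrix(c(0, 0,
                      1, 0,
                      0, 2,
                      5, 5), ncol = 2, byrow = TRUE)
query <- matrix(c(0.9, 0.1), ncol = 2)

# Mode 1: separate reference and query sets.
separate <- brute_knn(reference, query, k = 2)

# Mode 2: reference set only, used as both reference and query set.
shared <- brute_knn(reference, k = 1)
```

A brute-force search costs one distance computation per query/reference pair, which is the work the tree-based algorithms are designed to avoid.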

Value

A list with several components:

distances

Matrix of distances from each query point to its k nearest neighbors (numeric matrix).

neighbors

Matrix of indices, in the reference set, of the k nearest neighbors of each query point (integer matrix).

output_model

If specified, the kNN model will be output here (KNNModel).
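Because the returned list carries the model in output_model, and input_model accepts a pre-trained model, a tree built once can in principle be reused for later queries. The sketch below uses only argument and component names listed on this page. The wrapper reuse_knn_model is hypothetical, and it is an assumption that knn() accepts input_model without a reference matrix. The call is skipped when the mlpack package is not installed.

```r
# Hypothetical wrapper: build a model once, then query it separately.
reuse_knn_model <- function(reference, query, k) {
  # Skip quietly when mlpack is unavailable.
  if (!requireNamespace("mlpack", quietly = TRUE)) {
    return(NULL)
  }
  # First call: build the tree on the reference set and keep the model.
  built <- mlpack::knn(reference = reference, k = k)
  # Second call: assumed usage, passing the saved model instead of the
  # reference matrix and searching for the neighbors of new query points.
  mlpack::knn(input_model = built$output_model, query = query, k = k)
}

set.seed(1)
reference <- matrix(rnorm(200), ncol = 2)  # 100 reference points in 2-D
query <- matrix(rnorm(10), ncol = 2)       # 5 query points in 2-D

result <- reuse_knn_model(reference, query, k = 3)
```

When the call succeeds, result$neighbors and result$distances should each have one row per query point and k columns.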

Author(s)

mlpack developers

Examples

# The following call calculates the 5 nearest neighbors of each point in
# "input" and stores the distances in "distances" and the neighbors in
# "neighbors":

## Not run: 
output <- knn(k=5, reference=input)
neighbors <- output$neighbors
distances <- output$distances

## End(Not run)

# The output is organized such that the entry at row i and column j of the
# neighbors output matrix is the index of the point in the reference set
# that is the j'th nearest neighbor of the point in the query set with
# index i.  The entry at row i and column j of the distances output matrix
# is the distance between those two points.
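The layout described above can be checked on a small hand-built example. The matrices below are written out by hand for four one-dimensional reference points and two query points; they are not output from knn(). The indices are 1-based here as an assumption, since this page does not state which index base knn() uses.

```r
# Four reference points and two query points on a line.
reference <- matrix(c(0, 1, 3, 7), ncol = 1)
query <- matrix(c(0.9, 6), ncol = 1)

# Hand-written results for k = 2: one row per query point, one column per
# neighbor rank (assumed 1-based indices into the reference set).
neighbors <- matrix(c(2L, 1L,
                      4L, 3L), nrow = 2, byrow = TRUE)
distances <- matrix(c(0.1, 0.9,
                      1.0, 3.0), nrow = 2, byrow = TRUE)

# Look up the nearest (j = 1) neighbor of the second query point (i = 2).
i <- 2
j <- 1
nearest_index <- neighbors[i, j]

# The matching entry of the distances matrix is the Euclidean distance
# between that query point and the reference point it refers to.
recomputed <- sqrt(sum((query[i, ] - reference[nearest_index, ])^2))
```

Both matrices share the same shape, so the same pair (i, j) addresses a neighbor index and its distance.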

mlpack documentation built on Sept. 27, 2023, 1:07 a.m.