randomProjectionTreeSearch: Find approximate k-Nearest Neighbors using random projection...

Description

A fast and accurate algorithm for finding approximate k-nearest neighbors.

Usage

randomProjectionTreeSearch(x, K = 150, n_trees = 50,
  tree_threshold = max(10, nrow(x)), max_iter = 1,
  distance_method = "Euclidean", seed = NULL, threads = NULL,
  verbose = getOption("verbose", TRUE))

## S3 method for class 'matrix'
randomProjectionTreeSearch(x, K = 150, n_trees = 50,
  tree_threshold = max(10, nrow(x)), max_iter = 1,
  distance_method = "Euclidean", seed = NULL, threads = NULL,
  verbose = getOption("verbose", TRUE))

## S3 method for class 'CsparseMatrix'
randomProjectionTreeSearch(x, K = 150, n_trees = 50,
  tree_threshold = max(10, nrow(x)), max_iter = 1,
  distance_method = "Euclidean", seed = NULL, threads = NULL,
  verbose = getOption("verbose", TRUE))

## S3 method for class 'TsparseMatrix'
randomProjectionTreeSearch(x, K = 150, n_trees = 50,
  tree_threshold = max(10, nrow(x)), max_iter = 1,
  distance_method = "Euclidean", seed = NULL, threads = NULL,
  verbose = getOption("verbose", TRUE))

Arguments

x

A (potentially sparse) matrix, where examples are columns and features are rows.

K

How many nearest neighbors to seek for each node.

n_trees

The number of trees to build.

tree_threshold

The threshold for creating a new branch. The paper authors suggest using a value equivalent to the number of features in the input set.

max_iter

Number of iterations in the neighborhood exploration phase.

distance_method

One of "Euclidean" or "Cosine".

seed

Random seed passed to the C++ functions. The default is NULL. If seed is not NULL, the maximum number of threads will be set to 1 in phases that would otherwise be non-deterministic.

threads

The maximum number of threads to spawn. Determined automatically if NULL (the default).

verbose

Whether to print verbose logging using the progress package.
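
To make the two distance_method options concrete, here is a minimal base R sketch of the distance between two examples (two columns of x). The helper names are hypothetical, and the cosine form shown (1 minus cosine similarity) is the common textbook definition, assumed here rather than taken from the package's C++ code.

```r
# Hypothetical helpers, not part of the package.
# Euclidean distance between two feature vectors.
euclidean_distance <- function(a, b) {
  sqrt(sum((a - b)^2))
}

# Cosine distance, assumed here to be 1 - cosine similarity.
cosine_distance <- function(a, b) {
  1 - sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
}

# Two examples stored as columns, features as rows.
x <- cbind(c(1, 0), c(2, 0))
euclidean_distance(x[, 1], x[, 2])
cosine_distance(x[, 1], x[, 2])
```

The two columns point in the same direction but differ in length, so the two measures disagree about how far apart they are. That difference is the main reason to pick one over the other.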

Details

Note that the algorithm does not guarantee that it will find K neighbors for each node. A warning will be issued if it finds fewer neighbors than requested. If the input data contains distinct partitionable clusters, try increasing the tree_threshold to increase the number of returned neighbors.
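
A call that surfaces the warning described above might look like the following sketch. It uses only the arguments listed in Usage. The synthetic data, the chosen argument values, and the guard on the package being installed are assumptions made for illustration.

```r
# Synthetic data: 20 features (rows) by 500 examples (columns).
set.seed(42)
x <- matrix(rnorm(20 * 500), nrow = 20, ncol = 500)

# Run only if the package is available; values below are illustrative.
if (requireNamespace("largeVis", quietly = TRUE)) {
  neighbors <- withCallingHandlers(
    largeVis::randomProjectionTreeSearch(
      x, K = 10, n_trees = 20,
      tree_threshold = max(10, nrow(x)),
      max_iter = 1, distance_method = "Euclidean",
      seed = 1974, verbose = FALSE
    ),
    # Report any warning (for example, fewer than K neighbors found)
    # and continue instead of printing it at the end of the call.
    warning = function(w) {
      message("Search warning: ", conditionMessage(w))
      invokeRestart("muffleWarning")
    }
  )
  print(dim(neighbors))
}
```

If the warning appears, rerunning with a larger tree_threshold is the remedy suggested above.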

Value

A [K, N] matrix of the approximate K nearest neighbors for each vertex, where N is the number of examples (columns of x).
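
To illustrate the [K, N] layout, the sketch below computes exact nearest neighbors by brute force in base R and returns them in the same shape. It is not the package's algorithm, and exact_knn is a hypothetical helper. It returns 1-based column indices. The indexing convention of the package's output is not stated here, so check it before comparing results.

```r
# Hypothetical helper: exact K nearest neighbors by brute force.
# x has features as rows and examples as columns.
exact_knn <- function(x, K) {
  n <- ncol(x)
  stopifnot(K >= 1, K < n)
  # dist() measures distances between rows, so transpose first.
  d <- as.matrix(dist(t(x), method = "euclidean"))
  diag(d) <- Inf  # an example is not its own neighbor
  # For each example, take the indices of the K smallest distances.
  nn <- apply(d, 2, function(col) order(col)[seq_len(K)])
  # Force a K x N matrix even when K == 1 (apply returns a vector then).
  matrix(nn, nrow = K, ncol = n)
}

# Four examples on a line: two close pairs.
x <- cbind(c(0, 0), c(1, 0), c(10, 0), c(11, 0))
exact_knn(x, K = 2)
```

Column j of the result lists the neighbors of example j, nearest first. Brute force needs all N^2 pairwise distances, so it is only practical as a sanity check on small inputs.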


elbamos/largeVis documentation built on May 16, 2019, 2:58 a.m.