| rpf_knn | R Documentation | 
Returns the approximate k-nearest neighbor graph of a dataset by searching multiple random projection trees, a variant of k-d trees introduced by Dasgupta and Freund (2008).
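To make the idea of a random projection tree concrete, the following is a minimal base R sketch of a single split, assuming the hyperplane is placed midway between two randomly chosen observations. `rp_split` is a hypothetical helper written for illustration only; it is not part of this package, and the splitting rule actually used by `rpf_knn` may differ.

```r
# Hypothetical helper (not part of the documented API): one random
# projection split of the rows of a numeric matrix X.
rp_split <- function(X) {
  # Pick two distinct rows at random to define the splitting hyperplane
  pair <- sample(nrow(X), 2)
  a <- X[pair[1], ]
  b <- X[pair[2], ]
  normal <- a - b          # direction perpendicular to the hyperplane
  midpoint <- (a + b) / 2  # the hyperplane passes through this point
  # Signed margin of every row relative to the hyperplane
  margin <- as.vector(sweep(X, 2, midpoint) %*% normal)
  list(left = which(margin <= 0), right = which(margin > 0))
}

set.seed(42)
X <- as.matrix(iris[, 1:4])
split <- rp_split(X)
lengths(split)
```

Applying such a split recursively until each node holds only a few items gives one tree, and items that share a leaf become candidate neighbors of each other. Repeating this with several trees is what makes the search approximate but fast.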
rpf_knn(
  data,
  k,
  metric = "euclidean",
  use_alt_metric = TRUE,
  n_trees = NULL,
  leaf_size = NULL,
  max_tree_depth = 200,
  include_self = TRUE,
  ret_forest = FALSE,
  margin = "auto",
  n_threads = 0,
  verbose = FALSE,
  obs = "R"
)
| data | Matrix of  | 
| k | Number of nearest neighbors to return. Optional if  | 
| metric | Type of distance calculation to use. One of: 
 For non-sparse data, the following variants are available with preprocessing: this trades memory for a potential speed up during the distance calculation. Some minor numerical differences should be expected compared to the non-preprocessed versions: 
 For non-sparse binary data passed as a  
 Note that if  | 
| use_alt_metric | If  | 
| n_trees | The number of trees to use in the RP forest. A larger number
will give more accurate results at the cost of a longer computation time.
The default of  | 
| leaf_size | The maximum number of items that can appear in a leaf. The
default of  | 
| max_tree_depth | The maximum depth of the tree to build (default = 200).
If the maximum tree depth is exceeded then the leaf size of a tree may
exceed  | 
| include_self | If  | 
| ret_forest | If  | 
| margin | A character string specifying the method used to assign points to one side of the hyperplane or the other. Possible values are: 
 | 
| n_threads | Number of threads to use. | 
| verbose | If  | 
| obs | set to  | 
The approximate nearest neighbor graph as a list containing:
- idx: an n by k matrix containing the nearest neighbor indices.
- dist: an n by k matrix containing the nearest neighbor distances.
- forest: (if ret_forest = TRUE) the RP forest that generated the neighbor graph, which can be used to query new data.
There is no guarantee that k neighbors will be found for every observation.
A missing neighbor is represented with an index of 0 and a distance of NA.
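Because some observations may have fewer than k neighbors, it can be worth checking the result before using it downstream. The sketch below uses a small hand-built list (`nn`, with made-up values) that mimics the documented `idx` and `dist` matrices, so it runs without calling `rpf_knn`; the same checks should apply to a real result.

```r
# Mock result with the documented shape: idx and dist are n by k matrices.
# The values are invented for illustration.
nn <- list(
  idx = matrix(c(1, 2, 3, 2, 3, 0), nrow = 3),
  dist = matrix(c(0, 0, 0, 0.5, 0.7, NA), nrow = 3)
)

# A missing neighbor has index 0 (and a distance of NA)
missing <- nn$idx == 0

# Number of neighbors actually found for each observation
n_found <- rowSums(!missing)

# Observations with fewer than k neighbors
incomplete <- which(n_found < ncol(nn$idx))
incomplete
```

Note that an index of 0 is not a valid row index in R, so indexing the data with `idx` directly will silently skip missing entries; filtering on `missing` first avoids that.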
Dasgupta, S., & Freund, Y. (2008, May). Random projection trees and low dimensional manifolds. In Proceedings of the fortieth annual ACM symposium on Theory of computing (pp. 537-546). doi:10.1145/1374376.1374452.
rpf_filter(), nnd_knn()
# Find 4 (approximate) nearest neighbors using Euclidean distance
# If you pass a data frame, non-numeric columns are removed
iris_nn <- rpf_knn(iris, k = 4, metric = "euclidean", leaf_size = 3)
# If you want to initialize another method (e.g. nearest neighbor descent)
# with the result of the RP forest, then it's more efficient to skip
# evaluating whether an item is a neighbor of itself by setting
# `include_self = FALSE`:
iris_rp <- rpf_knn(iris, k = 4, n_trees = 3, include_self = FALSE)
# For future querying, you may also want to return the RP forest:
iris_rpf <- rpf_knn(iris,
  k = 4, n_trees = 3, include_self = FALSE,
  ret_forest = TRUE
)