Fast k-nearest neighbor searching algorithms including a kd-tree, cover-tree and the algorithm implemented in class package.
1 2 |
data |
an input data matrix. |
query |
a query data matrix. |
algorithm |
nearest neighbor searching algorithm. |
k |
the maximum number of nearest neighbors to search. The default value is set to 10. |
The cover tree is O(n) space data structure which allows us to answer queries in the same O(log(n)) time as kd tree given a fixed intrinsic dimensionality. Templated code from http://hunch.net/~jl/projects/cover_tree/cover_tree.html is used.
The kd tree algorithm is implemented in the Approximate Near Neighbor (ANN) C++ library (see http://www.cs.umd.edu/~mount/ANN/). The exact nearest neighbors are searched in this package.
The CR algorithm is the VR using distance 1-x'y assuming x
and y
are unit vectors.
The brute algorithm searches linearly. It is a naive method.
a list contains:
nn.index |
an n x k matrix for the nearest neighbor indice. |
nn.dist |
an n x k matrix for the nearest neighbor Euclidean distances. |
Shengqiao Li. To report any bugs or suggestions please email: shli@stat.wvu.edu.
Bentley J.L. (1975), “Multidimensional binary search trees used for associative search,” Communication ACM, 18, 309-517.
Arya S. and Mount D.M. (1993), “Approximate nearest neighbor searching,” Proc. 4th Ann. ACM-SIAM Symposium on Discrete Algorithms (SODA'93), 271-280.
Arya S., Mount D.M., Netanyahu N.S., Silverman R. and Wu A.Y. (1998), “An optimal algorithm for approximate nearest neighbor searching,” Journal of the ACM, 45, 891-923.
Beygelzimer A., Kakade S. and Langford J. (2006), “Cover trees for nearest neighbor,” ACM Proc. 23rd international conference on Machine learning, 148, 97-104.
nn2
in RANN, ann
in yaImpute and knn
in class.
1 2 3 4 5 |
Questions? Problems? Suggestions? Tweet to @rdrrHQ or email at ian@mutexlabs.com.
All documentation is copyright its authors; we didn't write any of that.