Implementation of a Cover Tree algorithm for a specified set of data
data | a data.frame containing the data to be covered
dist.func | a function that computes the distance between two rows of the data
The implementation here is based on the rules and insertion algorithm defined in Alina Beygelzimer, Sham Kakade, and John Langford, "Cover Trees for Nearest Neighbor". The current implementation, however, negates the levels outlined in that paper, so the tree obeys the following rules:
All nodes in level i are implicitly in level i+1
All pairs of nodes in level i have a separation of more than 1/(2^i)
All nodes in level i+1 have exactly one parent in level i that is no farther away than 1/(2^i)
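The separation and covering rules above can be checked by hand on a small example. The following is a minimal sketch, not part of the package: `is_separated`, `is_covered`, and `euclid` are hypothetical helpers, and each level is assumed to be given as a data.frame of its nodes. Note that `is_covered` only checks that a valid parent exists within range; choosing exactly one parent is the job of the tree itself.

```r
# Hypothetical distance function: Euclidean distance between two
# single-row data.frames with numeric columns.
euclid <- function(a, b) sqrt(sum((unlist(a) - unlist(b))^2))

# Hypothetical helper: TRUE when every pair of rows in `nodes` is separated
# by more than 1/(2^i), the separation rule for level i.
is_separated <- function(nodes, i, dist.func) {
  n <- nrow(nodes)
  if (n < 2) return(TRUE)
  for (a in 1:(n - 1)) {
    for (b in (a + 1):n) {
      if (dist.func(nodes[a, ], nodes[b, ]) <= 1 / (2^i)) return(FALSE)
    }
  }
  TRUE
}

# Hypothetical helper: TRUE when every row of `children` (level i+1) lies
# no farther than 1/(2^i) from at least one row of `parents` (level i).
is_covered <- function(children, parents, i, dist.func) {
  for (k in seq_len(nrow(children))) {
    d <- vapply(seq_len(nrow(parents)),
                function(p) dist.func(children[k, ], parents[p, ]),
                numeric(1))
    if (min(d) > 1 / (2^i)) return(FALSE)
  }
  TRUE
}

level0 <- data.frame(x = c(0, 2), y = c(0, 0))
# Level 1 implicitly contains the level 0 nodes, plus one new node.
level1 <- data.frame(x = c(0, 2, 0.8), y = c(0, 0, 0))

sep0 <- is_separated(level0, 0, euclid)        # pairs must be more than 1 apart
sep1 <- is_separated(level1, 1, euclid)        # pairs must be more than 0.5 apart
cov  <- is_covered(level1, level0, 0, euclid)  # each node within 1 of a level 0 node
```

The new node at x = 0.8 is too close to the origin to sit in level 0 (0.8 is not more than 1), but it satisfies the looser separation of level 1 and is covered by the level 0 node at the origin.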
The provided data must be a data.frame with at least two rows of unique data. For the purpose of the CoverTree algorithm, unique data means having a non-zero separation. Redundant data will not be added to the cover tree.
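As an illustration of the uniqueness requirement, the sketch below filters out rows that have zero separation from a row already kept. `drop_redundant` and `euclid` are hypothetical helpers written for this example; the package performs its own handling of redundant rows, and this is not its code.

```r
# Hypothetical distance function: Euclidean distance between two
# single-row data.frames with numeric columns.
euclid <- function(a, b) sqrt(sum((unlist(a) - unlist(b))^2))

# Hypothetical helper: keep a row only if it is at non-zero distance
# from every row already kept.
drop_redundant <- function(data, dist.func) {
  if (nrow(data) == 0) return(data)
  keep <- 1L
  for (k in seq_len(nrow(data))[-1]) {
    d <- vapply(keep, function(j) dist.func(data[k, ], data[j, ]), numeric(1))
    if (all(d > 0)) keep <- c(keep, k)
  }
  data[keep, , drop = FALSE]
}

pts <- data.frame(x = c(0, 1, 0), y = c(0, 1, 0))  # row 3 duplicates row 1
unique_pts <- drop_redundant(pts, euclid)

# At least two rows of unique data are needed to build a cover tree.
enough_rows <- nrow(unique_pts) >= 2
```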
The provided distance function must take two rows from the input data as arguments and must return a distance. The returned distance must:
be non-negative
be symmetric: dist(a,b) = dist(b,a)
obey the triangle rule: dist(a,b) + dist(b,c) >= dist(a,c)
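A distance function meeting these requirements might look like the following sketch. `euclid` is a hypothetical name, and the example assumes that all columns are numeric and that each argument arrives as a single-row data.frame. The three checks spot-test the requirements on one triple of points; they do not prove them in general.

```r
# Hypothetical distance function: Euclidean distance between two
# single-row data.frames with numeric columns.
euclid <- function(a, b) sqrt(sum((unlist(a) - unlist(b))^2))

pts <- data.frame(x = c(0, 3, 3), y = c(0, 0, 4))
a  <- pts[1, ]
b  <- pts[2, ]
cc <- pts[3, ]

non_negative <- euclid(a, b) >= 0
symmetric    <- isTRUE(all.equal(euclid(a, b), euclid(b, a)))
triangle     <- euclid(a, b) + euclid(b, cc) >= euclid(a, cc)
```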
The provided distance function may take optional parameters. If it does, these parameters must also be provided to the CoverTree constructor in the format recognized by the distance function. They will be passed to every invocation of the distance function.
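The sketch below shows one way a distance function with an optional parameter can receive it on every call. It assumes the extra parameters are forwarded through R's `...` mechanism, which this page does not confirm. `minkowski` and `all_distances_from_first` are hypothetical names, and the latter is only a stand-in for code that forwards arguments; it is not the package's constructor.

```r
# Hypothetical distance function with an optional parameter `p`
# (p = 2 gives Euclidean distance, p = 1 gives Manhattan distance).
minkowski <- function(a, b, p = 2) {
  sum(abs(unlist(a) - unlist(b))^p)^(1 / p)
}

# Hypothetical stand-in for code that accepts extra arguments and forwards
# them unchanged to every invocation of the distance function.
all_distances_from_first <- function(data, dist.func, ...) {
  vapply(seq_len(nrow(data)),
         function(k) dist.func(data[1, ], data[k, ], ...),
         numeric(1))
}

pts <- data.frame(x = c(0, 3), y = c(0, 4))
d_euclid    <- all_distances_from_first(pts, minkowski)         # default p = 2
d_manhattan <- all_distances_from_first(pts, minkowski, p = 1)  # forwards p = 1
```

Naming the extra argument (`p = 1`) keeps it in the format the distance function recognizes, regardless of how many other parameters the function accepts.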
The columns defined in the data.frame are irrelevant to the CoverTree algorithm, but must be consistent with the distance function.
An initialized CoverTree as described above