kNN: Find the k Nearest Neighbors

Description Usage Arguments Details Value Author(s) References See Also Examples

View source: R/kNN.R

Description

This function uses a kd-tree to find all k nearest neighbors in a data matrix (including distances) fast.

Usage

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
kNN(
  x,
  k,
  query = NULL,
  sort = TRUE,
  search = "kdtree",
  bucketSize = 10,
  splitRule = "suggest",
  approx = 0
)

## S3 method for class 'kNN'
sort(x, decreasing = FALSE, ...)

## S3 method for class 'kNN'
adjacencylist(x, ...)

Arguments

x

a data matrix, a dist object or a kNN object.

k

number of neighbors to find.

query

a data matrix with the points to query. If query is not specified, the NN for all the points in x is returned. If query is specified then x needs to be a data matrix.

sort

sort the neighbors by distance? Note that some search methods already sort the results. Sorting is expensive and sort = FALSE may be much faster for some search methods. kNN objects can be sorted using sort().

search

nearest neighbor search strategy (one of "kdtree", "linear" or "dist").

bucketSize

max size of the kd-tree leafs.

splitRule

rule to split the kd-tree. One of "STD", "MIDPT", "FAIR", "SL_MIDPT", "SL_FAIR" or "SUGGEST" (SL stands for sliding). "SUGGEST" uses ANNs best guess.

approx

use approximate nearest neighbors. All NN up to a distance of a factor of 1 + approx eps may be used. Some actual NN may be omitted leading to spurious clusters and noise points. However, the algorithm will enjoy a significant speedup.

decreasing

sort in decreasing order?

...

further arguments

Details

Ties: If the kth and the (k+1)th nearest neighbor are tied, then the neighbor found first is returned and the other one is ignored.

Self-matches: If no query is specified, then self-matches are removed.

Details on the search parameters:

Value

An object of class kNN (subclass of NN) containing a list with the following components:

dist

a matrix with distances.

id

a matrix with ids.

k

number k used.

Author(s)

Michael Hahsler

References

David M. Mount and Sunil Arya (2010). ANN: A Library for Approximate Nearest Neighbor Searching, http://www.cs.umd.edu/~mount/ANN/.

See Also

Other NN functions: NN, comps(), frNN(), kNNdist(), sNN()

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
data(iris)
x <- iris[, -5]

# Example 1: finding kNN for all points in a data matrix (using a kd-tree)
nn <- kNN(x, k = 5)
nn

# explore neighborhood of point 10
i <- 10
nn$id[i,]
plot(x, col = ifelse(1:nrow(iris) %in% nn$id[i,], "red", "black"))

# visualize the 5 nearest neighbors
plot(nn, x)

# visualize a reduced 2-NN graph
plot(kNN(nn, k = 2), x)

# Example 2: find kNN for query points
q <- x[c(1,100),]
nn <- kNN(x, k = 10, query = q)

plot(nn, x, col = "grey")
points(q, pch = 3, lwd = 2)

# Example 3: find kNN using distances
d <- dist(x, method = "manhattan")
nn <- kNN(d, k = 1)
plot(nn, x)

dbscan documentation built on Jan. 16, 2022, 1:06 a.m.