Fast calculation of the k-nearest neighbor distances for a dataset
represented as a matrix of points. The kNN distance is defined as the
distance from a point to its k-th nearest neighbor. The kNN distance plot
displays the kNN distances of all points sorted from smallest to largest. The
plot can be used to help find suitable parameter values for dbscan().
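
To make the definition concrete, here is a minimal brute-force sketch in base R. The helper name knn_dist_sketch is hypothetical and is not part of the package; it builds the full distance matrix, so it only illustrates the definition and does not reflect the fast search the documented function performs.

```r
# Illustrative only: brute-force k-NN distances via the full distance matrix.
# knn_dist_sketch() is a hypothetical helper, not a package function.
knn_dist_sketch <- function(x, k) {
  d <- as.matrix(dist(x))  # pairwise Euclidean distances
  diag(d) <- Inf           # exclude each point's zero distance to itself
  # k-th smallest remaining distance in each row
  apply(d, 1, function(row) sort(row)[k])
}

x <- as.matrix(iris[, 1:4])
d4 <- knn_dist_sketch(x, k = 4)

# The k-NN distance plot shows these distances sorted from smallest to largest.
plot(sort(d4), type = "l",
     xlab = "Points sorted by distance", ylab = "4-NN distance")
```

A sharp bend (knee) in the sorted curve is the feature usually read off when choosing a distance threshold.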
x: the data set as a matrix of points (Euclidean distance is used) or a precalculated dist object.
k: number of nearest neighbors used for the distance calculation.
all: should a matrix with the distances to all k nearest neighbors be returned?
...: further arguments (e.g., kd-tree related parameters) are passed on to kNN().
kNNdist() returns a numeric vector with the distance from each point to its
k-th nearest neighbor. If all = TRUE, then a matrix with k columns
containing the distances to the 1st, 2nd, ..., k-th nearest neighbors is
returned instead.
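
The two return shapes described above can be sketched in base R as follows. The helper knn_dist_all_sketch is hypothetical and uses a brute-force distance matrix purely for illustration; it is an assumption about the layout (one row per point, one column per neighbor rank), not the package's implementation.

```r
# Illustrative only: one row per point, column j holds the distance to the
# j-th nearest neighbor. knn_dist_all_sketch() is a hypothetical helper.
knn_dist_all_sketch <- function(x, k) {
  d <- as.matrix(dist(x))  # pairwise Euclidean distances
  diag(d) <- Inf           # exclude each point's distance to itself
  nn <- apply(d, 1, function(row) sort(row)[seq_len(k)])
  # apply() yields the k values of each point consecutively; fill by row
  # so the result is n x k (this also works for k = 1)
  m <- matrix(nn, ncol = k, byrow = TRUE)
  colnames(m) <- seq_len(k)
  m
}

x <- as.matrix(iris[, 1:4])
m <- knn_dist_all_sketch(x, k = 4)

dim(m)          # number of points by k
kth <- m[, 4]   # the last column corresponds to the vector form (k-th NN distance)
```

Within each row the distances are non-decreasing from left to right, so the last column is always the largest of the k values.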
data(iris)
iris <- as.matrix(iris[, 1:4])

## Find the 4-NN distance for each observation (see ?kNN
## for different search strategies)
kNNdist(iris, k = 4)

## Get a matrix with distances to the 1st, 2nd, ..., 4th NN.
kNNdist(iris, k = 4, all = TRUE)

## Produce a k-NN distance plot to determine a suitable eps for
## DBSCAN with MinPts = 5. Use k = 4 (= MinPts - 1).
## The knee is visible around a distance of .7
kNNdistplot(iris, k = 4)

cl <- dbscan(iris, eps = .7, minPts = 5)
pairs(iris, col = cl$cluster + 1L)
## Note: black points are noise points