queryDistance: Distance to the k-th nearest neighbor to query points

queryDistance R Documentation

Distance to the k-th nearest neighbor to query points


Query a reference dataset to determine the distance to the k-th nearest neighbor of each point in a query dataset.


  num.threads = 1,
  subset = NULL,
  transposed = FALSE,



The reference dataset to be queried. This should be a numeric matrix where rows correspond to reference points and columns correspond to variables (i.e., dimensions). Alternatively, a prebuilt BiocNeighborIndex object from buildIndex.


A numeric matrix of query points, containing the same number of columns as X.


A positive integer scalar specifying the number of nearest neighbors to retrieve.

Alternatively, an integer vector of length equal to the number of points in query, specifying the number of neighbors to identify for each point. If subset is provided, this should have length equal to the length of subset. Users should wrap this vector in an AsIs class to distinguish length-1 vectors from integer scalars.

Each k should be less than or equal to the number of points in X. Any larger value is capped at the number of points in X, with a warning.
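The two forms of k can be contrasted with a small sketch. It assumes the BiocNeighbors package is installed; the data sizes and the variable names (k.per.point, d.var, d.one) are illustrative only.

```r
library(BiocNeighbors)

set.seed(100)
Y <- matrix(rnorm(1000), ncol=20)  # 50 reference points in 20 dimensions
Z <- matrix(rnorm(200), ncol=20)   # 10 query points in 20 dimensions

# One k per query point; I() marks this as a per-point vector.
k.per.point <- rep(c(2L, 5L), length.out=nrow(Z))
d.var <- queryDistance(Y, query=Z, k=I(k.per.point))

# A length-1 vector wrapped in I() is treated as a vector for a single
# query point, rather than as a scalar applied to every point.
d.one <- queryDistance(Y, query=Z[1,,drop=FALSE], k=I(3L))
```

Each element of d.var should match what a scalar call with the corresponding k would return for that query point.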


Integer scalar specifying the number of threads to use for the search.


An integer, logical or character vector indicating the rows of query (or columns, if transposed=TRUE) for which the nearest neighbors should be identified.


A logical scalar indicating whether X and query are transposed, in which case both matrices are assumed to contain dimensions in the rows and data points in the columns.
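A brief sketch of subset and transposed follows, assuming the BiocNeighbors package is installed. The data and the variable names are illustrative.

```r
library(BiocNeighbors)

set.seed(101)
Y <- matrix(rnorm(1000), ncol=20)  # 50 reference points (rows)
Z <- matrix(rnorm(200), ncol=20)   # 10 query points (rows)

# Distances for every query point.
d.all <- queryDistance(Y, query=Z, k=5)

# Only search for query points 1, 3 and 5.
d.sub <- queryDistance(Y, query=Z, k=5, subset=c(1L, 3L, 5L))

# Same search with dimensions in rows and points in columns.
d.t <- queryDistance(t(Y), query=t(Z), k=5, transposed=TRUE)
```

Here d.sub is expected to agree with d.all[c(1, 3, 5)], and d.t with d.all, since only the layout of the input changes.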


Further arguments to pass to buildIndex when X is not a prebuilt index.


A BiocNeighborParam object specifying how the index should be constructed. If NULL, this defaults to a KmknnParam. Ignored if X contains a prebuilt index.


If multiple queries are to be performed against the same X, it may be beneficial to build the index from X once with buildIndex. The resulting pointer object can be supplied as X to multiple queryDistance calls, avoiding the need to repeat index construction in each call.
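Reusing a prebuilt index might look like the sketch below. It assumes the BiocNeighbors package is installed and uses the default index type; the two query batches (Z1, Z2) are made up for illustration.

```r
library(BiocNeighbors)

set.seed(102)
Y <- matrix(rnorm(2000), ncol=20)   # 100 reference points
Z1 <- matrix(rnorm(200), ncol=20)   # first batch of 10 query points
Z2 <- matrix(rnorm(400), ncol=20)   # second batch of 20 query points

# Build the index once from the reference data.
idx <- buildIndex(Y)

# Reuse the same index across calls instead of passing Y each time.
d1 <- queryDistance(idx, query=Z1, k=5)
d2 <- queryDistance(idx, query=Z2, k=10)
```

The results should be the same as passing Y directly; the difference is that index construction happens once rather than in every call.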


Numeric vector of length equal to the number of points in query (or subset, if provided), containing the distance from each point to its k-th nearest neighbor. This is equivalent to, but more memory-efficient than, calling queryKNN and taking only the last (k-th) distance for each point.
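The equivalence with queryKNN can be checked with a short sketch. It assumes the BiocNeighbors package is installed and that the default search uses Euclidean distance; the brute-force computation in base R is included only as an independent cross-check and is not how the package computes its result.

```r
library(BiocNeighbors)

set.seed(103)
Y <- matrix(rnorm(1000), ncol=20)  # 50 reference points
Z <- matrix(rnorm(200), ncol=20)   # 10 query points
k <- 5

# Direct route: only the k-th distance is returned.
d.k <- queryDistance(Y, query=Z, k=k)

# Longer route: retrieve all k neighbors, then keep the last distance column.
full <- queryKNN(Y, query=Z, k=k)
d.knn <- full$distance[, k]

# Brute-force check: Euclidean distance from each query point to every
# reference point, sorted, taking the k-th smallest value.
d.brute <- apply(Z, 1, function(z) {
    sort(sqrt(colSums((t(Y) - z)^2)))[k]
})
```

All three vectors should agree up to floating-point tolerance, for example via all.equal(d.k, d.knn).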


Aaron Lun

See Also

buildIndex, to build an index ahead of time.


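# 5000 reference points and 1000 query points, each in 20 dimensions.
Y <- matrix(rnorm(100000), ncol=20)
Z <- matrix(rnorm(20000), ncol=20)
# Distance from each query point to its 5th nearest neighbor in Y.
out <- queryDistance(Y, query=Z, k=5)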

LTLA/BiocNeighbors documentation built on Dec. 12, 2024, 7:45 a.m.