findDistance: Distance to the k-th nearest neighbor

findDistanceR Documentation

Distance to the k-th nearest neighbor

Description

Find the distance to the k-th nearest neighbor for each point in a dataset.

Usage

findDistance(X, k, num.threads = 1, subset = NULL, ..., BNPARAM = NULL)

Arguments

X

A numeric matrix where rows correspond to data points and columns correspond to variables (i.e., dimensions). Alternatively, a prebuilt BiocNeighborIndex object from buildIndex.

k

A positive integer scalar specifying the number of nearest neighbors to retrieve.

Alternatively, an integer vector of length equal to the number of points in X, specifying the number of neighbors to identify for each point. If subset is provided, this should have length equal to the length of subset. Users should wrap this vector in an AsIs class to distinguish length-1 vectors from integer scalars.

All k should be less than or equal to the number of points in X minus 1, otherwise the former will be capped at the latter with a warning.

num.threads

Integer scalar specifying the number of threads to use for the search.

subset

An integer, logical or character vector specifying the indices of points in X for which the nearest neighbors should be identified. This yields the same result as (but is more efficient than) subsetting the output matrices after computing neighbors for all points.

...

Further arguments to pass to buildIndex when X is not an external pointer.

BNPARAM

A BiocNeighborParam object specifying how the index should be constructed. If NULL, this defaults to a KmknnParam. Ignored if x contains a prebuilt index.

Details

If multiple queries are to be performed to the same X, it may be beneficial to build the index from X with buildIndex. The resulting pointer object can be supplied as X to multiple findDistance calls, avoiding the need to repeat index construction in each call.

Value

Numeric vector of length equal to the number of points in X (or subset, if provided), containing the distance from each point to its k-th nearest neighbor. This is equivalent to but more memory efficient than using findKNN and subsetting to the last distance.

Author(s)

Aaron Lun

See Also

buildIndex, to build an index ahead of time.

Examples

Y <- matrix(rnorm(100000), ncol=20)
out <- findDistance(Y, k=8)
summary(out)


LTLA/kmknn documentation built on Oct. 18, 2024, 9:01 p.m.