findMutualNN: Find mutual nearest neighbors


findMutualNN R Documentation

Find mutual nearest neighbors

Description

Find mutual nearest neighbors (MNN) across two data sets.

Usage

findMutualNN(
  data1,
  data2,
  k1,
  k2 = k1,
  BNINDEX1 = NULL,
  BNINDEX2 = NULL,
  BNPARAM = KmknnParam(),
  BPPARAM = SerialParam()
)

Arguments

data1

A numeric matrix containing points in the rows and variables/dimensions in the columns.

data2

A numeric matrix like data1 for another dataset with the same variables/dimensions.

k1

Integer scalar specifying the number of neighbors to search for in data1.

k2

Integer scalar specifying the number of neighbors to search for in data2.

BNINDEX1

A BiocNeighborIndex object containing a pre-built index for data1.

BNINDEX2

A BiocNeighborIndex object containing a pre-built index for data2.

BNPARAM

A BiocNeighborParam object specifying the neighbor search algorithm to use. This should be consistent with the class of BNINDEX1 and BNINDEX2, if either is specified.

BPPARAM

A BiocParallelParam object specifying how parallelization should be performed.

Details

For each point in dataset 1, the set of k2 nearest points in dataset 2 is identified. For each point in dataset 2, the set of k1 nearest points in dataset 1 is similarly identified. Two points in different datasets are considered to be part of an MNN pair if each point lies in the other's set of neighbors. This concept allows us to identify matching points across datasets, which is useful for, e.g., batch correction.
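The definition above can be illustrated with a brute-force sketch in base R. This is not the package's implementation: the function name mutual_nn_sketch is hypothetical, it uses an exhaustive distance matrix rather than the search algorithms selected by BNPARAM, and the order of the returned pairs may differ from that of findMutualNN.

```r
# Brute-force illustration of the MNN definition (hypothetical helper, not package code).
mutual_nn_sketch <- function(data1, data2, k1, k2 = k1) {
    n1 <- nrow(data1)
    n2 <- nrow(data2)

    # Squared Euclidean distance between every row of data1 and every row of data2.
    d <- outer(rowSums(data1^2), rowSums(data2^2), "+") - 2 * tcrossprod(data1, data2)

    # in2[i, j]: row j of data2 is among the k2 nearest points to row i of data1.
    in2 <- matrix(FALSE, n1, n2)
    for (i in seq_len(n1)) {
        in2[i, head(order(d[i, ]), k2)] <- TRUE
    }

    # in1[i, j]: row i of data1 is among the k1 nearest points to row j of data2.
    in1 <- matrix(FALSE, n1, n2)
    for (j in seq_len(n2)) {
        in1[head(order(d[, j]), k1), j] <- TRUE
    }

    # A pair is mutual when each point lies in the other's neighbor set.
    pairs <- which(in1 & in2, arr.ind = TRUE)
    list(first = unname(pairs[, 1]), second = unname(pairs[, 2]))
}

# Two well-separated points per dataset, so each point has one obvious partner.
toy1 <- matrix(c(0, 10), ncol = 1)
toy2 <- matrix(c(0.1, 10.1), ncol = 1)
toy <- mutual_nn_sketch(toy1, toy2, k1 = 1)
```

The sketch needs memory proportional to nrow(data1) * nrow(data2), so it is only suitable for checking intuition on small inputs.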

Any values for the BNINDEX1 and BNINDEX2 arguments should be equal to the output of buildIndex for the respective matrices, using the algorithm specified with BNPARAM. These arguments are only provided to improve efficiency during repeated searches on the same datasets (e.g., for comparisons between all pairs). The specification of these arguments should not, generally speaking, alter the output of the function.
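As a hedged sketch of the pre-built index workflow, the following assumes a version of BiocNeighbors in which findMutualNN has the signature shown under Usage and buildIndex accepts a BNPARAM argument. The variable names are illustrative only.

```r
library(BiocNeighbors)

set.seed(1) # the KMKNN index involves k-means clustering, which uses random starts
B1 <- matrix(rnorm(10000), ncol = 50)
B2 <- matrix(rnorm(10000), ncol = 50)

# Build each index once, using the same algorithm that is passed as BNPARAM below.
param <- KmknnParam()
idx1 <- buildIndex(B1, BNPARAM = param)
idx2 <- buildIndex(B2, BNPARAM = param)

# Reuse the indices; the matrices are still supplied as the first two arguments.
with_index <- findMutualNN(B1, B2, k1 = 20,
    BNINDEX1 = idx1, BNINDEX2 = idx2, BNPARAM = param)

# The same search without pre-built indices, for comparison.
without_index <- findMutualNN(B1, B2, k1 = 20, BNPARAM = param)
identical(with_index, without_index)
```

Building the indices separately pays off mainly when the same dataset takes part in several calls, for example when comparing every pair among three or more batches.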

Value

A list containing the integer vectors first and second, which hold row indices from data1 and data2 respectively. Corresponding entries in first and second specify an MNN pair consisting of the specified rows from each matrix.
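To show how the two vectors line up, the sketch below uses a hand-written list in the shape described above rather than real output from findMutualNN. The matrices and the pairing are toy values chosen for illustration.

```r
# Toy matrices: 3 points in B1, 2 points in B2, 2 dimensions each.
B1 <- matrix(c(0, 0,
               1, 1,
               5, 5), ncol = 2, byrow = TRUE)
B2 <- matrix(c(0.5, 0.5,
               5.5, 5.5), ncol = 2, byrow = TRUE)

# Hand-written stand-in for the return value: entry i of 'first' pairs with entry i of 'second'.
out <- list(first = c(1L, 3L), second = c(1L, 2L))

# One row per MNN pair, giving the paired coordinates from each dataset.
paired1 <- B1[out$first, , drop = FALSE]
paired2 <- B2[out$second, , drop = FALSE]

# Per-pair difference vectors, e.g., as a starting point for estimating a batch effect.
diffs <- paired2 - paired1
colMeans(diffs)

# A point may appear in several pairs; this counts the pairs per row of B1.
table(out$first)
```

Indexing with drop = FALSE keeps the result a matrix even when only one pair is found.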

Author(s)

Aaron Lun

See Also

queryKNN for the underlying neighbor search code.

fastMNN and related functions from the batchelor package, from which this code was originally derived.

Examples

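library(BiocNeighbors) # provides findMutualNN()
set.seed(1000) # make the simulated data reproducible
B1 <- matrix(rnorm(10000), ncol=50) # Batch 1
B2 <- matrix(rnorm(10000), ncol=50) # Batch 2
out <- findMutualNN(B1, B2, k1=20)
head(out$first) # row indices into B1
head(out$second) # row indices into B2, paired with out$first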


LTLA/kmknn documentation built on Feb. 5, 2024, 6:03 p.m.