getknn: kNN selection

getknn    R Documentation

kNN selection

Description

Function getknn selects, for each row (observation) of a new (= test) data set, its k nearest neighbors within a reference (= training) data set, based on a dissimilarity measure.

getknn relies on the function get.knnx of the package FNN (Beygelzimer et al.), available on CRAN.
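To make the selection step concrete, here is a minimal base-R sketch of a brute-force Euclidean search. It is not the code used by getknn or FNN; the helper name knn_brute and the toy matrices are hypothetical and serve only to illustrate the idea of returning, for each test row, the indexes and distances of its k closest reference rows.

```r
# Illustrative brute-force Euclidean kNN in base R.
# Assumption: knn_brute is a hypothetical helper, not part of rnirs or FNN.
knn_brute <- function(Xr, Xu, k) {
  Xr <- as.matrix(Xr)
  Xu <- as.matrix(Xu)
  # Squared distances between every test row and every reference row (m x n)
  D2 <- outer(rowSums(Xu^2), rowSums(Xr^2), "+") - 2 * Xu %*% t(Xr)
  D <- sqrt(pmax(D2, 0))  # guard against tiny negative values due to rounding
  m <- nrow(Xu)
  nn <- matrix(0L, nrow = m, ncol = k)
  d <- matrix(0, nrow = m, ncol = k)
  for (i in seq_len(m)) {
    idx <- order(D[i, ])[seq_len(k)]  # indexes of the k smallest distances
    nn[i, ] <- idx
    d[i, ] <- D[i, idx]
  }
  list(nn = nn, d = d)
}

# Toy data: 4 reference observations and 2 test observations, p = 2
Xr_toy <- matrix(c(0, 0,
                   1, 0,
                   0, 2,
                   5, 5), ncol = 2, byrow = TRUE)
Xu_toy <- matrix(c(0.1, 0,
                   4,   4), ncol = 2, byrow = TRUE)

res <- knn_brute(Xr_toy, Xu_toy, k = 2)
res$nn  # one row per test observation, one column per neighbor
res$d   # matching distances, sorted in increasing order within each row
```

The outputs have one row per test observation, which mirrors the layout described under Value.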

Usage


getknn(Xr, Xu, k = NULL,
  diss = c("euclidean", "mahalanobis", "correlation"), 
  algorithm = "brute", list = TRUE)

Arguments

Xr

An n x p matrix or data frame of reference (training) observations.

Xu

An m x p matrix or data frame of new (test) observations.

k

The number of nearest neighbors to select.

diss

The type of dissimilarity used between observations. Possible values are "euclidean" (default; Euclidean distance), "mahalanobis" (Mahalanobis distance), or "correlation". Correlation dissimilarities are calculated as sqrt(0.5 * (1 - rho)), where rho is the correlation between two observations.

algorithm

Search algorithm used for Euclidean and Mahalanobis distances. Defaults to "brute". See get.knnx in package FNN.

list

If TRUE (default), a list format is also returned for the outputs.
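The two non-Euclidean options of diss can be illustrated with a short base-R sketch. The first part evaluates the correlation dissimilarity formula given above, assuming rho is the Pearson correlation between two rows. The second part shows one common way to reduce a Mahalanobis search to a Euclidean one, by transforming the data with the Cholesky factor of a covariance matrix. Estimating that covariance from the reference set is an assumption made here; the source does not state how getknn computes it. All object names are hypothetical.

```r
# --- Correlation dissimilarity: sqrt(0.5 * (1 - rho)) ---
# Assumption: rho is the Pearson correlation between two observations (rows).
corr_diss <- function(a, b) {
  rho <- cor(a, b)
  sqrt(max(0, 0.5 * (1 - rho)))  # max() guards against rounding below zero
}

x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 6, 8, 10)  # perfectly correlated with x
z <- c(5, 4, 3, 2, 1)   # perfectly anti-correlated with x

d_xy <- corr_diss(x, y)  # close to 0 (rho = 1)
d_xz <- corr_diss(x, z)  # close to 1 (rho = -1)

# --- Mahalanobis distance as a Euclidean distance after transformation ---
set.seed(1)
Xr_sim <- matrix(rnorm(50 * 3), ncol = 3)  # simulated reference data
Xu_sim <- matrix(rnorm(4 * 3), ncol = 3)   # simulated test data

S <- cov(Xr_sim)   # assumption: covariance estimated on the reference set
U <- chol(S)       # upper triangular factor, S = t(U) %*% U
Uinv <- solve(U)
Zr <- Xr_sim %*% Uinv  # transformed reference observations
Zu <- Xu_sim %*% Uinv  # transformed test observations

# Euclidean distance in the transformed space ...
d_euc <- sqrt(sum((Zu[1, ] - Zr[1, ])^2))
# ... matches the Mahalanobis distance in the original space
# (stats::mahalanobis returns the squared distance)
d_mah <- sqrt(mahalanobis(Xu_sim[1, ], center = Xr_sim[1, ], cov = S))
all.equal(d_euc, d_mah)
```

Since rho lies in [-1, 1], the correlation dissimilarity lies in [0, 1]. After the transformation, any Euclidean nearest-neighbor search applied to Zr and Zu returns Mahalanobis neighbors.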

Value

A list with the following components:

nn

An m x k data frame with the row numbers (in the reference data set) of the neighbors of the test observations. Row "i" = indexes of the k nearest neighbors of the test observation "i".

d

An m x k data frame with the dissimilarities between the test observations and their neighbors. Row "i" = dissimilarities of the k nearest neighbors of the test observation "i".

listnn

Same as $nn but in a list format.

listd

Same as $d but in a list format.

Examples


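# Load the example data set
data(datcass)

Xr <- datcass$Xr   # reference (training) observations
Xu <- datcass$Xu   # new (test) observations

# 5 nearest neighbors (Euclidean distance) of the first 3 test observations
k <- 5
getknn(Xr, Xu[1:3, ], k = k)

# Mahalanobis distance computed on the outputs of a 15-component PCA
z <- pca(Xr, Xu, ncomp = 15)
Tr <- z$Tr
Tu <- z$Tu[1:10, ]
k <- 5
getknn(Tr, Tu, k = k, diss = "mahalanobis")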



mlesnoff/rnirs documentation built on April 24, 2023, 4:17 a.m.