View source: R/knn_index_dist.R
knn.index.dist | R Documentation |
This function returns the indices and distances of the k nearest neighbors of each observation
knn.index.dist( data, TEST_data = NULL, k = 5, method = "euclidean", transf_categ_cols = F, threads = 1, p = k )
data |
a data.frame or matrix |
TEST_data |
a data.frame or matrix (it can be also NULL) |
k |
an integer specifying the number of nearest neighbors |
method |
a string specifying the distance method. Valid methods are 'euclidean', 'manhattan', 'chebyshev', 'canberra', 'braycurtis', 'pearson_correlation', 'simple_matching_coefficient', 'minkowski' (by default the order 'p' of the minkowski distance equals k), 'hamming', 'mahalanobis', 'jaccard_coefficient', 'Rao_coefficient' |
transf_categ_cols |
a boolean (TRUE, FALSE) specifying if the categorical columns should be converted to numeric or to dummy variables |
threads |
the number of cores to be used in parallel (OpenMP will be employed) |
p |
a numeric value specifying the order of the 'minkowski' distance. It is used only if 'method' is set to 'minkowski'. This parameter defaults to 'k' |
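To make the role of 'p' concrete, the following base R sketch computes the Minkowski distance from its standard definition. The helper name `minkowski_dist` is hypothetical and is not part of the package; the sketch only illustrates how the order changes the result, not how the function computes it internally.

```r
# Minkowski distance of order p between two numeric vectors (standard definition).
# `minkowski_dist` is a hypothetical helper used for illustration only.
minkowski_dist <- function(a, b, p) {
  sum(abs(a - b)^p)^(1 / p)
}

a <- c(0, 0)
b <- c(3, 4)

minkowski_dist(a, b, p = 1)  # order 1 is the Manhattan distance: 7
minkowski_dist(a, b, p = 2)  # order 2 is the Euclidean distance: 5

# With the documented default p = k, a call with k = 5 and
# method = 'minkowski' would use order 5 unless p is set explicitly.
minkowski_dist(a, b, p = 5)
```

Because 'p' defaults to 'k', changing the number of neighbors also changes the distance order when 'method' is 'minkowski'; setting 'p' explicitly avoids that coupling.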
This function returns the indices and distances of the k nearest neighbors for each observation. If TEST_data is NULL, the indices and distances are returned for the train data (the 'data' argument); if TEST_data is not NULL, they are returned for the TEST_data.
a list of length 2. The first sublist contains the indices and the second the distances of the k nearest neighbors for each observation. If TEST_data is NULL, the number of rows of each sublist equals the number of rows in the train data. If TEST_data is not NULL, the number of rows of each sublist equals the number of rows in the TEST_data.
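The shape of the returned value can be illustrated with a minimal brute-force sketch in base R. This is not the package's implementation: `knn_sketch` and the list names `idx` and `dist` are hypothetical, only the Euclidean distance is handled, and the sketch assumes that in train mode an observation is not counted as its own neighbor, which the text above does not state.

```r
# Brute-force k-nearest-neighbor search with Euclidean distance.
# `knn_sketch`, `idx` and `dist` are hypothetical names used for illustration only.
knn_sketch <- function(data, TEST_data = NULL, k = 5) {
  train <- as.matrix(data)
  self_query <- is.null(TEST_data)
  query <- if (self_query) train else as.matrix(TEST_data)

  idx <- matrix(NA_integer_, nrow(query), k)
  dst <- matrix(NA_real_, nrow(query), k)

  for (i in seq_len(nrow(query))) {
    # Euclidean distance from query row i to every training row
    d <- sqrt(colSums((t(train) - query[i, ])^2))
    # Assumption: in train mode an observation is not its own neighbor
    if (self_query) d[i] <- Inf
    nearest <- order(d)[seq_len(k)]
    idx[i, ] <- nearest
    dst[i, ] <- d[nearest]
  }
  list(idx = idx, dist = dst)
}

train <- matrix(c(0, 0,
                  1, 0,
                  0, 2,
                  5, 5), ncol = 2, byrow = TRUE)
test <- matrix(c(0.9, 0.1), ncol = 2)

out_train <- knn_sketch(train, k = 2)
dim(out_train$idx)   # one row per training observation

out_test <- knn_sketch(train, TEST_data = test, k = 2)
dim(out_test$idx)    # one row per test observation
```

In both modes the neighbors are searched among the rows of the train data; only the set of query rows, and therefore the number of rows in each sublist, changes.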
Lampros Mouselimis
data(Boston)
X = Boston[, -ncol(Boston)]
out = knn.index.dist(X, TEST_data = NULL, k = 4, method = 'euclidean', threads = 1)