big.knn: The k-NN algorithm for really large scale data
In Rfast2: A Collection of Efficient and Extremely Fast R Functions II

Description

The k-NN algorithm for really large scale data.

Usage

big.knn(xnew, y, x, k = 2:100, type = "C")

Arguments

`xnew`	A matrix with the new data: the predictor variables of the observations whose response values are to be predicted.
`y`	A vector with the response variable, which can be either continuous or categorical (a factor is acceptable).
`x`	A matrix with the available data, the predictor variables.
`k`	A vector with the possible numbers of nearest neighbours to be considered.
`type`	If the response variable y is numerical, set this to "R" (regression). If y is categorical, set it to "C" (classification), which is the default.

Details

The concept behind k-NN is simple. Suppose we have a matrix with predictor variables and a vector with the response variable (numerical or categorical). When a new vector with observations (predictor variables) is available, its corresponding response value, numerical or categorical, is to be predicted. Instead of using a model, parametric or not, one can use this ad hoc algorithm.

The distances between the new observation and the existing ones are calculated, and the k smallest are kept. In the case of regression, the average, median, or harmonic mean of the response values of these k nearest neighbours is calculated. In the case of classification, i.e. a categorical response, a voting rule is applied: the new observation is allocated to the most frequent group (response value) among its k nearest neighbours.
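The rule described above can be sketched in base R for a single new observation. This is only an illustration under stated assumptions, not the Rfast2 implementation: `knn_one` and its `method` argument are hypothetical names introduced here, and ties in the vote are broken by taking the first group in table order.

```r
# Hypothetical helper, not part of Rfast2: predict the response of one new point.
knn_one <- function(xnew, y, x, k, type = "C", method = "average") {
  # Euclidean distance from xnew to every row of x
  d <- sqrt(rowSums(sweep(x, 2, xnew)^2))
  # Response values of the k closest rows
  nearest <- y[order(d)[seq_len(k)]]
  if (type == "C") {
    # Voting rule: the most frequent group wins (first group on ties)
    votes <- table(nearest)
    return(names(votes)[which.max(votes)])
  }
  # Regression: summarise the k nearest response values
  switch(method,
         average  = mean(nearest),
         median   = median(nearest),
         harmonic = 1 / mean(1 / nearest))
}

x <- as.matrix(iris[1:100, 1:2])
pred_class <- knn_one(c(5.0, 3.4), iris[1:100, 5], x, k = 7)
pred_num   <- knn_one(c(5.0, 3.4), iris[1:100, 3], x, k = 7, type = "R")
```

Here `pred_class` is a group label and `pred_num` is the average petal length of the 7 nearest neighbours.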

This function allows for the Euclidean distance only.
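For reference, the Euclidean distances between every row of `xnew` and every row of `x` can be computed in base R as sketched below. The helper name `euclid_dist` is hypothetical, and nothing here is claimed about how Rfast2 computes distances internally.

```r
# Hypothetical helper: Euclidean distances between rows of xnew and rows of x.
euclid_dist <- function(xnew, x) {
  # ||a - b||^2 = ||a||^2 + ||b||^2 - 2 * a'b, evaluated for all pairs at once
  d2 <- outer(rowSums(xnew^2), rowSums(x^2), "+") - 2 * tcrossprod(xnew, x)
  # Guard against tiny negative values caused by rounding
  sqrt(pmax(d2, 0))
}

x <- as.matrix(iris[1:100, 1:2])
d <- euclid_dist(x[1:3, , drop = FALSE], x)
dim(d)  # one row per new observation, one column per row of x
```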

Value

A matrix whose number of columns is equal to the length of k. If a single value of k is provided, a one-column matrix containing the predicted values is returned. If more than one value is supplied, the matrix contains the predicted values for every value of k, one column per value.
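The layout described above can be illustrated with a small base-R sketch for the classification case. The function `knn_grid` is a hypothetical stand-in written for this illustration; the assumptions that rows correspond to the rows of `xnew` and that labels are returned as character strings are made here and are not taken from the Rfast2 source.

```r
# Hypothetical base-R sketch of the output layout; not the Rfast2 implementation.
knn_grid <- function(xnew, y, x, k) {
  y <- as.character(y)
  # One set of predictions per candidate value of k
  preds <- sapply(k, function(kk) {
    apply(xnew, 1, function(z) {
      d <- sqrt(rowSums(sweep(x, 2, z)^2))      # Euclidean distances to z
      votes <- table(y[order(d)[seq_len(kk)]])  # votes among the kk nearest
      names(votes)[which.max(votes)]
    })
  })
  # Force a matrix even when there is one new row or one value of k
  matrix(preds, nrow = nrow(xnew))
}

x <- as.matrix(iris[1:100, 1:2])
res <- knn_grid(xnew = x[c(1, 51), , drop = FALSE], y = iris[1:100, 5],
                x = x, k = c(6, 7))
dim(res)  # rows: new observations, columns: values of k
```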

Author(s)

Michail Tsagris.

R implementation and documentation: Michail Tsagris mtsagris@uoc.gr.

References

Friedman J., Hastie T. and Tibshirani R. (2017). The elements of statistical learning. New York: Springer.

Cover TM and Hart PE (1967). Nearest neighbor pattern classification. IEEE Transactions on Information Theory. 13(1):21-27.

Examples

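library(Rfast2)
x <- as.matrix(iris[1:100, 1:2])  # first two predictors of the first 100 rows
# Classify the same observations using 6 and 7 nearest neighbours
mod <- big.knn(xnew = x, y = iris[1:100, 5], x = x, k = c(6, 7))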

Rfast2 documentation built on June 8, 2025, 11:46 a.m.
