knn.select: Search for the Optimal K-nearest Neighbours

View source: R/knn.select.R

knn.selectR Documentation

Search for the Optimal K-nearest Neighbours

Description

This function search for the optimal number of neighbours for the given data set for k-nearest neighbour cross-validatory classification.

Usage

knn.select(object, crossval = "indiv")

Arguments

object

an object of class morphodata.

crossval

crossvalidation mode, sets individual ("indiv"; default, one-leave-out method) or whole populations ("pop") as leave-out unit.

Details

The knn.select function compute number of correctly classified individuals for k values ranging from 1 to 30 and highlight the value with the highest success rate. Ties (i.e., when there are the same numbers of votes for two or more groups) are broken at random, and thus several iterations may yield different results. Therefore, the functions compute 10 iterations, and the average success rates for each k are used; the minimum and maximum success rates for each k are also displayed as error bars. Note that several k values may have nearly the same success rates; if this is the case, the similarity of iterations may also be considered.

The mode of crossvalidation is set by the parameter crossval. The default "indiv" uses the standard one-leave-out method. However, as some hierarchical structure is usually present in the data (individuals from a population are not completely independent observations, as they are morphologically closer to each other than to individuals from other populations), the value "pop" sets whole populations as leave-out units. The latter method does not allow classification if there is only one population for a taxon and is more sensitive to “atypical” populations, which usually leads to a somewhat lower classification success rate.

Value

Optimal number of neighbours is written to the console, and plot displaying all Ks is produced.

See Also

classif.lda, classifSample.lda, classif.qda, classifSample.qda, classif.knn, classifSample.knn

Examples

data(centaurea)
centaurea = naMeanSubst(centaurea)
centaurea = removePopulation(centaurea, populationName = c("LIP", "PREL"))

# classification by nonparametric k-nearest neighbour method
knn.select(centaurea, crossval = "indiv")
classifRes.knn = classif.knn(centaurea, k = 12, crossval = "indiv")

MorphoTools2 documentation built on March 7, 2023, 6:18 p.m.